I want to add Gaussian noise to the gradient, so that the gradient the optimizer applies is the sum of the gradient computed by the graph and my Gaussian noise. I tried using register_hook, as in the test code below:

```
import torch
import torch.nn as nn

X = torch.tensor([4.0])
a = torch.tensor([2.0], requires_grad=True)
y = a * X
criterion = nn.MSELoss(reduction="mean")  # replaces the deprecated size_average=True
optimizer = torch.optim.Adam([a], lr=1e-2)
# Hook on a: scale its gradient by 10 during backward
h = a.register_hook(lambda grad: grad * 10)
optimizer.zero_grad()
loss = criterion(y, torch.tensor([7.0]))
loss.backward()
optimizer.step()
print(a)       # parameter after one Adam step
print(a.grad)  # gradient stored on a after backward
```
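
The test code above multiplies the gradient by 10 as a stand-in. A minimal sketch of the noise-adding hook described at the top might look like the following. The names `noise_std` and `add_gaussian_noise` are hypothetical, the 0.1 noise scale is an arbitrary assumption, and the loss is written out by hand so the clean gradient can be compared with what ends up in `a.grad`.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sampled noise is reproducible

noise_std = 0.1  # hypothetical noise scale, not taken from the post


def add_gaussian_noise(grad):
    # Return a new tensor: the computed gradient plus zero-mean Gaussian noise.
    return grad + noise_std * torch.randn_like(grad)


X = torch.tensor([4.0])
target = torch.tensor([7.0])
a = torch.tensor([2.0], requires_grad=True)

handle = a.register_hook(add_gaussian_noise)
loss = ((a * X - target) ** 2).mean()
loss.backward()

# Analytic gradient of the loss with respect to a, without any noise.
clean_grad = (2 * (a * X - target) * X).detach()
# Whatever a.grad holds beyond the analytic value was contributed by the hook.
noise_seen = a.grad - clean_grad

handle.remove()  # detach the hook once it is no longer needed
```

The hook returns a new tensor instead of modifying `grad` in place, which is the pattern the `register_hook` documentation recommends.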

But I find that although I can modify the gradient at will, the optimizer always seems to update the parameter using the original gradient, not the one returned by the hook. Does anyone know how to apply the modified gradient with the optimizer?
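
One way to check which gradient the optimizer consumes is sketched below. It swaps Adam for plain SGD, which is a substitution and not part of the test code above. With SGD the step is exactly `lr` times `a.grad`, so the applied gradient can be recovered from the change in `a`. Adam rescales each step by a running estimate of the gradient magnitude, so a constant factor such as 10 may be hard to see in its update. The names `before` and `applied` are hypothetical.

```python
import torch
import torch.nn.functional as F

X = torch.tensor([4.0])
target = torch.tensor([7.0])
a = torch.tensor([2.0], requires_grad=True)

lr = 1e-2
optimizer = torch.optim.SGD([a], lr=lr)  # assumption: SGD instead of Adam
handle = a.register_hook(lambda grad: grad * 10)

before = a.detach().clone()  # parameter value before the step
optimizer.zero_grad()
loss = F.mse_loss(a * X, target)
loss.backward()
optimizer.step()

# For plain SGD: a_new = a_old - lr * grad, so grad = (a_old - a_new) / lr.
applied = (before - a.detach()) / lr

print(a.grad)   # gradient stored on a after backward
print(applied)  # compare against a.grad and against a.grad / 10
handle.remove()
```

If `applied` matches `a.grad`, the optimizer is stepping with the value stored in `a.grad`, and that value can then be compared with the unscaled gradient.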

Thanks!