How to apply a modified gradient to the optimizer

I want to add some Gaussian noise to the gradient, so that the gradient applied is the sum of the gradient computed by the graph and my Gaussian noise. I tried using register_hook, as in the test code below:

import torch
import torch.nn as nn
from torch.autograd import Variable

X = Variable(torch.FloatTensor([4]))
a = Variable(torch.FloatTensor([2]), requires_grad=True)
y = a * X
criterion = nn.MSELoss(size_average=True)
optimizer = torch.optim.Adam([a], lr=1e-2)

# Scale the gradient of `a` by 10 before it reaches the optimizer
h = a.register_hook(lambda grad: grad * 10)

optimizer.zero_grad()
loss = criterion(y, Variable(torch.FloatTensor([7])))
loss.backward()
optimizer.step()
print(a)
print(a.grad)

But I find that although I can modify the gradient at will, the optimizer always seems to update the parameter using the original gradient, not the one after register_hook. Does anyone know how we can apply the modified gradient with the optimizer?

Thanks!


If you modify the gradient, you have to return it from the hook. The returned gradient is what gets used downstream. If you don't return a modified gradient, then the original gradient is used.
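For example, a hook for your original goal (adding Gaussian noise) could look roughly like this. It reuses `a` from your snippet; `noise_std` is just a placeholder value:

noise_std = 0.1  # placeholder noise scale, pick whatever you need

def add_noise(grad):
    # Whatever this function returns replaces the gradient, so the noisy
    # version is what ends up in a.grad and what optimizer.step() uses.
    return grad + noise_std * torch.randn_like(grad)

h = a.register_hook(add_noise)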

Hi

What do you mean by returning it from the hook?

lambda grad: return grad * 10

But a lambda automatically returns its expression, right? And a lambda cannot contain statements like return.

Try using SGD. Adam has momentum and adaptive per-parameter scaling, so you may need to run several iterations to see the difference.
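To make the effect visible, here is a rough end-to-end check with SGD (written with the current tensor API rather than Variable): with plain SGD the update is exactly lr times the hooked gradient, so the factor of 10 shows up immediately.

import torch
import torch.nn as nn

X = torch.tensor([4.0])
a = torch.tensor([2.0], requires_grad=True)
target = torch.tensor([7.0])

criterion = nn.MSELoss()
optimizer = torch.optim.SGD([a], lr=1e-2)

# The hook returns the scaled gradient, which replaces the original one
h = a.register_hook(lambda grad: grad * 10)

optimizer.zero_grad()
loss = criterion(a * X, target)
loss.backward()

before = a.detach().clone()
optimizer.step()

print(a.grad)               # 80.0: the hooked gradient (unhooked would be 8.0)
print(before - a.detach())  # 0.8 = lr * a.grad, so the step used the hooked gradient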