How to apply a modified gradient to the optimizer

I want to add some Gaussian noise to the gradient, so that the gradient applied is the sum of the gradient computed by the graph and my Gaussian noise. I tried using register_hook, as in the test code below:

import torch
import torch.nn as nn
from torch.autograd import Variable

X = Variable(torch.FloatTensor([4]))
a = Variable(torch.FloatTensor([2]), requires_grad=True)
y = a * X
criterion = nn.MSELoss(size_average=True)
optimizer = torch.optim.Adam([a], lr=1e-2)

# Scale the gradient of `a` by 10 before it reaches the optimizer
h = a.register_hook(lambda grad: grad * 10)

optimizer.zero_grad()
loss = criterion(y, Variable(torch.FloatTensor([7])))
loss.backward()
optimizer.step()
print(a)
print(a.grad)

But I find that although I can modify the gradient at will, the optimizer always seems to update the parameter using the original gradient, not the one after register_hook. Does anyone know how we can apply the modified gradient with the optimizer?

Thanks!


If you modify the gradient, you have to return it from the hook. The returned gradient is what gets used downstream. If you don't return a modified gradient, then the original gradient is used.
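For example, a hook for your original goal (adding Gaussian noise) could look roughly like this. It reuses `a` from your snippet; `noise_std` is just a placeholder value:

noise_std = 0.1  # placeholder noise scale, pick whatever you need

def add_noise(grad):
    # Whatever this function returns replaces the gradient, so the noisy
    # version is what ends up in a.grad and what optimizer.step() uses.
    return grad + noise_std * torch.randn_like(grad)

h = a.register_hook(add_noise)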

Hi

What do you mean by returning it from the hook?

lambda grad: return grad * 10

But a lambda automatically returns its expression, right? And a lambda cannot contain statements like return.

Try using SGD. Adam has momentum and adaptive per-parameter scaling, so you may need to run several iterations to see the difference.
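To make the effect visible, here is a rough end-to-end check with SGD (written with the current tensor API rather than Variable): with plain SGD the update is exactly lr times the hooked gradient, so the factor of 10 shows up immediately.

import torch
import torch.nn as nn

X = torch.tensor([4.0])
a = torch.tensor([2.0], requires_grad=True)
target = torch.tensor([7.0])

criterion = nn.MSELoss()
optimizer = torch.optim.SGD([a], lr=1e-2)

# The hook returns the scaled gradient, which replaces the original one
h = a.register_hook(lambda grad: grad * 10)

optimizer.zero_grad()
loss = criterion(a * X, target)
loss.backward()

before = a.detach().clone()
optimizer.step()

print(a.grad)               # 80.0: the hooked gradient (unhooked would be 8.0)
print(before - a.detach())  # 0.8 = lr * a.grad, so the step used the hooked gradient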