Proper way to do projected gradient descent with optimizer class

Hello.
I'm running gradient descent using the PyTorch Adam optimizer. After each step I want to project the updated variable onto [-1, 1]. How can I do this properly? Adding a torch.clamp call after optimizer.step() seems to stop the optimizer from updating its parameters at all (I get no updates from my second call to optimizer.step() onwards), even when I explicitly update the parameter gradients.


You should only apply the projection to weight.data, so that the operation isn't recorded in the computation graph (and its gradient isn't tracked). See here.
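For example, a minimal sketch of that pattern, assuming a single parameter tensor x and a toy quadratic objective (both are just for illustration, not from the original post):

```python
import torch

# Placeholder setup: one parameter and a toy objective whose
# unconstrained minimum (x = 2) lies outside the feasible box [-1, 1].
x = torch.randn(10, requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    optimizer.zero_grad()
    loss = ((x - 2.0) ** 2).sum()
    loss.backward()
    optimizer.step()
    # Projection step: clamp in place on .data so autograd never records it
    # and x stays the same leaf tensor the optimizer is tracking.
    x.data.clamp_(-1.0, 1.0)

print(x.min().item(), x.max().item())  # both end up within [-1, 1]
```

An equivalent option on recent PyTorch versions is to do the in-place clamp under torch.no_grad() instead of going through .data, e.g. `with torch.no_grad(): x.clamp_(-1, 1)`, which likewise keeps the projection out of the autograd graph.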
