Is there a way to access the update values computed by the optimizer?
For example, if I’m using plain SGD with a learning rate a, then the update values are simply -a * gradients.
More concretely, these should be the values that the optimizer adds to the model weights when .step() is called, after .backward() has computed the gradients for the batch.
You can access the gradients through the .grad field of each parameter.
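For example, with a toy single-layer model (the model here is just a hypothetical placeholder), the gradients are populated after the backward pass:

```python
import torch

# Hypothetical toy model, just for illustration
model = torch.nn.Linear(3, 1)

x = torch.randn(4, 3)
loss = model(x).sum()
loss.backward()

# After .backward(), each parameter's gradient lives in its .grad field
for name, p in model.named_parameters():
    print(name, p.grad.shape)
```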
The update value, though, depends on the optimizer, and you won’t be able to recover it for the more complex ones. The simplest thing I can think of here would be to save the weights before the update and compare them with the new values after it.
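A minimal sketch of that before/after comparison, assuming plain SGD on a toy linear model (both are just illustrative choices) where the recovered update should equal -lr * grad:

```python
import torch

# Hypothetical setup: toy model and plain SGD, lr chosen arbitrarily
model = torch.nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(4, 3)
loss = model(x).sum()
loss.backward()

# Snapshot the weights before the step
before = [p.detach().clone() for p in model.parameters()]
opt.step()

# The applied update is simply (new weights) - (old weights)
updates = [p.detach() - b for p, b in zip(model.parameters(), before)]

# Sanity check: for plain SGD this should match -lr * grad
for u, p in zip(updates, model.parameters()):
    print(torch.allclose(u, -0.1 * p.grad, atol=1e-6))
```

This works for any optimizer, since it only looks at the weights themselves, not at the optimizer's internals.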
You understood me correctly. I’m not really interested in the grads, but in what the optimizer computes when we call .step(). So, I guess you don’t see an efficient way of getting those values other than looking at the weights before and after the update?
I’m afraid that in the general case, no, there isn’t any other way to get it.
As a fun fact, some optimizers like LBFGS may actually perform the step as multiple “sub-steps”, so in that case checking before and after really is the only thing you can do.
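You can see this behavior by counting how many times LBFGS evaluates its closure inside a single .step() call; the model, data, and loss here are placeholders:

```python
import torch

# Hypothetical toy setup; LBFGS requires a closure that re-evaluates the loss
model = torch.nn.Linear(2, 1)
opt = torch.optim.LBFGS(model.parameters(), max_iter=5)
x = torch.randn(8, 2)

calls = 0
def closure():
    global calls
    calls += 1  # count closure evaluations inside one .step()
    opt.zero_grad()
    loss = (model(x) ** 2).mean()
    loss.backward()
    return loss

opt.step(closure)
print(calls)  # typically more than one evaluation per .step()
```

Since the weights can change between closure evaluations, there is no single “the update” to intercept mid-step; only the net before/after difference is well defined.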