I am trying to compute the loss for a temporary set of variables in a neural network so that I can get their gradients, but I do not want these variables in the computational graph. I have a matrix of weights whose gradients I want to know, but this matrix is not in the neural network's model parameters.
If these Tensors are used during the computation of the loss in a differentiable manner, you can either:

- call `.retain_grad()` on these intermediary Tensors so that their `.grad` field will be populated when you call `loss.backward()`, or
- get the gradients only for these Tensors with `grads = autograd.grad(loss, interm_tensors)`.
What should the format of `interm_tensors` be to do that?
You can do that on any Tensor that has `requires_grad=True`. Note that this field is set automatically when a Tensor is computed in a differentiable manner.
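For instance, `autograd.grad` accepts either a single Tensor or a tuple/list of Tensors for its `inputs` argument; both forms are shown in this sketch:

```python
import torch
from torch import autograd

a = torch.randn(3, requires_grad=True)  # leaf Tensor with requires_grad=True
b = a * 2                               # computed differentiably -> b.requires_grad is True
loss = (b ** 2).sum()

# A single Tensor works; the result is still a 1-element tuple.
(grad_b,) = autograd.grad(loss, b, retain_graph=True)

# So does a tuple/list of Tensors; gradients come back in the same order.
grad_a, grad_b = autograd.grad(loss, (a, b))
```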