Finding the gradient of a temporary variable without adding it to the computational graph

I am trying to compute the loss for a temporary set of variables in a neural network so that I can get their gradients, but I do not want these variables in the computational graph. Specifically, I have a matrix of weights whose gradients I want to know, but this matrix is not among the network’s model parameters.

If these tensors are used during the computation of the loss in a differentiable manner, you can either:

  • Call .retain_grad() on these intermediary Tensors so that their .grad field will be populated when you call loss.backward().
  • Get the gradients only for these Tensors with grads = autograd.grad(loss, interm_tensors) (both options are sketched below).
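
Here is a minimal sketch of both options; the model, base, W, and loss below are placeholder stand-ins for your own setup, not anything required by PyTorch:

```python
import torch
from torch import autograd

# Hypothetical setup: `base` is a leaf Tensor, `W` is an intermediate Tensor
# derived from it, and neither is registered in model.parameters().
model = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)
target = torch.randn(2, 4)

base = torch.randn(4, 4, requires_grad=True)
W = base * 2.0  # intermediate Tensor; requires_grad is set automatically

# W participates in the loss in a differentiable manner.
loss = torch.nn.functional.mse_loss(model(x) @ W, target)

# Option 1: retain_grad() so that W.grad is populated by backward().
W.retain_grad()
loss.backward(retain_graph=True)
print(W.grad)

# Option 2: autograd.grad() returns the gradient directly, without
# populating any .grad field.
(grad_W,) = autograd.grad(loss, W)
print(grad_W)
```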

What should be the format of interm_tensors to do that?

You can pass either a single Tensor or a tuple/list of Tensors, as long as each has requires_grad=True. Note that this field is set automatically when a Tensor is computed in a differentiable manner.
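
For instance (a small sketch where a and b are placeholder intermediate tensors):

```python
import torch
from torch import autograd

# Placeholder tensors standing in for interm_tensors.
a = torch.randn(3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
loss = (a * b).sum()

# A single Tensor is accepted...
(grad_a,) = autograd.grad(loss, a, retain_graph=True)

# ...and so is a tuple/list of Tensors; one gradient is returned
# per input, in the same order.
grad_a, grad_b = autograd.grad(loss, (a, b))
```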