Compute gradients of weights with respect to a variable in the loss function

Hello everyone,

I’m currently working with PyTorch and have a question about the optimizer.step() function and autograd. I understand that optimizer.step() performs a single optimization step (parameter update) based on the gradients that have already been computed and stored in the .grad attribute of the parameters during the backward pass. However, I need something different, which seems to be just the opposite.

Here is my question:

I’m interested in understanding how a specific variable in my loss function influenced the weights during the update (optimizer.step). Is there a way to compute the gradients of the weights with respect to a variable in the loss function of an ANN?

To understand the background of why I need this, here is a detailed explanation:

I am building a combination of two ANNs. The 2nd ANN computes an output (psi) that is used inside the loss function of the 1st ANN, and the loss function of the 2nd ANN is derived from the output of the 1st ANN. Currently, PyTorch cannot compute the gradient of the 2nd ANN's loss with respect to its parameters, because those parameters do not influence the output of the 1st ANN directly; they only influence it via a training step that updates the 1st ANN's parameters based on the output of the 2nd ANN.
Apparently, PyTorch does not track in the autograd graph how the loss influences the weights during optimizer.step().
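Here is a minimal sketch of the setup I mean (toy modules, shapes and losses, only to show where the chain breaks):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the two networks; shapes and losses are made up.
net1 = nn.Linear(4, 1)
net2 = nn.Linear(4, 1)
opt1 = torch.optim.SGD(net1.parameters(), lr=0.1)

x = torch.randn(8, 4)

# psi from the 2nd ANN enters the loss of the 1st ANN.
psi = net2(x)
loss1 = ((net1(x) - psi) ** 2).mean()

opt1.zero_grad()
loss1.backward()
opt1.step()  # updates net1's weights, but outside the autograd graph

# The loss of the 2nd ANN is built from the (now updated) output of the 1st ANN.
loss2 = net1(x).mean()

# Returns (None, None): net2's parameters only influence loss2 through the
# optimizer step above, which autograd did not record.
print(torch.autograd.grad(loss2, list(net2.parameters()), allow_unused=True))
```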

Any insights or suggestions would be greatly appreciated. Thank you in advance!

Best, Jonathan

Apparently, PyTorch does not track in the autograd graph how the loss influences the weights during optimizer.step().

You can enable this tracking by specifying differentiable=True in the optimizer (see SGD — PyTorch 2.1 documentation).
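For example, here is a minimal sketch of the mechanics, roughly following the pattern used in PyTorch's differentiable-optimizer tests (the tensors, the toy losses, and the learning rate below are made up):

```python
import torch

w = torch.randn(3, requires_grad=True)    # stands in for a weight of the 1st ANN
psi = torch.randn(3, requires_grad=True)  # stands in for the output of the 2nd ANN

# With differentiable=True the optimizer may update a non-leaf tensor, and the
# update itself stays in the autograd graph, provided .grad is also part of the
# graph (hence create_graph=True below).
p = w.clone()                             # non-leaf copy that the optimizer updates
loss1 = ((p - psi) ** 2).sum()            # "loss of the 1st ANN" depends on psi
p.grad = torch.autograd.grad(loss1, p, create_graph=True)[0]

opt = torch.optim.SGD([p], lr=0.1, differentiable=True)
opt.step()                                # p <- p - lr * grad, recorded by autograd

loss2 = p.sum()                           # stand-in for the loss of the 2nd ANN
(grad_psi,) = torch.autograd.grad(loss2, psi)
print(grad_psi)                           # non-zero: the influence flowed through opt.step()
```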


Thank you! I just found TorchOpt, which also allows differentiating through the optimization path, but I will try this as well.
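For completeness, here is the rough shape of what I would try with TorchOpt (same toy setup as above; the MetaSGD usage is based on my reading of the TorchOpt README, so the exact API may differ):

```python
import torch
import torch.nn as nn
import torchopt

# Toy stand-ins for the two networks; shapes and losses are made up.
net1 = nn.Linear(4, 1)
net2 = nn.Linear(4, 1)
x = torch.randn(8, 4)

# MetaSGD (as I understand it) performs a differentiable update of net1's
# parameters, so the optimization step stays in the autograd graph.
inner_opt = torchopt.MetaSGD(net1, lr=0.1)

psi = net2(x)                          # output of the 2nd ANN
loss1 = ((net1(x) - psi) ** 2).mean()  # loss of the 1st ANN uses psi
inner_opt.step(loss1)                  # differentiable update of net1

loss2 = net1(x).mean()                 # loss of the 2nd ANN from net1's output
loss2.backward()                       # gradients now reach net2's parameters
print(net2.weight.grad)
```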