Calculating gradient for each layer without using loss.backward()

Hi all,

I am in a bit of a predicament here. Right now, I am combining a neural network implemented in PyTorch with a conditional random field (graphical model) implemented in C++. I pass the output of the second-to-last layer of the neural network to the CRF. The CRF uses that layer to calculate an AUC for its performance and also calculates the gradient with respect to that layer. I then receive the gradient that the CRF calculated and place it in the .grad attribute of the second-to-last layer.

At this point, my problem arises. I want to start back-propagation from this layer to all other parameters in my network, but this is hard to figure out how to do without calling loss.backward(), since each parameter is its own leaf node in the computational graph. Does anyone know any legit ways or any “hacky” ways to solve this?
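(For reference, the setup above can be sketched without loss.backward(): Tensor.backward() accepts a `gradient` argument, so you can seed backpropagation directly from the intermediate tensor with the externally computed gradient instead of writing it into .grad by hand. A toy network and a random tensor stand in for the real model and the CRF's gradient here:)

```python
import torch

# Toy network standing in for the real one.
net = torch.nn.Sequential(
    torch.nn.Linear(4, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 3),  # output of the "second-to-last layer" fed to the CRF
)

x = torch.randn(2, 4)
features = net(x)                           # tensor handed to the C++ CRF

# Placeholder for the gradient the CRF computed (assumed same shape as features).
grad_from_crf = torch.randn_like(features)

# Seed backpropagation from the intermediate tensor: no loss.backward(),
# no manual writes to .grad. All upstream parameters receive gradients.
features.backward(gradient=grad_from_crf)

print(net[0].weight.grad is not None)       # True
```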


Why don’t you define a custom autograd function wrapping the CRF?
That way the typical PyTorch workflow would keep working without “hacks”.
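(A minimal sketch of that suggestion: subclass torch.autograd.Function, call the C++ CRF in forward, and return its gradient in backward. `crf_forward`/`crf_backward` below are placeholders for whatever your C++ bindings actually expose; here they compute a toy differentiable stand-in so the snippet runs.)

```python
import torch

# Placeholders for the C++ CRF bindings (assumptions, not the real API).
def crf_forward(feats):
    return feats.sigmoid().mean()           # stand-in for the CRF's "AUC"

def crf_backward(feats):
    s = feats.sigmoid()                     # analytic grad of the stand-in
    return s * (1 - s) / feats.numel()

class CRFLoss(torch.autograd.Function):
    @staticmethod
    def forward(ctx, feats):
        ctx.save_for_backward(feats)
        return crf_forward(feats)

    @staticmethod
    def backward(ctx, grad_output):
        feats, = ctx.saved_tensors
        # Chain rule: scale the CRF's gradient by the incoming grad_output.
        return grad_output * crf_backward(feats)

feats = torch.randn(2, 3, requires_grad=True)
auc = CRFLoss.apply(feats)
auc.backward()                              # usual workflow, no manual .grad writes
print(feats.grad.shape)                     # torch.Size([2, 3])
```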

Forgot to mention: you can also create C++ extensions for PyTorch.


So you’re saying I could create a custom autograd function and then call backward() on the output of the second-to-last layer in PyTorch?

You said you are computing the AUC and backpropagating from there. Since the CRF is in C++, you are currently pasting the gradient in manually.

You can create a Python wrapper that takes that layer as input and returns the AUC as a tensor, relying on the C++ implementation underneath. This way you will be able to call auc.backward().
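(Sketch of how such a wrapper fits into a training loop; `CRFAuc` is an assumed custom Function whose forward/backward would call the C++ code, replaced here by a toy differentiable function so the example runs:)

```python
import torch

class CRFAuc(torch.autograd.Function):
    @staticmethod
    def forward(ctx, feats):
        ctx.save_for_backward(feats)
        return feats.tanh().mean()          # stand-in for the C++ AUC

    @staticmethod
    def backward(ctx, grad_output):
        feats, = ctx.saved_tensors
        return grad_output * (1 - feats.tanh() ** 2) / feats.numel()

def crf_auc(layer_output):
    """Python wrapper: takes the layer output, returns the AUC as a tensor."""
    return CRFAuc.apply(layer_output)

net = torch.nn.Linear(4, 3)
opt = torch.optim.SGD(net.parameters(), lr=0.1)

auc = crf_auc(net(torch.randn(2, 4)))
auc.backward()                              # backprop flows through the wrapper
opt.step()                                  # network parameters get updated
```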