Hi,
I’m very new to PyTorch (and ML in general), so I’m having difficulty understanding what is going on with respect to a custom loss/cost function I’m looking at. I understand what the function itself does, but I need to understand how the gradient output of the last network layer is calculated.
NOTE: My task is to implement this custom loss function in our bespoke C++ ML lib. But to do this, I need to “manually” calculate the gradients for the last layer of the network.
So, if I run my network:

```python
results = my_network(inputs)
```

and then my loss function:

```python
loss = my_loss_fn(inputs, results, targets)
loss.backward()
```
Finally, if I print out the grad_fn chain on “results”, I see:
```
grad_fn chain of "results":
SqueezeBackward1
DivBackward0
SliceBackward
SqueezeBackward1
Col2ImBackward
TransposeBackward0
MulBackward0
FftC2RBackward
ViewAsComplexBackward
TransposeBackward0
ViewBackward
LeakyReluBackward0            <=== This is the final layer of my_network
NativeBatchNormBackward
SlowConvTranspose2DBackward
```
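(For reference, the listing above was produced with something like the following sketch, which starts at `results.grad_fn` and follows the first parent in `next_functions` at each node; that's enough for a linear chain like this one.)

```python
def print_grad_fn_chain(t):
    """Walk the autograd graph from t.grad_fn, following the
    first parent at each node (sufficient for a linear chain)."""
    fn = t.grad_fn
    while fn is not None:
        print(type(fn).__name__)
        # next_functions is a tuple of (node, input_index) pairs;
        # entries can be None, and leaves appear as AccumulateGrad.
        parents = [f for f, _ in fn.next_functions if f is not None]
        fn = parents[0] if parents else None

print('grad_fn chain of "results":')
print_grad_fn_chain(results)
```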
So, if I understand autograd correctly, would I have to implement each of those grad_fns to arrive at the grad output for the LeakyReLU layer?
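As a point of comparison for whatever I end up implementing in C++, I believe I can capture the gradient that autograd actually delivers to `results` with `Tensor.retain_grad()` (a minimal sketch, assuming I’m reading the docs right):

```python
results = my_network(inputs)
results.retain_grad()                        # keep the gradient on this non-leaf tensor
loss = my_loss_fn(inputs, results, targets)
loss.backward()

# results.grad now holds d(loss)/d(results), i.e. the grad output
# that a manual implementation would need to reproduce.
print(results.grad.shape)
```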