Hi,
I’m very new to PyTorch (and ML in general), so I’m having difficulty understanding what is going on with respect to a custom loss/cost function I’m looking at. I understand what the function itself does, but I need to understand how the gradient output of the last network layer is calculated.
NOTE: My task is to implement this custom loss function in our bespoke C++ ML lib. But to do this, I need to “manually” calculate the gradients for the last layer of the network.
So, if I run my network:

```python
results = my_network(inputs)
```

and then my loss function:

```python
loss = my_loss_fn(inputs, results, targets)
loss.backward()
```
Finally, if I print out the grad_fn chain on “results”, I see:
```
grad_fn chain of "results":
SqueezeBackward1
DivBackward0
SliceBackward
SqueezeBackward1
Col2ImBackward
TransposeBackward0
MulBackward0
FftC2RBackward
ViewAsComplexBackward
TransposeBackward0
ViewBackward
LeakyReluBackward0            <=== This is the final layer of my_network
NativeBatchNormBackward
SlowConvTranspose2DBackward
```
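(For reference, the listing above was produced with something like the following sketch, which starts at `results.grad_fn` and follows the first parent in `next_functions` at each node; that's enough for a linear chain like this one.)

```python
def print_grad_fn_chain(t):
    """Walk the autograd graph from t.grad_fn, following the
    first parent at each node (sufficient for a linear chain)."""
    fn = t.grad_fn
    while fn is not None:
        print(type(fn).__name__)
        # next_functions is a tuple of (node, input_index) pairs;
        # entries can be None, and leaves appear as AccumulateGrad.
        parents = [f for f, _ in fn.next_functions if f is not None]
        fn = parents[0] if parents else None

print('grad_fn chain of "results":')
print_grad_fn_chain(results)
```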
So, if I understand autograd correctly, would I have to implement each of those grad_fns to arrive at the grad output for the LeakyReLU layer?
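As a point of comparison for whatever I end up implementing in C++, I believe I can capture the gradient that autograd actually delivers to `results` with `Tensor.retain_grad()` (a minimal sketch, assuming I’m reading the docs right):

```python
results = my_network(inputs)
results.retain_grad()                        # keep the gradient on this non-leaf tensor
loss = my_loss_fn(inputs, results, targets)
loss.backward()

# results.grad now holds d(loss)/d(results), i.e. the grad output
# that a manual implementation would need to reproduce.
print(results.grad.shape)
```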