When implementing a custom loss function, would it be better to:
- write a plain Python function that takes in tensors and computes the loss, or
- implement it as a class inheriting from nn.Module?
Also, since an activation function has no parameters (it is not really a ‘layer’), how can we make sure that gradients are passed through if we use method 2?
I have tried both methods and found that method 2 prevented gradients from flowing, whereas with method 1 there was no problem with backpropagation.
Thanks in advance
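For concreteness, here is a minimal sketch of the two approaches being compared, using a hypothetical mean-squared-error loss (the names `my_loss_fn` and `MyLoss` are made up for illustration):

```python
import torch
import torch.nn as nn

# Method 1: a plain Python function operating on tensors.
def my_loss_fn(pred, target):
    return ((pred - target) ** 2).mean()

# Method 2: the same loss as an nn.Module subclass;
# only forward() is needed since the loss has no parameters.
class MyLoss(nn.Module):
    def forward(self, pred, target):
        return ((pred - target) ** 2).mean()
```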
Both approaches work fine. The nn.Module loss classes internally call the functional interface (torch.nn.functional).
I’d say it depends on the loss and the application, but plain functions are usually quicker to write and more concise.
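A quick check (a sketch, assuming a simple MSE-style loss) showing that both the function form and the nn.Module form produce identical gradients through the same model:

```python
import torch
import torch.nn as nn

def fn_loss(pred, target):
    return ((pred - target) ** 2).mean()

class ModuleLoss(nn.Module):
    def forward(self, pred, target):
        return ((pred - target) ** 2).mean()

model = nn.Linear(4, 1)
x = torch.randn(8, 4)
y = torch.randn(8, 1)

# Function version.
fn_loss(model(x), y).backward()
g1 = model.weight.grad.clone()
model.zero_grad()

# Module version on the same inputs and weights.
ModuleLoss()(model(x), y).backward()
g2 = model.weight.grad.clone()

assert torch.allclose(g1, g2)  # gradients flow identically in both cases
```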
Thank you for the quick reply.
While implementing the loss by inheriting from nn.Module, for some reason the gradients weren’t flowing backwards. Is there any particular reason for this? Do we need to declare any variables in the constructor for it to work? (Since an activation just takes the input and passes it through a non-linearity, I didn’t declare any variables in the constructor and only wrote the forward() method. Could this be the reason no gradients were calculated?)
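An empty constructor is not the cause: a parameter-free nn.Module with only a forward() method still passes gradients, as long as forward() uses differentiable tensor operations (gradients would only be cut by something like .detach(), .item(), converting to NumPy, or a torch.no_grad() block). A minimal check, using a hypothetical swish-style activation named `MySwish`:

```python
import torch
import torch.nn as nn

# Parameter-free activation: nothing declared in __init__,
# only forward() is defined.
class MySwish(nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)

x = torch.randn(5, requires_grad=True)
out = MySwish()(x).sum()
out.backward()
assert x.grad is not None  # gradients reached the input
```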