Best practice for custom loss functions

AahanSingh · May 17, 2018, 9:55am

In trying to implement a custom loss function would it better to:

Create a python function that takes in a tensor and calculates loss

OR

Implement by inheriting from the nn.Module class

Also since an activation function does not have any parameters as it is not a ‘layer’ how can we make sure that the gradients are passed through if we use method 2.
I had tried both methods and found that method 2 prevented passing of gradients. But using method 1 there is no problem with backprop gradient passing.

Thanks in advance

fmassa · May 17, 2018, 11:38am

Both approaches work fine.

Actually, the nn.Module internally uses the functional interface of PyTorch nn.

I’d say that it depends on the loss and the application, but usually having functions can be quicker to write and more concise.

AahanSingh · May 19, 2018, 8:54am

Thank you for the quick reply.

While implementing using inheritance from nn.Module, for some reason the gradients weren’t flowing backwards. Is there any particular reason for this? Do we need to declare any variables in the constructor for it to work? (Since activations take input and pass it through non-linearity, I didn’t declare any variables in the constructor. I had only written the forward() function for it. Could this be a reason for no gradients being calculated?)