[Solved] What is the correct way to implement custom loss function?

(raff) #1

I tried to implement my own custom loss based on the tutorial in extending autograd.
Here is the implementation outline:

class MyCustomLoss(Function):

    def forward(self, input, target):
        ... # implementation
        return loss   # a single number (averaged loss over batch samples)

    def backward(self, grad_output):
        ... # implementation
        return grad_input

The forward function takes an input from the previous layer and a target, which is an array of categorical labels (possible values {0,…,k-1}, where k is the number of classes).
In the backward function I compute the gradient of the loss with respect to the input. When I run it, I get an error saying that one more gradient is expected. I assume PyTorch also requires the gradient of the loss with respect to the target, which in this case does not really make sense (the target is a categorical variable), and is not needed to backpropagate the gradient.

Here’s my code to run the implementation

inp = Variable(torch.randn(10,10).double(), requires_grad=True)
target = Variable(torch.randperm(10), requires_grad=False)
loss = MyCustomLoss()(inp, target)

And here is the error message I get:

RuntimeError: MyCustomLoss returned an invalid number of gradient tensors (expected 2, but got 1)

Is there anything that I missed? How to correctly implement a custom loss?

Thank you.

(Spruce Bondera) #2

Return None for the gradients of inputs that don’t actually need them. So return grad_input, None.
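For instance, here is a minimal runnable sketch in the current (staticmethod) Function style. The loss itself is a made-up toy (MSE against a one-hot encoding of the target), just to show the gradient plumbing: backward returns one value per forward input, with None in the slot of the categorical target.

```python
import torch
from torch.autograd import Function

class MyCustomLoss(Function):
    @staticmethod
    def forward(ctx, input, target):
        # toy loss for illustration: MSE against a one-hot encoding of target
        onehot = torch.zeros_like(input)
        onehot.scatter_(1, target.unsqueeze(1), 1.0)
        ctx.save_for_backward(input, onehot)
        return ((input - onehot) ** 2).mean()

    @staticmethod
    def backward(ctx, grad_output):
        input, onehot = ctx.saved_tensors
        grad_input = 2.0 * (input - onehot) / input.numel()
        # one return value per forward input: None for the categorical target
        return grad_output * grad_input, None

inp = torch.randn(10, 10, dtype=torch.double, requires_grad=True)
target = torch.randperm(10)
loss = MyCustomLoss.apply(inp, target)
loss.backward()
```

Note that with the staticmethod style the Function is invoked via MyCustomLoss.apply(...), not by instantiating it.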

(raff) #3

Thank you. It seems to work.

(Heng Cher Keng) #4

Can I confirm that there are two ways to write a customized loss function:

  1. using nn.Module
    Build your own loss function in PyTorch
    Write Custom Loss Function

Here you need to write functions for __init__() and forward().
backward() is not required. But how do I indicate that the target does not need to compute a gradient?

  2. using Function (this post)

Here you need to write functions for both forward() and backward().
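To sketch option 1: an nn.Module loss only needs __init__() and forward(), and nothing special is required for the target. The class and loss below are hypothetical, just to illustrate the pattern:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneHotMSELoss(nn.Module):
    # hypothetical example: MSE against a one-hot encoding of the target
    def __init__(self, num_classes):
        super().__init__()
        self.num_classes = num_classes

    def forward(self, input, target):
        # target holds integer class indices; no flag is needed to exclude
        # it from backprop, since gradients only flow through tensors with
        # requires_grad=True, and an integer tensor never has that
        onehot = F.one_hot(target, self.num_classes).to(input.dtype)
        return ((input - onehot) ** 2).mean()

criterion = OneHotMSELoss(num_classes=10)
inp = torch.randn(4, 10, requires_grad=True)
target = torch.tensor([3, 1, 0, 7])
loss = criterion(inp, target)
loss.backward()  # backward is derived by autograd; none is written here
```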

(raff) #5

I also need to implement backward because I use some operations for which autograd’s automatic differentiation doesn’t work. If you use only standard operations, I think you do not need to implement the backward method.

(George Stamatescu) #7

Hi, I’m attempting to write my own custom loss function, for the log likelihood of a Gaussian, ie. least squares.
I’ve written the following:

class LSE_loss(torch.autograd.Function):
    @staticmethod
    def forward(ctx, mu, sigma, x):
        detsig = linalg.det(sigma.numpy())
        invsig = linalg.inv(sigma.numpy())
        result = 0.5*np.log(detsig) - 0.5 * np.transpose(x - mu.numpy(), (0, 2, 1)).matmul(invsig.matmul(x - mu.numpy()))
        ctx.save_for_backward(mu, sigma, x)
        return torch.FloatTensor(result)

    @staticmethod
    def backward(ctx, grad_output):
        x, mu, sigma = ctx.saved_variables
        invsig = linalg.pinv(sigma.numpy())
        grad_mu = invsig.matmul(x - mu.numpy())
        grad_sig = -0.5*(invsig - (invsig.matmul(x - mu.numpy())).matmul((x - mu.numpy()).matmul(invsig)))
        return torch.FloatTensor(grad_mu), torch.FloatTensor(grad_sig), None

But when I make an instance of the loss and call loss.backward(), I get the error "TypeError: backward() takes exactly 2 arguments (0 given)".

What am I doing wrong?

Ref for the formulae: http://www.notenoughthoughts.net/posts/normal-log-likelihood-gradient.html . I know computing the inverse of sigma isn’t ideal; open to suggestions for alternatives…

(Spandan Madan) #8

Here’s my example for how to create a custom loss function (along with several other important things in PyTorch). See if going through it is of any help!

(Shuokai Pan) #9

Hi George,

Have you solved your problem? I guess it may be because the variables in your forward method are all numpy arrays. The error message effectively says that no input arguments reached the backward method, meaning both ctx and grad_output are None. That in turn means the ctx.save_for_backward(mu, sigma, x) call did nothing during the forward pass. Changing mu, sigma and x to torch tensors or Variables could solve your problem.
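Building on that diagnosis: if the loss can be written entirely with torch operations, autograd derives the backward pass and no custom Function is needed at all. Here is a sketch under that assumption (gaussian_nll is a hypothetical helper, with additive constants dropped). Separately, note that a staticmethod Function must be called via LSE_loss.apply(mu, sigma, x), not on an instance.

```python
import torch

def gaussian_nll(mu, sigma, x):
    # negative log-likelihood of x under N(mu, sigma), constants dropped;
    # written entirely with torch ops so autograd derives all gradients
    diff = (x - mu).unsqueeze(-1)                  # (d,) -> (d, 1)
    invsig = torch.inverse(sigma)
    quad = diff.transpose(-2, -1) @ invsig @ diff  # (1, 1)
    return 0.5 * torch.logdet(sigma) + 0.5 * quad.squeeze()

mu = torch.zeros(3, requires_grad=True)
sigma = torch.eye(3, requires_grad=True)
x = torch.randn(3)
loss = gaussian_nll(mu, sigma, x)
loss.backward()  # mu.grad is -invsig @ (x - mu), matching the linked formulae
```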