[SOLVED] No training progress when using my own L1Loss function

I just implemented a simple L1 loss function as a first step toward more complex loss functions later. I checked the result of my own L1 loss function against nn.L1Loss and the results are similar. But unfortunately, when I use my own L1 loss to train a simple network, there is no training progress, while using nn.L1Loss works fine.

Here is the simple L1Loss implementation:

from torch.nn import Module
import torch

class MyLoss(Module):
    def forward(self, x, y):
        """
        x, y: both have (batch_size, input_size) dimensions
        """
        d = y - x
        d *= 1 / y.size()[1]   # divide by input_size (in place)
        d = d.abs()
        return torch.sum(d) / y.size()[0]  # average over the batch
        

Here is the test case:

from torch.autograd import Variable
import torch
from torch import nn
from loss import MyLoss
import unittest

class TestLoss(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.criterion = MyLoss()
        cls.l1loss = nn.L1Loss()

    def test_loss(self):
        y = Variable(torch.Tensor([[2, 1, 0, -1, -2], [2, 1, 0, -1, -2]]))
        x = Variable(torch.Tensor([[1, 1, 1, 1, 1], [5, 4, 1, 1, 1]]))
        # compare floats approximately rather than bit-exactly
        self.assertAlmostEqual(self.l1loss(x, y).data[0],
                               self.criterion(x, y).data[0])

if __name__ == '__main__':
    unittest.main()

Have you implemented an __init__ for your loss?

Thanks for your reply.

No, I didn’t implement the __init__ method. I just added one, but it still doesn’t work.

class MyLoss(Module):
    def __init__(self):
        super(MyLoss, self).__init__()

    def forward(self, x, y):
        d = y - x
        d *= 1 / y.size()[1]
        d = d.abs()
        return torch.sum(d) / y.size()[0]

It seems I found the problem: gradients vanish after a few training iterations when I use my own L1Loss. But when using nn.L1Loss there is no problem.

I’m not entirely sure what could be causing this. It seems that the L1 loss implemented by PyTorch uses torch.mean rather than division by size (mathematically equivalent here, since the mean over all elements equals the sum divided by batch_size * input_size). Have a look at the functions _pointwise_loss and l1_loss here:
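
For what it’s worth, a mean-based variant along those lines might look like this (a minimal sketch; MyMeanLoss is a hypothetical name, and the in-place multiply is dropped as well):

from torch.nn import Module
import torch

class MyMeanLoss(Module):
    def forward(self, x, y):
        # torch.mean averages over all elements, which equals
        # sum(|y - x|) / (batch_size * input_size)
        return torch.mean((y - x).abs())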

Thanks,

Finally I used pairwise_distance instead of the minus operator, and now the code works fine.
Here is the working code:

import torch
from torch import nn
from torch.nn.functional import pairwise_distance

class CrossOverLoss(nn.Module):
    def __init__(self):
        super(CrossOverLoss, self).__init__()

    def forward(self, x, y):
        batch_size, n = y.size()
        x = x.squeeze()
        # row-wise distance between x and y (p=2 by default)
        d = pairwise_distance(x, y)
        return torch.sum(d) / batch_size  # average over the batch
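
Note that pairwise_distance defaults to p=2 (Euclidean distance), so this is no longer a pure L1 loss; passing p=1 would keep the absolute-difference behaviour. A quick sanity check might look like this (a hypothetical snippet, reusing the tensors from the test case above):

from torch.autograd import Variable
import torch

criterion = CrossOverLoss()
y = Variable(torch.Tensor([[2, 1, 0, -1, -2], [2, 1, 0, -1, -2]]))
x = Variable(torch.Tensor([[1, 1, 1, 1, 1], [5, 4, 1, 1, 1]]))
loss = criterion(x, y)  # mean of the row-wise distances
print(loss.data[0])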