Custom loss function does not seem to work

Hi,

I have an output vector and a ground truth vector. I want to minimize the cosine distance between them, but I have a constraint: I want the output vector to have an L2-norm of 1, so I created the following custom loss:

class MultiCosineLoss(nn.Module):
    def __init__(self):
        super(MultiCosineLoss, self).__init__()

    def forward(self, outputs, targets):
        # Cosine term: a target of 1 asks CosineEmbeddingLoss to pull each pair together
        cosine_target = Variable(torch.ones(len(outputs)).cuda())
        loss_func = nn.CosineEmbeddingLoss().cuda()
        loss1 = loss_func(outputs, targets, cosine_target)

        # Norm term: L1 distance between each vector's L2 norm and 1
        loss_func2 = nn.L1Loss().cuda()
        outputs_normed = outputs.norm(p=2, dim=1)                # shape: (batch,)
        norm_target = Variable(torch.ones(len(outputs)).cuda())  # same shape as outputs_normed
        loss2 = loss_func2(outputs_normed, norm_target)

        return loss1 + loss2
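As a quick sanity check, the same combined loss can be exercised outside the model, as a sketch assuming a recent PyTorch where `Variable` is no longer needed and CPU tensors are fine; the shapes follow the batch of 32 vectors of 60 dims described below:

```python
import torch
import torch.nn as nn

# Dummy batch: 32 output vectors of 60 dims each, plus ground-truth vectors.
outputs = torch.randn(32, 60, requires_grad=True)
targets = torch.randn(32, 60)

# Cosine term: a target of 1 means "maximize cosine similarity".
cosine_target = torch.ones(32)
loss1 = nn.CosineEmbeddingLoss()(outputs, targets, cosine_target)

# Norm term: both arguments to L1Loss must have the same shape, here (32,).
outputs_normed = outputs.norm(p=2, dim=1)
loss2 = nn.L1Loss()(outputs_normed, torch.ones(32))

loss = loss1 + loss2
loss.backward()  # gradients flow through both terms
```

If the two target tensors have mismatched shapes (e.g. `(32,)` vs `(32, 1)`), the L1 term can silently broadcast to a `32 x 32` comparison, which is one common way a loss like this stops the network from learning.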

Outputs is a batch of size 32, where each element is a 60-dim vector.
So I am trying to minimize the cosine distance + |1 - norm(output)|.
But when I add the L1Loss the network does not learn anything, while without the L1Loss I get an accuracy of 98%.
So I guess that I am not using the L1 loss appropriately.
Can you please advise how I can solve it?

Thanks!

Any suggestions?

Thanks

Try looking at the values that the different losses take. I did not try to reproduce your code or anything, but when I use multiple losses like this, I usually do it like:

loss1 + alpha * loss2 + beta * loss3 + …

alpha and beta are set in the init and should be tuned as hyperparameters. Hope that helps.
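For example, a minimal sketch of that pattern (the `WeightedMultiLoss` name and the default weights here are made up for illustration):

```python
import torch.nn as nn

class WeightedMultiLoss(nn.Module):
    def __init__(self, alpha=0.5, beta=0.1):
        super().__init__()
        # Weights are fixed in __init__ and tuned as hyperparameters,
        # e.g. via a grid search on the validation set.
        self.alpha = alpha
        self.beta = beta

    def forward(self, loss1, loss2, loss3):
        return loss1 + self.alpha * loss2 + self.beta * loss3
```

Logging each term separately during training also shows quickly whether one loss is orders of magnitude larger and drowning out the others.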

Thanks, but I have already tried it…
Waiting for more suggestions.

It looks like you’re mixing up L1 and L2 losses and norms, both in your description and in your code (although it’s a bit hard to tell since the variable names are inconsistent).
If you want your vectors to be L2-normalized, the corresponding loss is MSELoss.
If you want your vectors to be L1-normalized then the p in output1.norm should be 1.

The main idea is that I want to minimize the cosine distance and also have an L2-norm of 1.
So there are 2 stages:
output is a batch of vectors, so first I calculate the L2-norm of each vector:

outputs_normed = outputs.norm(p=2, dim=1)

and then I want to impose the constraint that each norm should be 1, so I use:

loss_func2 = nn.L1Loss().cuda()
norm_target = Variable(torch.ones(len(outputs)).cuda())  # shape (batch,), matching outputs_normed
loss2 = loss_func2(outputs_normed, norm_target)

So please notice that i am calculating:
| 1 - l2_norm(output) |

So I actually want every vector in output to be a unit vector; that is why I mix the L2 norm and the L1 loss.
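To make that concrete, here is the penalty written out for a single vector (a pure-Python sketch; `unit_norm_penalty` is just an illustrative name):

```python
import math

def unit_norm_penalty(vec):
    """L1 distance between the vector's L2 norm and 1: |1 - ||vec||_2|."""
    l2_norm = math.sqrt(sum(x * x for x in vec))
    return abs(1.0 - l2_norm)

unit_norm_penalty([1.0, 0.0, 0.0])  # unit vector -> penalty 0.0
unit_norm_penalty([3.0, 4.0])       # norm is 5  -> penalty 4.0
```

So the "L1" here refers only to the outer absolute difference; the norm being constrained is still the L2 norm.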