How to create compound loss MSE + L1-norm regularization

Hello!

I am trying to create a compound loss function where the first part is MSELoss and the second part is an L1-norm regularization over the model's parameters.

The first part is simple:

MSEloss = nn.MSELoss()
loss = MSEloss(rec_x, x)

But how do I attach the second part?
I appreciate your help!


You can use nn.L1Loss. See this post: Simple L2 regularization?

Thank you for your answer! I didn't mean L1 loss (which compares predicted and actual values); I need L1-norm regularization. My goal is to penalize high weights of the model.

You can compute L1 regularization manually like this:

regularization_loss = 0
for param in model.parameters():
    regularization_loss += torch.sum(torch.abs(param))
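
To attach it to the MSE term, you can simply add the scaled sum to the data loss before calling backward(). A minimal sketch, assuming the model, x, and rec_x from the original post and a hypothetical regularization strength l1_lambda:

import torch
import torch.nn as nn

MSEloss = nn.MSELoss()
l1_lambda = 1e-4  # hypothetical regularization strength

rec_x = model(x)
loss = MSEloss(rec_x, x)

# L1 norm of all parameters
regularization_loss = 0
for param in model.parameters():
    regularization_loss += torch.sum(torch.abs(param))

# compound loss: MSE + scaled L1 regularization
loss = loss + l1_lambda * regularization_loss
loss.backward()

Scaling by l1_lambda lets you control how strongly large weights are penalized relative to the reconstruction error.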

But shouldn't it be backpropagated?

Since the MSE loss is certainly backpropagated with respect to the weights, the L1-norm regularization should be backpropagated in the same way.

It will be backpropagated. If you are using PyTorch 0.4.0, you can check it like this:

## x in range [0, 1]
x = torch.rand(3,2,requires_grad=True)
loss = torch.sum(torch.abs(x))
loss.backward()
## gradient should be all ones
x.grad

Thanks for your reply!

The problem is that I am obligated to use PyTorch 0.3.0, and it seems this version doesn't support requires_grad on Tensors.

So I think I should somehow wrap the data into a Variable (http://pytorch.org/docs/0.3.0/autograd.html), but I don't know how to do this.

Try this if you are using version 0.3.X.

import torch
from torch.autograd import Variable
## x in range [0, 1]
x = torch.rand(3,2)
x = Variable(x, requires_grad=True)
loss = torch.sum(torch.abs(x))
loss.backward()
## gradient should be all ones
x.grad.data

So let's get back to my task: I have to combine the MSE loss with L1-norm regularization (over all layers' weights).

I know how to iterate over all layers:

    for name, W in model.named_parameters():
            l1 = W.norm(p=1)

But how do I add all the weights into a Variable?

The following attempts don't work:

    x = Variable(self.enc_linear1, requires_grad=True)
    x = Variable(W, requires_grad=True)

Does it look right?

 l1_reg = Variable(torch.zeros(1), requires_grad=True)  # start from zeros; an uninitialized FloatTensor(1) would add a garbage value

 for name, W in self.named_parameters():
     l1_reg = l1_reg + W.norm(1)

I am trying to do the same thing as in your question, so I wrote the following code, but it doesn't work. Have you found any solution?

loss_func = t.nn.MSELoss()
optimizer = t.optim.SGD(net.parameters(), lr)

# train the neural network
for epoch in range(EPOCH):
    for i, data in enumerate(train_loader):
        inputs, labels = data
        inputs, labels = Variable(inputs), Variable(labels)
        prediction = net(inputs)
        loss = loss_func(prediction, labels)
        for name, param in net.named_parameters():
            if 'weight' in name:
                L1_1 = Variable(param, requires_grad=True)
                L1_2 = t.norm(L1_1, 1)
                L1_3 = L1_lambda * L1_2
                loss = loss + L1_3
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

I'm trying to do the same thing you did. Were you able to solve the problem?
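
One likely culprit in the loop above is the line L1_1 = Variable(param, requires_grad=True): wrapping a parameter in a fresh Variable gives autograd a new leaf, so the gradient of the L1 term does not propagate back into net's weights. The parameters can be used directly, since they already require gradients. A minimal sketch under that assumption, reusing net, train_loader, loss_func, optimizer, EPOCH, and L1_lambda from the post above:

# same setup as above: net, train_loader, loss_func, optimizer, EPOCH, L1_lambda
for epoch in range(EPOCH):
    for i, data in enumerate(train_loader):
        inputs, labels = data
        inputs, labels = Variable(inputs), Variable(labels)
        prediction = net(inputs)
        loss = loss_func(prediction, labels)
        # accumulate the L1 norm of the weight matrices, using the parameters directly
        l1_reg = 0
        for name, param in net.named_parameters():
            if 'weight' in name:
                l1_reg = l1_reg + param.norm(1)
        loss = loss + L1_lambda * l1_reg
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()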