How to add an L1 regularization term to the MSELoss

I want to add L1 regularization to the MSELoss function to get a sparse net model, so I wrote the following code. However, I cannot see any of the network weights shrinking to zero. As far as I know, L1 regularization should drive more weights to zero. So what's the problem? Is there a mistake in my code?

import numpy as np
import torch as t
from torch.autograd import Variable

loss_func = t.nn.MSELoss()
optimizer = t.optim.SGD(net.parameters(), lr)

for epoch in range(EPOCH):
    for i, data in enumerate(train_loader):
        inputs, labels = data
        inputs, labels = Variable(inputs), Variable(labels)
        prediction = net(inputs)
        loss = loss_func(prediction, labels)
        # take the weight tensors of the first and second layers
        param = list(net.named_parameters())
        param0 = np.array(param[0])[1].data
        param1 = np.array(param[2])[1].data
        loss = L1_lambda * (t.norm(param0, 1) + t.norm(param1, 1)) + loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

You are using .data, which returns the underlying values of the tensors detached from the computation graph. As a result, the L1 term and its gradients cannot be back-propagated to the network.
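
For illustration, here is a minimal sketch of the effect (assuming a reasonably recent PyTorch; w stands in for one of the network's weight tensors):

import torch as t

w = t.nn.Parameter(t.randn(3))

l1 = t.norm(w.data, 1)   # .data is cut off from autograd
print(l1.requires_grad)  # False: this term contributes no gradient to w

l1 = t.norm(w, 1)        # the parameter itself stays in the graph
print(l1.requires_grad)  # True: gradients flow back to w
l1.backward()
print(w.grad)            # sign(w), the L1 subgradient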


Thanks for your reply. What can I do so that they can be back-propagated to the network?

Do not convert them to NumPy arrays and do not use .data. Try passing the original parameter tensors to the t.norm() function.
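
For example, here is a minimal sketch of the corrected loop (keeping the rest of your original code; penalizing only the weight tensors at indices 0 and 2 is carried over from your question):

for epoch in range(EPOCH):
    for i, data in enumerate(train_loader):
        inputs, labels = data
        inputs, labels = Variable(inputs), Variable(labels)
        prediction = net(inputs)
        loss = loss_func(prediction, labels)
        # pass the parameter tensors to t.norm() directly, without
        # .data or NumPy, so the L1 term stays in the graph
        params = list(net.parameters())
        l1_term = t.norm(params[0], 1) + t.norm(params[2], 1)
        loss = loss + L1_lambda * l1_term
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Note also that plain SGD on an L1 penalty only pushes weights toward zero via the subgradient; it rarely makes them exactly zero, so you will typically see many small weights rather than exact zeros.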
