Hello, I want to add L1 regularisation to my model, but I am confused: should I also add the weight penalty when calculating the loss in the test phase?

in train:
loss = criterion(output, y)
parameters = [parameter.view(-1) for parameter in model.parameters()]
l1 = l1_lambda * model.compute_l1_loss(torch.cat(parameters))
loss += l1  # Add L1 loss component
in test:
should I calculate the loss like this?:
loss = criterion(output, y)
parameters = [parameter.view(-1) for parameter in model.parameters()]
l1 = l1_lambda * model.compute_l1_loss(torch.cat(parameters))
loss += l1  # Add L1 loss component
or like this?
loss = criterion(output, y)

You don’t need to add it when testing your model. The penalty is applied during training to regularize the parameter updates, and no parameter updates happen during testing.
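To make this concrete, here is a minimal sketch of the pattern, assuming a toy model and loss; the penalty is added inside the training step only, while the test step uses the plain criterion. The model, data, and `l1_lambda` value here are placeholders, not from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)          # placeholder model
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
l1_lambda = 1e-5                 # placeholder regularization strength

x, y = torch.randn(8, 4), torch.randn(8, 2)

# --- training step: L1 penalty included in the loss ---
model.train()
output = model(x)
loss = criterion(output, y)
params = torch.cat([p.view(-1) for p in model.parameters()])
loss = loss + l1_lambda * params.abs().sum()  # add L1 penalty
optimizer.zero_grad()
loss.backward()
optimizer.step()

# --- test step: plain criterion, no penalty, no gradients ---
model.eval()
with torch.no_grad():
    test_loss = criterion(model(x), y)
```

The train-time `loss` drives the parameter updates (where the penalty matters); `test_loss` only measures predictive quality, so the penalty is left out.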

I think the second one is the way to go. But it only matters if you’re going to use the loss as an early-stopping metric. In classification, the loss is directly related to the error rate (it also accounts for the margin by which the classifier predicts the correct output), so the model with the lower loss is the better estimator of class probabilities. Regularization terms, e.g. L1 or a max-entropy regularizer (label smoothing), are not related to the error rate in any way.
This is my take.
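If you do use the loss for early stopping, the point above suggests comparing the unregularized validation loss across epochs. A hypothetical tracker (the class name, `patience` default, and loss values below are made up for illustration) could look like:

```python
# Hypothetical early-stopping tracker: compares the *unregularized*
# validation loss, since the L1 term says nothing about error rate.
class EarlyStopper:
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopper(patience=2)
losses = [0.9, 0.7, 0.71, 0.72, 0.73]  # plain criterion(output, y) per epoch
stops = [stopper.step(l) for l in losses]
# stops == [False, False, False, True, True]
```

Here `val_loss` would be the second option from the question (criterion only), so the stopping decision is not distorted by the size of the penalty term.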

How do you compute and add the regularization term?
It should be something like this:

loss = criterion(output, y) + l1_lambda * L1(parameters)

where l1_lambda is a small number, e.g. 1e-5.
Also run it without any regularization to see whether something else may be causing the problem (like different batch sizes in train and test).
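As a worked instance of that formula, assuming `L1(parameters)` means the sum of absolute parameter values (the tensors below are made-up numbers, chosen so the result is easy to check by hand):

```python
import torch

criterion = torch.nn.MSELoss()
l1_lambda = 1e-5  # small regularization strength, as suggested above

output = torch.tensor([1.0, 2.0])
y = torch.tensor([1.5, 2.5])
parameters = torch.tensor([0.5, -0.25, 1.0])

l1 = parameters.abs().sum()  # |0.5| + |-0.25| + |1.0| = 1.75
loss = criterion(output, y) + l1_lambda * l1
# MSE = ((0.5)^2 + (0.5)^2) / 2 = 0.25, so loss = 0.25 + 1e-5 * 1.75
```

With `l1_lambda = 1e-5` the penalty barely moves the total loss, which is why it mainly matters through its gradient during training rather than as a test-time metric.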