Hi,
L2 regularization on the model parameters is already built into most optimizers, including optim.SGD,
and can be controlled with the weight_decay
parameter, as described in the SGD documentation.
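For example, a minimal sketch (the model here is a hypothetical placeholder):

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical small model, just for illustration.
model = nn.Linear(10, 1)

# weight_decay adds an L2 penalty inside the optimizer step:
# the gradient becomes grad + weight_decay * param.
optimizer = optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```

No change to your loss computation is needed; the penalty is applied during the optimizer's update.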
L1 regularization is not included by default in the optimizers, but it can be added by computing an extra penalty on the weights of the model, e.g. with nn.L1Loss:
l1_crit = nn.L1Loss(reduction='sum')  # size_average=False is deprecated
reg_loss = 0
for param in model.parameters():
    # L1Loss needs a target; comparing against zeros gives sum(|param|)
    reg_loss += l1_crit(param, torch.zeros_like(param))

factor = 0.0005
loss = loss + factor * reg_loss
Note that this might not be the best way of enforcing sparsity on the model, though.
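As a side note, the same penalty can be computed more directly without instantiating a criterion at all, since the L1 penalty is just the sum of absolute parameter values (again using a hypothetical placeholder model):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # hypothetical model for illustration
factor = 0.0005

# sum of |w| over all parameters == the L1 penalty
l1_penalty = sum(p.abs().sum() for p in model.parameters())
# then, in the training loop:
# loss = loss + factor * l1_penalty
```

This avoids allocating a zeros tensor for every parameter on each step.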