Weight decay together with L2 regularization baked into loss function

Based on what I have been reading here, one can get L2 regularization by passing a nonzero value to the optimizer through the weight_decay argument.

Yet, one may implement a custom loss function like this one where the L2 regularization is already taken into account:

import torch

class AutoRec_Loss(torch.nn.Module):

    def __init__(self):
        super().__init__()

    def forward(self, predicted_ratings, real_ratings, weights, reg_strength):
        # Reconstruction error between observed and predicted ratings
        ratings_loss = torch.norm(real_ratings - predicted_ratings)
        # L2 regularization
        weights_regularization = (reg_strength / 2) * torch.norm(weights)
        return ratings_loss + weights_regularization

What would happen if I also set weight_decay to a value other than 0 in the underlying optimizer, given this loss function?


It will just increase the L2 regularization even more. Your loss would effectively be loss = ratings_loss + weights_regularization + weight_decay * weight_norm.
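
To make the stacking concrete, here is a minimal sketch under a few assumptions: plain torch.optim.SGD, a squared L2 penalty (reg_strength / 2) * ||w||^2 in the loss, and a made-up data_loss standing in for ratings_loss. In that setting, weight_decay adds weight_decay * w to each gradient, which is the same as adding (weight_decay / 2) * ||w||^2 to the loss, so the two coefficients simply sum. With the unsquared torch.norm from the question, both penalties are still applied, but they no longer collapse into a single coefficient.

```python
import torch

torch.manual_seed(0)
w0 = torch.randn(5)   # initial weights (illustrative)
x = torch.randn(5)    # illustrative input
lr, reg_strength, weight_decay = 0.1, 0.3, 0.2

def data_loss(w):
    # Hypothetical stand-in for ratings_loss
    return (w * x).sum() ** 2

# (a) Explicit squared-L2 penalty in the loss AND weight_decay in the optimizer
w_a = w0.clone().requires_grad_(True)
opt_a = torch.optim.SGD([w_a], lr=lr, weight_decay=weight_decay)
loss_a = data_loss(w_a) + (reg_strength / 2) * w_a.pow(2).sum()
opt_a.zero_grad()
loss_a.backward()
opt_a.step()

# (b) One combined penalty in the loss, no weight_decay in the optimizer
w_b = w0.clone().requires_grad_(True)
opt_b = torch.optim.SGD([w_b], lr=lr)
loss_b = data_loss(w_b) + ((reg_strength + weight_decay) / 2) * w_b.pow(2).sum()
opt_b.zero_grad()
loss_b.backward()
opt_b.step()

# Both setups should take the same step, up to floating-point error
same_update = torch.allclose(w_a, w_b, atol=1e-6)
print(same_update)
```

Optimizers with decoupled weight decay, such as torch.optim.AdamW, apply the decay outside the gradient, so this equivalence should not be assumed there.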


Hi @albanD,

Makes sense! Thank you for your clarification :wink:

May I ask: in weight decay, is weight_norm in your formula the L2 norm?