First Linear Layer Weight Regularization - seeking clarification

I was wondering how to apply L1 weight regularization to only the first Linear layer, for feature engineering. Out of curiosity, I want to see what an MLP thinks the top N features are. I read this post

I’m probably mistaken, but this seems wrong: past answers recommend for W in model.parameters(), so in my case, where my model’s first Linear layer is named L1, it would be for W in model.L1.parameters(). But this includes the bias term!
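A quick check makes the concern concrete. This is only a minimal sketch: the class name MLP, the second layer L2, and the layer sizes are made up for illustration; the only thing taken from my setup is that the first Linear layer is stored as the attribute L1.

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    """Hypothetical two-layer MLP; only the attribute name L1 matters here."""

    def __init__(self, n_in=10, n_hidden=4, n_out=2):
        super().__init__()
        self.L1 = nn.Linear(n_in, n_hidden)   # first Linear layer
        self.L2 = nn.Linear(n_hidden, n_out)  # assumed second layer

    def forward(self, x):
        return self.L2(torch.relu(self.L1(x)))


model = MLP()

# Iterating over model.L1.parameters() yields both tensors listed here,
# so a loop over it penalizes the bias as well as the weight.
param_names = [name for name, _ in model.L1.named_parameters()]
print(param_names)  # ['weight', 'bias']
```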

Most posts are guilty of this; however, I saw one that is in line with my expectation.

So what’s going on here, and who is mistaken? I think that regularizing the bias probably isn’t too bad: its contribution will be tiny, and it shouldn’t matter if normalization is applied directly afterwards.

See if this works for you (applying L1 regularization for layer L1):

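import torch
L1_reg = 0.0  # running penalty; scale it and add it to your loss afterwards
for name, param in model.named_parameters():
    # match only the weight of the layer named L1, skipping its bias
    if 'L1' in name and 'weight' in name:
        L1_reg = L1_reg + torch.norm(param, 1)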

This can also be modified for L2 regularization.
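For completeness, here is a rough sketch of how the penalty might be wired into one training step and then used to rank input features. Everything besides the layer name L1 is an assumption for illustration: the MLP class, the random data, the SGD optimizer, the strength lam, and the helper first_layer_l1 are all hypothetical. Ranking features by the summed absolute weights of their column in L1.weight is a heuristic, and it is only meaningful if the inputs are on comparable scales.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class MLP(nn.Module):
    """Hypothetical model whose first Linear layer is the attribute L1."""

    def __init__(self, n_in=10, n_hidden=4, n_out=2):
        super().__init__()
        self.L1 = nn.Linear(n_in, n_hidden)
        self.L2 = nn.Linear(n_hidden, n_out)

    def forward(self, x):
        return self.L2(torch.relu(self.L1(x)))


def first_layer_l1(model):
    """L1 penalty on the weight of model.L1 only; the bias is excluded."""
    return model.L1.weight.abs().sum()


model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
lam = 1e-3  # assumed regularization strength, tune for your problem

# Random stand-in data: 32 samples, 10 features, 2 classes
x = torch.randn(32, 10)
y = torch.randint(0, 2, (32,))

# One training step with the penalty added to the task loss
optimizer.zero_grad()
loss = criterion(model(x), y) + lam * first_layer_l1(model)
loss.backward()
optimizer.step()

# L1.weight has shape (n_hidden, n_in), so summing absolute values over
# dim 0 gives one score per input feature.
importance = model.L1.weight.detach().abs().sum(dim=0)
top_n = torch.topk(importance, k=3).indices  # indices of the top 3 features
print(top_n)
```

For an L2 penalty on the same tensor, model.L1.weight.pow(2).sum() could be swapped in for the body of first_layer_l1.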