Hi, I’m a beginner at PyTorch. I figured out how to apply L2 regularization using the weight_decay argument of the optimizers:
optimizer = optim.SGD(net.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
However, I’m not sure how to give different L2 regularization factors to different parameters. Suppose the model has two convolution layers, and I want a factor of 1e-4 on the first layer’s weights and 2e-4 on the second layer’s weights.
Would I then have to use two optimizers, optimizer1 for the first layer’s parameters and optimizer2 for the second layer’s? Or is there a way to exclude specific parameters from L2 regularization and to give different parameters different L2 factors?
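
To make the question concrete, here is a rough sketch of what I am after, assuming the per-parameter options of torch.optim (a list of dicts, each with its own "params" and overrides) can be used for this. The Net class and the layer names conv1 and conv2 are made up for illustration, and leaving the biases without weight decay is just my guess at a sensible default, not something I have confirmed.

```python
import torch.nn as nn
import torch.optim as optim


# Hypothetical two-conv-layer model, only for illustration.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3)

    def forward(self, x):
        return self.conv2(self.conv1(x))


net = Net()

# One optimizer, three parameter groups. Each dict may override the
# defaults given as keyword arguments (here only weight_decay is overridden).
optimizer = optim.SGD(
    [
        {"params": [net.conv1.weight], "weight_decay": 1e-4},
        {"params": [net.conv2.weight], "weight_decay": 2e-4},
        # Assumption: biases are excluded from L2 by setting the factor to 0.
        {"params": [net.conv1.bias, net.conv2.bias], "weight_decay": 0.0},
    ],
    lr=0.1,
    momentum=0.9,
)

# Inspect the effective settings of each group.
for group in optimizer.param_groups:
    print(len(group["params"]), group["lr"], group["weight_decay"])
```

If this is right, lr and momentum are shared by all groups, and only weight_decay differs per group, so a second optimizer would not be needed.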