I see in most Caffe prototxt files that, for conv layers, the lr for the convolution kernel and the lr for the bias are different (the lr for the bias is usually 2x the lr for the kernel). Does it make sense to assign lrs like this? If so, how could I do that in PyTorch? I know I could pass the optimizer different parameter groups, but that is not very convenient when there are many conv layers, since I would have to construct a tedious optimizer parameter list by hand. Can I assign an lr-multiplying ratio when defining the model structure, like in Caffe?
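To clarify what I mean by parameter groups: something like the sketch below, which splits weights and biases automatically by name instead of listing each layer by hand (the model and `base_lr` here are just placeholders):

```python
import torch
import torch.nn as nn

# toy model standing in for a real network with many conv layers
model = nn.Sequential(
    nn.Conv2d(3, 16, 3),
    nn.ReLU(),
    nn.Conv2d(16, 32, 3),
)

base_lr = 0.01
weight_params, bias_params = [], []
for name, param in model.named_parameters():
    # route every parameter named "...bias" into its own group
    (bias_params if name.endswith("bias") else weight_params).append(param)

optimizer = torch.optim.SGD(
    [
        {"params": weight_params, "lr": base_lr},
        {"params": bias_params, "lr": 2 * base_lr},  # Caffe-style 2x lr for biases
    ],
    momentum=0.9,
)
```

This avoids writing one entry per layer, but it still lives in the training script rather than in the model definition, which is what I was hoping to avoid.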