```
optimizer = torch.optim.SGD(
    [{'params': net.parameters(), 'lr': 0.1},
     {'params': net.conv_r1.bias, 'lr': 0.0}],
    lr=0.1, momentum=0.9)
```

I used the above code to freeze net.conv_r1.bias, but got:

```
ValueError: some parameters appear in more than one parameter group
```

So, what should I do to freeze only net.conv_r1.bias while training all the other params in net?

THX!

Hi there!

I wonder: if you just delete net.parameters() from the SGD call, how could the SGD train the whole net.parameters(), since those parameters would no longer be in the SGD 'params'?

```
# Set conditional learning rates if necessary
model_parameters = []
for n, p in model.named_parameters():
    if 'layer_name' in n:
        model_parameters.append({'params': p, 'lr': LR})  # per-layer lr (use 0.0 to freeze)
    else:
        model_parameters.append({'params': p, 'lr': LR})
optimizer = torch.optim.SGD(model_parameters, lr=LR, weight_decay=WEIGHT_DECAY)
```
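For the freezing question specifically, here is a self-contained sketch of this pattern (the two-layer toy model and the '0.bias' name below are only stand-ins for the original net and conv_r1.bias): each parameter goes into exactly one group, and the parameter to freeze gets lr=0.0.

```python
import torch
import torch.nn as nn

# Stand-in model; in the original question this would be `net` with conv_r1
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))

# Build one group per parameter so no parameter is repeated across groups
param_groups = []
for name, p in model.named_parameters():
    if '0.bias' in name:                        # the parameter to freeze
        param_groups.append({'params': p, 'lr': 0.0})
    else:
        param_groups.append({'params': p, 'lr': 0.1})

optimizer = torch.optim.SGD(param_groups, lr=0.1, momentum=0.9)

# One training step: the frozen bias should not change
frozen_before = model[0].bias.detach().clone()
loss = model(torch.randn(3, 4)).sum()
loss.backward()
optimizer.step()
assert torch.equal(model[0].bias, frozen_before)
```

Since lr=0.0 multiplies the update, the frozen parameter stays unchanged even though its gradient is still computed.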


Thanks, juan.

But in this way,

```
optimizer = torch.optim.SGD(model_parameters, lr=LR, weight_decay=WEIGHT_DECAY)
```

The lr=LR passed to SGD() itself seems not to function, right? Since you have already pre-set an lr for every param.

That’s right, but I guess it’s a mandatory input.

Anyway, you can do it however you want; the point is that, as you can see, you cannot repeat a parameter across groups. Beyond that, you can arrange the groups as you prefer.


Thanks, I will try your method to solve this problem.

Thanks for the help :handshake:

Note that, if you have subnetworks, you can apply this to just one of them by calling model.subnetwork.named_parameters(). The way I presented the solution is the most general one.

For example, here is another (equivalent) solution, proposed by ptrblck:

```
optim.SGD(
    [
        {'params': [param for name, param in model.named_parameters()
                    if 'fc2' not in name]},
        {'params': model.fc2.parameters(), 'lr': 5e-3},
    ],
    lr=1e-2)
```

It filters the parameters with an if condition but does not set a learning rate for the first group; this way, any parameters without a specific learning rate will use the global one.
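As a quick sanity check of that pattern (using a toy model with an fc2 layer as a stand-in), the two groups end up with the expected learning rates, the first one inheriting the global lr:

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

model = Toy()

optimizer = torch.optim.SGD(
    [
        # everything except fc2 falls back to the global lr below
        {'params': [p for n, p in model.named_parameters() if 'fc2' not in n]},
        {'params': model.fc2.parameters(), 'lr': 5e-3},
    ],
    lr=1e-2,
)

print([g['lr'] for g in optimizer.param_groups])  # → [0.01, 0.005]
```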


Wow, thanks to both you and ptrblck; this method seems perfect!
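For completeness, a common alternative in standard PyTorch (not specific to the solutions above) is to freeze a parameter by disabling its gradient and passing only the trainable parameters to the optimizer; a sketch with a stand-in model:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(4, 4))  # stand-in for the original `net`

# Freeze one parameter directly; no duplicate parameter groups needed
net[0].bias.requires_grad_(False)

# Hand the optimizer only the parameters that still require gradients
optimizer = torch.optim.SGD(
    [p for p in net.parameters() if p.requires_grad], lr=0.1, momentum=0.9
)

# One training step: the frozen bias stays untouched
before = net[0].bias.detach().clone()
net(torch.randn(2, 4)).sum().backward()
optimizer.step()
assert torch.equal(net[0].bias, before)
```

This also skips computing the gradient for the frozen parameter, unlike the lr=0.0 approach.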