Hi, let’s say I have a convolution layer with weights of size [64, 64, 3, 3]. I want to give different learning rates to different parts of the kernel; actually, I want to freeze part of the kernel (for pure research). I first tried:
model.layer_name.weight[1:,:,:,:].requires_grad = False
This returned:
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
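For context, the workaround I’ve been considering instead (a minimal sketch on a standalone Conv2d, not my actual model; mask_frozen_slice is just a name I made up) is to leave requires_grad=True on the whole weight and zero out the gradient of the slice with a tensor hook:

import torch
import torch.nn as nn

conv = nn.Conv2d(64, 64, kernel_size=3, padding=1)

def mask_frozen_slice(grad):
    # A tensor hook may return a replacement gradient; clone first so
    # the gradient autograd computed isn't modified in place.
    grad = grad.clone()
    grad[1:, :, :, :] = 0  # "freeze" every output filter except the first
    return grad

conv.weight.register_hook(mask_frozen_slice)

# Sanity check: only filter 0 should receive a gradient.
x = torch.randn(1, 64, 8, 8)
conv(x).sum().backward()
print(conv.weight.grad[0].abs().sum(), conv.weight.grad[1:].abs().sum())

One thing I’m unsure about: with weight decay in the optimizer, SGD adds weight_decay * p to the gradient, so the masked slice would still shrink; I guess that parameter would need its own group with weight_decay=0.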
I also tried to work out how to adapt the usual pattern I use for giving different layers different learning rates:
import torch

parameters = []
ft_module_names = ['layername']
for k, v in model.named_parameters():
    for ft_module in ft_module_names:
        if ft_module in k:  # k is e.g. 'layername.weight', so test substring, not equality
            parameters.append({'params': v, 'lr': args.lr_new})
            break
    else:
        parameters.append({'params': v})

optimizer = torch.optim.SGD(parameters, args.lr,
                            momentum=args.sgd_momentum,
                            weight_decay=args.weight_decay)
But I couldn’t figure out how to modify this to support different learning rates inside the same layer, since named_parameters() only yields whole tensors.
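The only direction I’ve come up with so far (a rough sketch with a toy module; SplitKernelConv and the slicing are made up for illustration, and bias is omitted) is to store the kernel as two separate Parameters, concatenate them in forward, and give each piece its own param group:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitKernelConv(nn.Module):
    # The kernel lives as two leaf tensors, so each slice is a real
    # Parameter that can go into its own optimizer param group.
    def __init__(self):
        super().__init__()
        w = torch.empty(64, 64, 3, 3)
        nn.init.kaiming_uniform_(w, a=5 ** 0.5)  # default Conv2d init
        self.w_head = nn.Parameter(w[:1].clone())  # first output filter, trainable
        self.w_rest = nn.Parameter(w[1:].clone())  # the part I want frozen

    def forward(self, x):
        weight = torch.cat([self.w_head, self.w_rest], dim=0)
        return F.conv2d(x, weight, padding=1)

conv = SplitKernelConv()
optimizer = torch.optim.SGD(
    [{'params': [conv.w_head], 'lr': args.lr_new},
     {'params': [conv.w_rest], 'lr': 0.0}],  # lr 0 means SGD never updates it
    lr=args.lr, momentum=args.sgd_momentum, weight_decay=args.weight_decay)

But this changes the module structure and feels clunky, so I’m hoping there’s a cleaner way.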
I’d be glad to hear any ideas!
Thanks!