Freezing weights in PyTorch when using param_groups.

So if one wants to freeze weights during training, `requires_grad` is set to `False` on the parameters of the relevant submodule (here `child`, e.g. one entry of `model.children()`):

```
for param in child.parameters():
    param.requires_grad = False
```
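To make the freezing step concrete, here is a minimal sketch; the two-layer `nn.Sequential` model is a hypothetical stand-in, not from the original code:

```python
import torch.nn as nn

# Hypothetical two-layer model, used only for illustration.
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))

# Freeze every parameter of the first child; these tensors will
# receive no gradients during backprop.
for param in model[0].parameters():
    param.requires_grad = False

frozen = [p.requires_grad for p in model[0].parameters()]
trainable = [p.requires_grad for p in model[1].parameters()]
```

After this, `frozen` is all `False` while `trainable` is still all `True`.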

The optimizer also has to be updated so that it does not receive the frozen (non-gradient) parameters:

```
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=opt.lr, amsgrad=True)
```

If one wants to use a different weight_decay for biases and weights (this also allows differing learning rates per parameter group):

```
param_groups = [{'params': model.module.bias_parameters(), 'weight_decay': args.bias_decay},
                {'params': model.module.weight_parameters(), 'weight_decay': args.weight_decay}]
```

`param_groups`, a list of dicts, is defined and passed into the optimizer as follows:

```
optimizer = torch.optim.Adam(param_groups, args.lr,
                             betas=(args.momentum, args.beta))
```

**How can this be achieved together with freezing individual weights?** By running the filter over each dict in the list, or is there a way of adding tensors to the optimizer separately?
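One approach that seems to work (a sketch, not a confirmed answer): apply the `requires_grad` filter to each group's parameter list before building the optimizer, so frozen tensors never reach it. The model below and the `bias_parameters`/`weight_parameters` helpers are hypothetical stand-ins for the ones in the question:

```python
import torch
import torch.nn as nn

# Hypothetical model; the first layer is frozen as above.
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))
for param in model[0].parameters():
    param.requires_grad = False

# Stand-ins for the question's bias_parameters()/weight_parameters(),
# implemented here via a split on the parameter name.
def bias_parameters(m):
    return [p for n, p in m.named_parameters() if n.endswith('bias')]

def weight_parameters(m):
    return [p for n, p in m.named_parameters() if n.endswith('weight')]

# Filter each group separately so only trainable tensors are handed over.
param_groups = [
    {'params': [p for p in bias_parameters(model) if p.requires_grad],
     'weight_decay': 0.0},
    {'params': [p for p in weight_parameters(model) if p.requires_grad],
     'weight_decay': 1e-4},
]

optimizer = torch.optim.Adam(param_groups, lr=1e-3)
```

For adding tensors to an existing optimizer separately, PyTorch also exposes `optimizer.add_param_group({'params': [...]})`, which appends a new group after construction.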