How do I configure the optimizer so that it only trains the neck and head during warmup, and then trains the backbone as well during the full training?
You could freeze the backbone parameters by setting their .requires_grad attributes to False during the warmup iterations, and then back to True for the full training. This would make sure that these parameters won't get any gradients and the optimizer won't update them in its step() calls.
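A minimal sketch of this approach, assuming a hypothetical toy model whose `backbone` and `head` submodule names are illustrative only:

```python
import torch
import torch.nn as nn

# Hypothetical toy model; "backbone" and "head" are stand-in names.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)
        self.head = nn.Linear(8, 2)

    def forward(self, x):
        return self.head(self.backbone(x))

model = ToyModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Warmup: freeze the backbone so it receives no gradients.
for p in model.backbone.parameters():
    p.requires_grad = False

model(torch.randn(4, 8)).sum().backward()
optimizer.step()
# The backbone grads stay None, so the optimizer skips these parameters;
# only the head grads are populated and updated.

# Full training: unfreeze the backbone again.
for p in model.backbone.parameters():
    p.requires_grad = True
```

Note that the optimizer skips any parameter whose `.grad` is `None`, so frozen parameters are simply left untouched during warmup.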
Thank you for your answer. What about adding the backbone's param group to the optimizer after the warmup is finished? Which one is the better practice?
Both should work equally well and it might just depend on your personal coding style / preference.
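The alternative mentioned above can be sketched as follows; the separate `backbone`/`head` modules and the learning rates are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical modules standing in for the real backbone and head.
backbone = nn.Linear(8, 8)
head = nn.Linear(8, 2)

# During warmup the optimizer only knows about the head parameters.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

# ... run warmup iterations, training only the head ...

# After warmup, register the backbone parameters as a new param group,
# optionally with its own learning rate.
optimizer.add_param_group({"params": backbone.parameters(), "lr": 1e-4})
```

One practical advantage of `add_param_group` is that it lets you give the backbone its own hyperparameters (e.g. a lower learning rate) once it joins the training.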
I have the same problem, which is unfortunately still unsolved.
I have modelA as the backbone and modelB as an auxiliary output head, which means my whole model has two outputs. What I want to do is freeze the pre-trained backbone and train modelB only. But either way you suggested would touch the weights in the backbone. Any suggestions?
If modelB is "on top" of modelA, you could still freeze the parameters of modelA and just train modelB. Why would the modelA weights be touched?
Thanks for the reply. Yes, it is exactly as you said. Here is a piece of code, which may help locate the problem.
(BTW, the PyTorch version is 1.4.0)
# create graph and load weights for modelA
checkpoint = torch.load(model_path, map_location=device)
modelA = BackboneNet().to(device)
modelA.load_state_dict(checkpoint)
modelB = AuxNet().to(device)

# freeze modelA
for param in modelA.parameters():
    param.requires_grad = False

# assign only modelB's parameters to the optimizer
optimizer = torch.optim.Adam(modelB.parameters())
It seems like adding model.eval() after setting requires_grad works! Thanks @ptrblck for the help~
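This makes sense: setting `requires_grad=False` only stops gradient updates, but BatchNorm layers still update their running statistics during forward passes in train mode, so a frozen backbone can still change behavior unless it is put into eval mode. A small sketch of the effect, using a made-up Conv+BatchNorm backbone:

```python
import torch
import torch.nn as nn

# Hypothetical frozen backbone containing a BatchNorm layer.
bn_backbone = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
for p in bn_backbone.parameters():
    p.requires_grad = False

before = bn_backbone[1].running_mean.clone()

# In train mode the running stats are still updated by forward passes,
# even though all parameters are frozen.
bn_backbone.train()
bn_backbone(torch.randn(2, 3, 16, 16))
stats_changed = not torch.equal(before, bn_backbone[1].running_mean)

# In eval mode the running stats stay fixed.
frozen = bn_backbone[1].running_mean.clone()
bn_backbone.eval()
bn_backbone(torch.randn(2, 3, 16, 16))
stats_fixed = torch.equal(frozen, bn_backbone[1].running_mean)
```

So for a fully frozen backbone, combining `requires_grad = False` with `.eval()` on that submodule is the safe pattern.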