How to train neck and head only during warmup

How do I configure the optimizer so that it only trains the neck and head during warmup, and then trains the backbone as well once the full training starts?

You could freeze the backbone parameters by setting their .requires_grad attributes to False during the warmup iterations, and then back to True for the full training.
This makes sure that these parameters won't get any gradients and that the optimizer won't update them in its step() operation.
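
A minimal sketch of this approach, using toy backbone / neck / head modules and a placeholder warmup_iters; all names here are illustrative, not part of any fixed API:

import torch
import torch.nn as nn

# Toy stand-ins for the backbone / neck / head submodules of a real detector.
backbone = nn.Linear(8, 8)
neck = nn.Linear(8, 8)
head = nn.Linear(8, 2)
model = nn.Sequential(backbone, neck, head)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

warmup_iters = 100  # placeholder warmup length

def set_backbone_trainable(trainable):
    # With requires_grad=False the backbone receives no gradients,
    # so optimizer.step() leaves its weights unchanged.
    for p in backbone.parameters():
        p.requires_grad = trainable

set_backbone_trainable(False)          # warmup: only neck + head are updated

for it in range(200):
    if it == warmup_iters:
        set_backbone_trainable(True)   # full training: backbone joins in
    x = torch.randn(4, 8)
    loss = nn.functional.mse_loss(model(x), torch.randn(4, 2))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()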

Thank you for your answer. What about adding the backbone's param group to the optimizer after warmup is finished? Which one is the better practice?

Both should work equally well and it might just depend on your personal coding style / preference.
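
For the second option, a minimal sketch (again with toy modules and a placeholder warmup_iters) could create the optimizer with only the neck and head parameters and register the backbone as an extra param group once warmup ends:

import torch
import torch.nn as nn

backbone = nn.Linear(8, 8)
neck = nn.Linear(8, 8)
head = nn.Linear(8, 2)
model = nn.Sequential(backbone, neck, head)

# Start with only the neck and head parameters.
optimizer = torch.optim.SGD(
    [{'params': neck.parameters()},
     {'params': head.parameters()}],
    lr=0.01)

warmup_iters = 100  # placeholder warmup length

for it in range(200):
    if it == warmup_iters:
        # From now on optimizer.step() will also update the backbone,
        # here with its own (illustrative) learning rate.
        optimizer.add_param_group({'params': backbone.parameters(), 'lr': 0.001})
    x = torch.randn(4, 8)
    loss = nn.functional.mse_loss(model(x), torch.randn(4, 2))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

One difference to keep in mind: with this approach the backbone gradients are still computed during warmup (they are just never applied), whereas setting requires_grad=False also skips that backward computation.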

I have the same problem, which is unfortunately still unsolved :frowning:
I have modelA as the backbone and modelB as an auxiliary output, which means my whole model has two outputs. What I want to do is freeze the pre-trained backbone and train modelB only. But either way you suggested still touches the weights in the backbone. Any suggestion?

If modelB is “on top” of modelA, you still could freeze the parameters of modelA and just train modelB.
Why would the modelA weights be touched?

Thanks for the reply. Yes, it is exactly as you said. Here is a piece of code, which may help to locate the problem.
(BTW, the PyTorch version is 1.4.0)

import torch

# create the graph and load weights for modelA
checkpoint = torch.load(model_path, map_location=device)
modelA = BackboneNet().to(device)
modelA.load_state_dict(checkpoint['backbone'])
modelB = AuxNet().to(device)

# freeze modelA
for param in modelA.parameters():
    param.requires_grad = False

# assign modelB to train
optimizer = torch.optim.Adam(
    [{'params': modelB.parameters()}],
    lr=base_lr,
    weight_decay=weight_decay)

It seems like adding model.eval() after setting requires_grad works! Thanks @ptrblck for the help~
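
For reference, a minimal sketch of that combination, reusing modelA, modelB, base_lr and weight_decay from the snippet above: requires_grad=False keeps the optimizer from updating modelA, while modelA.eval() additionally keeps BatchNorm running statistics (and dropout) from changing during training.

for param in modelA.parameters():
    param.requires_grad = False   # no gradients -> optimizer never updates modelA
modelA.eval()                     # also freezes BatchNorm running stats and disables dropout

modelB.train()
optimizer = torch.optim.Adam(modelB.parameters(),
                             lr=base_lr,
                             weight_decay=weight_decay)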