Hi there. I’m trying to update the weights of multiple models at once, with the optimizer set up as:
optimizer = torch.optim.SGD(list(netA.parameters()) + list(netB.parameters()) + list(netC.parameters()), args.lr)
For the weight update, I called loss.backward() and checked that some_param.weight.grad is not None for every parameter. I then ran optimizer.step() and always find that only the parameters of netA are updated, but not the rest. Any idea what I'm missing here? Thanks!
I found the problem. The other models are actually based on the first-order gradient of the first model, so their gradients become very small and their updates are attenuated.
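
For anyone hitting something similar: a gradient that is not None can still be tiny, so it may help to print per-model gradient norms and the actual parameter change around optimizer.step(). Below is a minimal sketch, assuming three small nn.Linear layers as hypothetical stand-ins for netA, netB, and netC (the real architectures and loss aren't shown in this thread). grad_norm and max_param_change are illustrative helper names, not PyTorch APIs.

```python
import torch
import torch.nn as nn


def grad_norm(model):
    """Total L2 norm over all gradients of a model (0.0 if none are set)."""
    norms = [p.grad.norm() for p in model.parameters() if p.grad is not None]
    return torch.stack(norms).norm().item() if norms else 0.0


def max_param_change(model, before):
    """Largest absolute change of any parameter relative to a saved snapshot."""
    return max(
        (p.detach() - b).abs().max().item()
        for p, b in zip(model.parameters(), before)
    )


torch.manual_seed(0)

# Hypothetical stand-ins; replace with the real netA, netB, netC.
nets = {"netA": nn.Linear(4, 4), "netB": nn.Linear(4, 4), "netC": nn.Linear(4, 1)}
params = [p for net in nets.values() for p in net.parameters()]
optimizer = torch.optim.SGD(params, lr=0.1)

# Assumed toy loss that depends on all three models.
x = torch.randn(8, 4)
loss = nets["netC"](nets["netB"](nets["netA"](x))).pow(2).mean()

optimizer.zero_grad()
loss.backward()

# Snapshot parameters and record gradient norms before the update.
snapshots = {
    name: [p.detach().clone() for p in net.parameters()]
    for name, net in nets.items()
}
norms = {name: grad_norm(net) for name, net in nets.items()}

optimizer.step()

changes = {name: max_param_change(net, snapshots[name]) for name, net in nets.items()}

for name in nets:
    print(f"{name}: grad norm = {norms[name]:.3e}, max param change = {changes[name]:.3e}")
```

If one model's gradient norm is orders of magnitude smaller than the others, its parameters do change, but by so little that a casual before/after comparison can look like no update at all.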