Only update weights of one model in training

I have two models, A and B. The output of model A is fed to model B, and the loss is calculated from model B's output (example below):

opA = modelA(inp)
opB = modelB(opA)
loss = criterion(opB, GT)

I want to update the weights of model A only. How can I achieve this?

Create an optimizer with the parameters of modelA only, calculate the gradients using loss.backward(), and call optimizerA.step().
If you never want to update modelB, you could additionally set the requires_grad attribute of its parameters to False.
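
A minimal sketch of that setup, assuming modelA, modelB, criterion, inp, and GT are defined as in your snippet:

import torch

# Optionally freeze modelB so no gradients are computed or stored for it
for param in modelB.parameters():
    param.requires_grad = False

# The optimizer only holds modelA's parameters, so only they are updated
optimizerA = torch.optim.Adam(modelA.parameters())

optimizerA.zero_grad()
opA = modelA(inp)
opB = modelB(opA)          # gradients still flow *through* modelB back to modelA
loss = criterion(opB, GT)
loss.backward()
optimizerA.step()          # only modelA's weights change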


Thank you. But is this the same as setting modelB to eval mode?
And if I want to update the weights of both models, can I achieve this by creating two optimizers?

optimizer_A = torch.optim.Adam(modelA.parameters())
optimizer_B = torch.optim.Adam(modelB.parameters())

And in the training loop:
optimizer_A.zero_grad()
optimizer_B.zero_grad()
opA = modelA(inp)
opB = modelB(opA)
loss = criterion(opB, GT)
loss.backward()
optimizer_B.step()
optimizer_A.step()

No, setting the model to .eval() will change the behavior of some layers (e.g. dropout will be disabled and batchnorm layers will use their running stats) and will not change the behavior of the gradient calculation.
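
To illustrate the difference, a small self-contained check (nn.Linear here is just a hypothetical stand-in for modelB):

import torch
import torch.nn as nn

modelB = nn.Linear(4, 2)

# .eval() only switches layer behavior (dropout, batchnorm), not gradients
modelB.eval()
modelB(torch.randn(1, 4)).sum().backward()
print(modelB.weight.grad is None)   # False: gradients are still computed

# Setting requires_grad=False is what actually stops gradient accumulation
modelB.weight.grad = None
for p in modelB.parameters():
    p.requires_grad_(False)
x = torch.randn(1, 4, requires_grad=True)   # keep something in the graph
modelB(x).sum().backward()
print(modelB.weight.grad is None)   # True: the frozen parameters get no grad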

Yes, that works.
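
As a side note, a single optimizer over both parameter sets would behave the same here; a sketch, assuming the same modelA/modelB setup:

import torch

# One optimizer holding both parameter lists is equivalent in this case
optimizer = torch.optim.Adam(
    list(modelA.parameters()) + list(modelB.parameters())
)

optimizer.zero_grad()
opA = modelA(inp)
opB = modelB(opA)
loss = criterion(opB, GT)
loss.backward()
optimizer.step()   # updates both models in one call

Separate optimizers are still handy when the two models need different learning rates or optimizer settings.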
