I have three neural networks: `model1`, `model2`, and `model3`. I feed an input to `model1`, its output is fed to `model2`, and the output of `model2` is in turn fed to `model3`. I have two loss functions: one for `model2` alone, and a global loss that optimizes the weights of `model1` and `model3` only. `model2` is trained only for the first few epochs, after which it runs in forward propagation only.
I have defined the optimizers like below:

```python
global_opt = optim.Adam(list(model1.parameters()) + list(model3.parameters()))
model2_opt = optim.Adam(model2.parameters())
```
For the first few epochs I compute the following:

```python
global_opt.zero_grad()
model2_opt.zero_grad()

criterion = nn.MSELoss()
global_loss = criterion(obtained, expected)
global_loss.backward(retain_graph=True)  # keep the graph alive for the second backward

model2_crit = nn.BCELoss()
model2_loss = model2_crit(a, b)
model2_loss.backward()

global_opt.step()
model2_opt.step()
```
While executing these, all networks are set to training mode via `model1.train()`, `model2.train()`, and `model3.train()`.
After the first few epochs, I want to stop training `model2`. I can't simply set `model2.eval()`, because `model2` contains RNNs and the gradients still have to flow through `model2` back to `model1`.
So in the training loop I only have:

```python
global_opt.zero_grad()
criterion = nn.MSELoss()
global_loss = criterion(obtained, expected)
global_loss.backward()
global_opt.step()
```
My question is (it might be silly): does `model2` still update its weights after its training phase has ended, i.e., when the last snippet above is the only code running in the training loop? Thank you for reading this question.
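This can also be checked empirically: snapshot `model2`'s parameters before the loop and compare afterwards. A sketch with toy `nn.Linear` stand-ins (placeholders for the real networks); `backward()` may still fill `model2`'s `.grad` buffers, but since no optimizer holds its parameters, nothing applies those gradients:

```python
import copy
import torch
import torch.nn as nn
import torch.optim as optim

# Toy stand-ins for the three networks (placeholders only).
model1 = nn.Linear(4, 4)
model2 = nn.Linear(4, 4)
model3 = nn.Linear(4, 1)

# Note: model2's parameters are NOT handed to any optimizer here.
global_opt = optim.Adam(list(model1.parameters()) + list(model3.parameters()))

before = copy.deepcopy(model2.state_dict())

for _ in range(3):
    global_opt.zero_grad()
    out = model3(model2(model1(torch.randn(8, 4))))
    loss = nn.MSELoss()(out, torch.randn(8, 1))
    loss.backward()      # model2 still receives .grad values here...
    global_opt.step()    # ...but only model1/model3 are updated

after = model2.state_dict()
unchanged = all(torch.equal(before[k], after[k]) for k in before)
print(unchanged)  # model2's weights are bit-identical
```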