Training Model_1 using the output of a pretrained Model_2: should the train method call model_2.eval() or not?

Hi everyone,

I have two models, say Model_1 and Model_2.
Model_2 is pretrained, and I want to use its output to compute a loss and train Model_1.
It looks something like this:

def train(model1, model2, input, target):
    out1 = model1(input)            # forward through the model being trained
    out2 = model2(out1)             # forward through the pretrained model
    loss = criterion(out2, target)
    loss.backward()
    optimizer.step()

optimizer = torch.optim.SGD(model1.parameters(), *args)

Here is a similar question from which I got some ideas.

My question is: should I call Model_2.eval() even when I am passing only Model_1's parameters to the optimizer?
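For reference, here is a minimal runnable sketch of this setup, using hypothetical toy models (nn.Linear stand-ins, not the actual Model_1/Model_2 from the question). Note that model2's forward pass cannot be wrapped in torch.no_grad(), because gradients must still flow *through* model2 back to model1:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model1 = nn.Linear(4, 4)                     # the model being trained
model2 = nn.Sequential(                      # stands in for the pretrained model
    nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 1)
)

model2.eval()                                # dropout/batchnorm -> inference behavior
for p in model2.parameters():                # additionally mark model2 as frozen
    p.requires_grad_(False)

optimizer = torch.optim.SGD(model1.parameters(), lr=0.1)
criterion = nn.MSELoss()

x = torch.randn(8, 4)
target = torch.zeros(8, 1)

frozen_before = [p.clone() for p in model2.parameters()]
w1_before = model1.weight.clone()

for _ in range(3):
    optimizer.zero_grad()
    out1 = model1(x)
    out2 = model2(out1)                      # gradients flow through model2's graph
    loss = criterion(out2, target)
    loss.backward()
    optimizer.step()

# model2's weights are untouched; model1's weights have been updated.
```

Passing only model1.parameters() to the optimizer (plus requires_grad_(False) on model2) is what keeps model2 frozen; eval() is a separate question about layer behavior.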

Thanks for any advice


From albanD’s comment, you could pass only Model_1's parameters to the optimizer (which is what you already did).

This thread discusses whether we should switch the pretrained model to eval mode. In my opinion, calling Model_2.eval() is related to two possible differences:

  • whether Model_2's parameters are updated by optimizer.step(). Note that eval() itself does not freeze parameters; your optimizer configuration (passing only Model_1's parameters) already prevents these updates.
  • switching dropout and batchnorm layers into eval mode (dropout is disabled, batchnorm uses its running statistics).

So only the second difference actually matters here. I have not run into this problem myself, so you could try it and see whether turning off dropout and batchnorm in a frozen model influences the downstream network.
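A quick check of both points above, with throwaway toy layers: eval() switches dropout (and batchnorm) to inference behavior, but by itself it does not stop gradients from being computed; freezing is handled by the optimizer / requires_grad configuration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

train_out = drop(x)        # train mode: roughly half the entries are zeroed
drop.eval()
eval_out = drop(x)         # eval mode: dropout is disabled, input passes through

print((train_out == 0).sum().item() > 0)   # True: some entries were dropped
print(torch.equal(eval_out, x))            # True: eval() makes dropout an identity

# eval() alone does not freeze parameters: a layer in eval mode still
# receives gradients if its parameters require grad.
layer = nn.Linear(4, 1).eval()
layer(torch.randn(2, 4)).sum().backward()
print(layer.weight.grad is not None)       # True
```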

If you get something different, please let me know, thanks!

Thank you!
Basically, I want Model_2 to act exactly like a finalized model (inference only, no training at all), so I just use its output for further processing.
Based on your comment, I should use model.eval().
I will report any differences when I finish the job.