Training Model_1 using the output of a pretrained Model_2: should I call Model_2.eval() in the train method or not?

Hi everyone,

I have two models, say Model_1 and Model_2.
Model_2 is pretrained, and I want to use its output to compute the loss and train Model_1.
It looks something like this:


def train(Model_1, Model_2, inputs, *args):
    Model_1.train()   # Model_1 is the model being trained
    Model_2.eval()    # Model_2 is pretrained and stays frozen

    out1 = Model_1(inputs)
    out2 = Model_2(out1)   # the loss is computed on Model_2's output
    loss = criterion(out2)
    ...

optimizer = torch.optim.SGD(Model_1.parameters(), *args)

Here is a similar question from which I got some ideas.

My question is: should I call Model_2.eval() even though I am passing only Model_1's parameters to the optimizer?

Thanks for any advice

Hi,

From albanD’s comment, you can pass only Model_1's parameters to the optimizer (which is what you already did).

This thread discusses whether we should switch the pre-trained model to eval mode. In my opinion, freezing Model_2 involves two separate things:

  • the parameters of Model_2 must not be updated by optimizer.step(); your optimizer configuration already achieves this, since only Model_1's parameters are passed in (note that eval() by itself does not stop parameter updates).
  • the dropout and batchnorm layers of Model_2 should be switched into eval mode, which is exactly what Model_2.eval() does.

So calling Model_2.eval() only contributes the second effect. I have not met this problem myself, so you could try it and see whether turning off dropout and batchnorm in a fixed model influences the post-networks; a small experiment is sketched below.
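If it helps, here is a minimal runnable sketch (with made-up layer sizes, not your actual models) of what eval() changes and what it does not:

import torch
import torch.nn as nn

# toy stand-in for a pretrained network (hypothetical sizes)
model_2 = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8), nn.Dropout(p=0.5))
x = torch.randn(4, 8)

# train mode: dropout is active and batchnorm uses batch statistics,
# so two passes over the same input generally differ
model_2.train()
print(torch.allclose(model_2(x), model_2(x)))  # usually False

# eval mode: dropout is a no-op and batchnorm uses running statistics,
# so the output is deterministic
model_2.eval()
print(torch.allclose(model_2(x), model_2(x)))  # True

# eval() does not block gradients: a loss can still backpropagate
# through model_2 into whatever produced its input
x.requires_grad_(True)
model_2(x).sum().backward()
print(x.grad is not None)  # True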

If you get something different, please let me know, thanks!

Thank you,
Basically, I want Model_2 to act exactly as a finalized model (I mean testing only, no training at all), so I just use its output for further processing.
Based on your comment, I should use Model_2.eval().
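In case it is useful to others, the freezing step would look roughly like this (a sketch, assuming model_2 is the pretrained nn.Module; explicitly disabling requires_grad is optional here since the optimizer only sees Model_1's parameters, but it also skips accumulating unneeded gradients for Model_2):

model_2.eval()                 # fix dropout / batchnorm behaviour
for p in model_2.parameters():
    p.requires_grad_(False)    # optional: no gradients accumulated for Model_2's
                               # weights; gradients still flow *through* Model_2
                               # back to Model_1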
I will report any differences when I finish the job.