Training Model_1 using the output of a pretrained Model_2: should the train method call model_2.eval() or not?

Hi everyone,

I have two models, say Model_1 and Model_2.
Model_2 is pretrained, and I want to use its output to compute a loss and train Model_1.
It looks something like this:

def train(model1, model2, input, target):
    out1 = model1(input)            # forward through the model being trained
    out2 = model2(out1)             # forward through the pretrained model
    loss = criterion(out2, target)
    loss.backward()
    optimizer.step()

optimizer = torch.optim.SGD(model1.parameters(), *args)

Here is a similar question from which I got some ideas.

My question is: should I call Model_2.eval() even when I am passing only Model_1's parameters to the optimizer?
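For reference, here is a minimal runnable sketch of this setup, using hypothetical toy models (nn.Linear stand-ins, not the actual Model_1/Model_2 from the question). Note that model2's forward pass cannot be wrapped in torch.no_grad(), because gradients must still flow *through* model2 back to model1:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model1 = nn.Linear(4, 4)                     # the model being trained
model2 = nn.Sequential(                      # stands in for the pretrained model
    nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 1)
)

model2.eval()                                # dropout/batchnorm -> inference behavior
for p in model2.parameters():                # additionally mark model2 as frozen
    p.requires_grad_(False)

optimizer = torch.optim.SGD(model1.parameters(), lr=0.1)
criterion = nn.MSELoss()

x = torch.randn(8, 4)
target = torch.zeros(8, 1)

frozen_before = [p.clone() for p in model2.parameters()]
w1_before = model1.weight.clone()

for _ in range(3):
    optimizer.zero_grad()
    out1 = model1(x)
    out2 = model2(out1)                      # gradients flow through model2's graph
    loss = criterion(out2, target)
    loss.backward()
    optimizer.step()

# model2's weights are untouched; model1's weights have been updated.
```

Passing only model1.parameters() to the optimizer (plus requires_grad_(False) on model2) is what keeps model2 frozen; eval() is a separate question about layer behavior.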

Thanks for any advice


From albanD’s comment, you could pass only Model_1's parameters to the optimizer (which is what you already did).

This thread discusses whether we should switch the pretrained model to eval mode. In my opinion, calling Model_2.eval() is related to two possible differences:

  • whether Model_2's parameters are updated by optimizer.step(). Note that eval() itself does not freeze parameters; your optimizer configuration (passing only Model_1's parameters) already prevents these updates.
  • switching dropout and batchnorm layers into eval mode (dropout is disabled, batchnorm uses its running statistics).

So only the second difference actually matters here. I have not run into this problem myself, so you could try it and see whether turning off dropout and batchnorm in a frozen model influences the downstream network.
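A quick check of both points above, with throwaway toy layers: eval() switches dropout (and batchnorm) to inference behavior, but by itself it does not stop gradients from being computed; freezing is handled by the optimizer / requires_grad configuration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

train_out = drop(x)        # train mode: roughly half the entries are zeroed
drop.eval()
eval_out = drop(x)         # eval mode: dropout is disabled, input passes through

print((train_out == 0).sum().item() > 0)   # True: some entries were dropped
print(torch.equal(eval_out, x))            # True: eval() makes dropout an identity

# eval() alone does not freeze parameters: a layer in eval mode still
# receives gradients if its parameters require grad.
layer = nn.Linear(4, 1).eval()
layer(torch.randn(2, 4)).sum().backward()
print(layer.weight.grad is not None)       # True
```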

If you get something different, please let me know, thanks!

Thank you!
Basically, I want Model_2 to act exactly like a finalized model (inference only, no training at all), so I just use its output for further processing.
Based on your comment, I should use model.eval().
I will report any differences when I finish the job.