I have a loss function for a model (model 1) that includes a term computed from the output of another model (model 2). Model 2 is not trained, so I want to disable grad to get its output faster, but keep grad enabled for model 1.
input = …
output1 = model1(input)
with torch.no_grad():  # disable grad for model 2 only
    output2 = model2(input)
# grad tracking is enabled again after the with block
loss = criterion(output1, target) + function(output1,output2)
Now I wonder whether model 2 should be in eval() mode or not. I do not want to train this model; I am trying to distill knowledge from it into model 1, so model 1 should produce the same output as model 2 for the same batches.
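model.eval() disables dropout layers, for example, and makes batchnorm layers use their running stats instead of the current batch statistics. It does not disable gradient tracking, so you would still wrap the model 2 forward pass in torch.no_grad().
Depending on your use case, you might need to call it. For distillation, where model 2 should give the same output for the same batch, calling model2.eval() is usually what you want.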
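To make this concrete, here is a minimal sketch of one training step for this kind of setup. The two small networks, the random batch, the SGD optimizer, and the use of an MSE term in place of `function(output1, output2)` are all assumptions for illustration, not taken from the original post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins for model 1 (trained) and model 2 (frozen).
model1 = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
model2 = nn.Sequential(
    nn.Linear(8, 16), nn.BatchNorm1d(16), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(16, 4),
)

model1.train()
model2.eval()  # dropout off, batchnorm uses its running stats

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model1.parameters(), lr=0.1)

# Random batch, assumed shapes: 32 samples, 8 features, 4 classes.
inputs = torch.randn(32, 8)
target = torch.randint(0, 4, (32,))

output1 = model1(inputs)       # tracked by autograd
with torch.no_grad():          # no graph is built for model 2
    output2 = model2(inputs)

# MSE is only a placeholder for function(output1, output2).
loss = criterion(output1, target) + F.mse_loss(output1, output2)

optimizer.zero_grad()
loss.backward()                # gradients flow into model1 only
optimizer.step()
```

Since `output2` was computed under `torch.no_grad()`, it has `requires_grad=False` and the parameters of `model2` never receive gradients, while `model2.eval()` keeps its output deterministic for a given batch.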