My network’s performance on the test set gets much worse after some iterations when I apply model.eval(). However, if I do the same thing without calling model.eval(), the performance is much better. Does anyone know how I can solve this problem?
Could you share a code snippet?
Thanks for your reply.
@1chimaruGin
So my network is almost like a U-Net.
and let’s say I train/evaluate my model in the following two ways:

First case:
model.eval()
evaluate the model on the test set
model.train()
train the model on the train set

Second case:
model.train()
evaluate the model on the test set
model.train()
train the model on the train set
my batch size is 5.
So, the only difference is that in the second case I don’t call model.eval() before evaluating the model on the test set.
In the second case I get better performance. I have both dropout and batch norm in my network, and I know that they behave differently after calling model.eval(). I think the problem is caused by batch norm and maybe its running stats, but I don’t know how to fix it.
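To see why the running stats matter, here is a minimal pure-Python sketch (the numbers are made up for illustration) of BatchNorm’s running-mean update, which PyTorch computes as running_mean = (1 - momentum) * running_mean + momentum * batch_mean with momentum=0.1 by default. In eval() mode the running mean is used instead of the current batch’s mean, so the two can diverge when batches are small and noisy (like a batch size of 5):

```python
momentum = 0.1      # PyTorch's default BatchNorm momentum
running_mean = 0.0  # freshly initialized running statistic

# Made-up batch means simulating noisy small batches whose mean drifts.
batch_means = [0.5, 0.8, 1.2, 1.0, 1.5]
for batch_mean in batch_means:
    # Exponential moving average, as in torch.nn.BatchNorm2d
    running_mean = (1 - momentum) * running_mean + momentum * batch_mean

print(running_mean)     # ~0.428: what eval() mode normalizes with
print(batch_means[-1])  # 1.5: what train() mode would use on the last batch
```

A large gap between the running statistics and the per-batch statistics (often caused by a small batch size or a train/test distribution mismatch) is a common reason eval() accuracy collapses.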
Did you mean evaluation accuracy and loss?
I think if your model doesn’t include Dropout and BatchNorm, model.train() and model.eval() won’t behave differently. If it does include them, each of them can affect the result. For me, evaluating with model.eval() seems legit. Correct me if I’m wrong.
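As a quick pure-Python illustration of why Dropout alone makes the two modes differ (this is a sketch of the inverted-dropout scheme that nn.Dropout implements, not PyTorch’s actual code): in training mode each element is zeroed with probability p and the survivors are scaled by 1/(1-p), while in eval mode the layer is the identity.

```python
import random

def dropout(x, p=0.5, training=True):
    """Inverted dropout on a list of floats (sketch of nn.Dropout's scheme)."""
    if not training:  # eval() mode: identity, no randomness
        return list(x)
    scale = 1.0 / (1.0 - p)  # rescale survivors so the expectation matches
    return [xi * scale if random.random() >= p else 0.0 for xi in x]

x = [1.0, 2.0, 3.0, 4.0]
print(dropout(x, training=False))  # eval: [1.0, 2.0, 3.0, 4.0]
print(dropout(x, training=True))   # train: random zeros, survivors doubled
```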
PyTorch source code:
def train(self: T, mode: bool = True) -> T:
    r"""Sets the module in training mode.

    This has any effect only on certain modules. See documentations of
    particular modules for details of their behaviors in training/evaluation
    mode, if they are affected, e.g. :class:`Dropout`, :class:`BatchNorm`,
    etc.

    Args:
        mode (bool): whether to set training mode (``True``) or evaluation
            mode (``False``). Default: ``True``.

    Returns:
        Module: self
    """
    self.training = mode
    for module in self.children():
        module.train(mode)
    return self

def eval(self: T) -> T:
    r"""Sets the module in evaluation mode.

    This has any effect only on certain modules. See documentations of
    particular modules for details of their behaviors in training/evaluation
    mode, if they are affected, e.g. :class:`Dropout`, :class:`BatchNorm`,
    etc.

    This is equivalent with :meth:`self.train(False) <torch.nn.Module.train>`.

    Returns:
        Module: self
    """
    return self.train(False)
- official PyTorch tutorial with model.eval() and model.train()
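A quick sanity check of what the source above does (assuming PyTorch is installed): train(mode) sets self.training and recurses into children, and eval() is just train(False).

```python
import torch.nn as nn

# A toy model with a Dropout layer; eval()/train() flip the .training flag
# on the container and every child module.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.eval()   # equivalent to model.train(False)
assert all(not m.training for m in model.modules())

model.train()  # back to training mode, recursively
assert all(m.training for m in model.modules())
```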
Yes. I mean accuracy.
And yes, you are right. I have batch norm and dropout.
However, the problem is that when I exclude model.eval() (i.e., perform evaluation without calling model.eval()), my performance is way better than when I evaluate after calling model.eval().
The accuracy with model.eval() doesn’t make any sense; it’s as if the network is not learning at all. When I remove model.eval(), the accuracy on the test set actually makes sense.
Did you reuse the same layer from your init method multiple times in your forward method?
e.g.
in init,
self.conv1 = nn.Conv2d(…)
in forward,
…
x = self.conv1(x)
x = self.conv1(x)
…
If so, define a separate layer for each use instead:
in init,
self.conv1 = nn.Conv2d(…)
self.conv2 = nn.Conv2d(…)
in forward,
…
x = self.conv1(x)
x = self.conv2(x)
…
I encountered a similar issue with eval() giving different results; try this.
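To make the two patterns concrete, here is a hypothetical minimal module pair (the names Shared/Separate and the shapes are mine, not from the post). Reusing one BatchNorm layer at two depths feeds two different activation distributions into a single set of running statistics, which skews what eval() mode normalizes with; giving each use its own layer avoids that:

```python
import torch
import torch.nn as nn

class Shared(nn.Module):
    """Anti-pattern: one conv/BatchNorm pair reused twice in forward."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, 3, padding=1)
        self.bn = nn.BatchNorm2d(3)

    def forward(self, x):
        x = self.bn(self.conv(x))
        # Same self.bn again: its running stats now mix two distributions.
        return self.bn(self.conv(x))

class Separate(nn.Module):
    """Fix: a distinct conv/BatchNorm pair per use."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 3, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(3)
        self.conv2 = nn.Conv2d(3, 3, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(3)

    def forward(self, x):
        x = self.bn1(self.conv1(x))
        return self.bn2(self.conv2(x))
```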
I recall a forum post that solved this issue; I believe it could be related to some PyTorch internals.