I use nn.BatchNorm2d in my model. When I test the model after calling model.eval() (which only affects BN and Dropout layers), I get approximately fixed outputs: I feed different inputs to the network, but the outputs are almost the same. I don’t know why. I sincerely hope someone can help me resolve this confusion.
Your model might just have learned the “mean prediction value” and would need some more training.
Could you check the bias values of the output layer and compare them to the output values?
I’ve seen some cases where my model got stuck during training and just predicted the mean of all target values (facial key points).
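For the bias check suggested above, here is a rough sketch of how I would compare the two. The small `model` is a hypothetical stand-in, not the network from the question, and it assumes the output layer is an `nn.Linear` reachable as `model[-1]`; adapt both to your own code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model; replace with your own trained network.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 4),
)
model.eval()

# A batch of clearly different inputs (random data, for illustration only).
x = torch.randn(16, 3, 32, 32)
with torch.no_grad():
    out = model(x)

bias = model[-1].bias.detach()
spread = out.std(dim=0)               # per-output variation across the inputs
gap = (out - bias).abs().mean(dim=0)  # how far the outputs sit from the bias

print("std across inputs:", spread)
print("mean |out - bias|:", gap)
```

If both numbers are close to zero for a trained model, the output layer is more or less just emitting its bias, which would fit the "mean prediction" explanation.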
Thank you very much, I will check it. By the way, if I remove the model.eval() call, I get the right result. Do you think I have to call model.eval() in the test phase?
You would have to call model.eval() to change the behavior of some layers like nn.Dropout. However, if your validation works better in train mode, this might be a sign that the running estimates of your batchnorm layers are not really useful. Which batch size are you using during training and evaluation? Could you try to change the momentum of the batchnorm layers or remove them completely?
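To illustrate what "not really useful" running estimates can look like, here is a minimal sketch on made-up data. The batch `x`, the helper `eval_channel_mean`, and the number of steps are all assumptions chosen for the demo, not values from the question. It only shows that eval mode normalizes with `running_mean` / `running_var`, and that these lag behind the batch statistics until enough updates have happened.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Made-up batch of size 17 whose statistics (mean ~10, std ~5) are far from
# BatchNorm's initial running estimates (running_mean=0, running_var=1).
x = torch.randn(17, 3, 8, 8) * 5.0 + 10.0

bn = nn.BatchNorm2d(3, momentum=0.1)  # 0.1 is the PyTorch default


def eval_channel_mean(layer, batch):
    """Per-channel mean of the layer's output in eval mode (hypothetical helper)."""
    layer.eval()
    with torch.no_grad():
        return layer(batch).mean(dim=(0, 2, 3))


# After a single training-mode forward pass the running estimates have barely
# moved, so eval-mode outputs are not centred around zero.
bn.train()
with torch.no_grad():
    bn(x)
mean_after_1 = eval_channel_mean(bn, x)

# After many training-mode passes the running estimates approach the batch stats.
bn.train()
with torch.no_grad():
    for _ in range(200):
        bn(x)
mean_after_many = eval_channel_mean(bn, x)

print("eval output mean after 1 step:   ", mean_after_1)
print("eval output mean after 201 steps:", mean_after_many)
print("running_mean:", bn.running_mean)
```

On a real model you could print `running_mean` and `running_var` of each batchnorm layer and compare them with the statistics of a typical batch. A larger momentum makes the estimates follow recent batches more quickly, at the cost of being noisier.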
The batch size is 17 in both the training and evaluation phases. Do you think nn.BatchNorm only works well with a large amount of training data?
Depending on the stats of your dataset, you might also try to use nn.GroupNorm, which should perform better on smaller batches.
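A small sketch of why this can help with the train/eval mismatch: nn.GroupNorm computes its statistics per sample and keeps no running estimates, so it behaves the same in both modes. The channel and group counts below are arbitrary picks for the demo (num_channels has to be divisible by num_groups).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Arbitrary sizes for illustration: 8 channels split into 4 groups.
gn = nn.GroupNorm(num_groups=4, num_channels=8)
x = torch.randn(2, 8, 16, 16)  # a very small batch

gn.train()
y_train = gn(x)

gn.eval()
y_eval = gn(x)

# GroupNorm has no running statistics, so both modes give the same output.
print(torch.allclose(y_train, y_eval))
```

In a model, this would roughly mean replacing each nn.BatchNorm2d(C) with nn.GroupNorm(G, C) for some G that divides C, and retraining.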
I have the same issue. My model is BERT, which consists only of linear, dropout, and layernorm layers, and the result is always the same when the model is in eval mode. How can I debug this?