I get a fixed output in the test phase when I set model.eval()

shengyudingli · April 12, 2019, 12:24pm

I use nn.BatchNorm2d in my model. When I test my model and call model.eval() (which only affects the BN and Dropout layers), I get approximately fixed outputs: I feed different inputs to the network, but the outputs are nearly the same. I don’t know why. I sincerely hope someone can help me resolve this confusion.

ptrblck · April 12, 2019, 7:54pm

Your model might just have learned the “mean prediction value” and thus would need some more training.
Could you check the bias values of the output layer and compare them to the output values?
I’ve seen some cases where my model got stuck during training and just predicted the mean of all target values (facial key points).
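
A minimal sketch of that check, in case it helps. The small network below is only a hypothetical stand-in for the real model, and it assumes the output layer is an nn.Linear reachable as model[-1]; adapt both to your own code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model; replace it with your own trained network.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 4),  # assumed to be the output layer
)
model.eval()

# A batch of different inputs (random here, real test samples in practice).
x = torch.randn(17, 3, 32, 32)
with torch.no_grad():
    out = model(x)

output_bias = model[-1].bias.detach()

# Near-zero values mean the output barely depends on the input.
spread_across_inputs = out.std(dim=0)

# Near-zero values mean the output is essentially just the bias.
gap_to_bias = (out.mean(dim=0) - output_bias).abs()

print("output bias:          ", output_bias)
print("mean output:          ", out.mean(dim=0))
print("std across the batch: ", spread_across_inputs)
print("|mean output - bias|: ", gap_to_bias)
```

If both the spread across inputs and the gap to the bias are close to zero on real data, the features reaching the last layer are likely collapsing and the prediction is dominated by the bias.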

shengyudingli · April 14, 2019, 7:47am

Thank you very much, I will check it. By the way, if I remove this statement (model.eval()), I get the right result. Do you think I have to call model.eval() in the test phase?

ptrblck · April 14, 2019, 1:09pm

You would have to call model.eval() to change the behavior of some layers like nn.BatchNorm and nn.Dropout. However, if your validation works better in train mode, this might be a sign that the running estimates of your batchnorm layers are not really useful. Which batch size are you using during training and evaluation? Could you try to change the momentum of the batchnorm layers or remove them completely?
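
To illustrate what "running estimates that are not useful" looks like, here is a small sketch on a single nn.BatchNorm2d layer. The input statistics (mean 5, std 3) and the number of warm-up batches are made-up values chosen only for the demonstration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm2d(3)  # default momentum is 0.1


def make_batch():
    # Synthetic data with mean 5 and std 3 (illustrative values only).
    return torch.randn(17, 3, 8, 8) * 3.0 + 5.0


x = make_batch()

# Fresh layer: running_mean is 0 and running_var is 1, so eval mode
# normalizes with statistics that do not match the data at all.
bn.eval()
with torch.no_grad():
    eval_before = bn(x)

# Train mode normalizes with the statistics of the current batch and
# updates the running estimates as a side effect.
bn.train()
with torch.no_grad():
    train_out = bn(x)
    for _ in range(200):
        bn(make_batch())

# Once the running estimates have converged, eval mode behaves sensibly.
bn.eval()
with torch.no_grad():
    eval_after = bn(x)

print("eval, stale stats     mean:", eval_before.mean().item())
print("train, batch stats    mean:", train_out.mean().item())
print("eval, converged stats mean:", eval_after.mean().item())
print("running_mean:", bn.running_mean)
print("running_var: ", bn.running_var)
```

On a real model you could print running_mean and running_var of each batchnorm layer and compare them to the per-channel statistics of that layer's actual inputs. The momentum is a plain attribute of the layer (bn.momentum), so it can be changed before retraining.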

shengyudingli · April 15, 2019, 2:21am

The batch size is 17 in both the training and evaluation phases. Do you think nn.BatchNorm only works with big training data?

ptrblck · April 15, 2019, 10:36am

Depending on the stats of your dataset, you might also try to use nn.GroupNorm which should perform better on smaller batches.
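
A rough sketch of how such a swap could be done. The helper batchnorm_to_groupnorm and the choice of num_groups=4 are assumptions for illustration, not a PyTorch API; the channel count of each replaced layer must be divisible by num_groups.

```python
import torch
import torch.nn as nn


def batchnorm_to_groupnorm(module, num_groups=4):
    """Recursively replace every nn.BatchNorm2d with nn.GroupNorm.

    Hypothetical helper: num_groups must divide each layer's channel count.
    """
    for name, child in list(module.named_children()):
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, nn.GroupNorm(num_groups, child.num_features))
        else:
            batchnorm_to_groupnorm(child, num_groups)
    return module


torch.manual_seed(0)

# Hypothetical stand-in model; replace it with your own network.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)
model = batchnorm_to_groupnorm(model, num_groups=4)

x = torch.randn(17, 3, 16, 16)

# GroupNorm keeps no running estimates, so train and eval mode agree.
model.train()
with torch.no_grad():
    train_out = model(x)
model.eval()
with torch.no_grad():
    eval_out = model(x)

print(model)
print("train/eval outputs match:", torch.allclose(train_out, eval_out))
```

The new GroupNorm layers start with freshly initialized parameters, so the model would need to be retrained after the swap.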

dawn · May 29, 2019, 2:41pm

I have the same issue. My model is BERT, which is just linear, dropout, and layernorm layers. The result is always the same when the model is in eval mode. How can I debug this?