I know there are differences between .eval() and .train() for the Dropout and BatchNorm layers, so I wrote two simple helper methods to locate the cause.
import torch.nn as nn

def dropout_disable(m):
    print(m)
    if isinstance(m, nn.Dropout):
        m.eval()  # switch only Dropout layers to eval mode

def bn_disable(m):
    print(m)
    if isinstance(m, nn.BatchNorm1d):
        m.eval()  # switch only BatchNorm1d layers to eval mode
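For reference, here is a minimal self-contained sketch of how I apply these helpers with net.apply(); the nn.Sequential model is just a toy stand-in for my real net:

```python
import torch.nn as nn

def dropout_disable(m):
    if isinstance(m, nn.Dropout):
        m.eval()  # switch only Dropout layers to eval mode

# Toy model: index 1 is BatchNorm1d, index 3 is Dropout.
net = nn.Sequential(
    nn.Linear(16, 32),
    nn.BatchNorm1d(32),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(32, 2),
)

net.train()                 # everything in train mode
net.apply(dropout_disable)  # recursively visits submodules

print(net[1].training)  # BatchNorm1d still in train mode -> True
print(net[3].training)  # Dropout switched to eval mode  -> False
```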
Here is my experiment (testing after each setting):

- net.train(): test f1_score = 0.77
- net.apply(dropout_disable): test f1_score = 0.78
- net.apply(bn_disable): test f1_score = 0.77
- net.eval(): test f1_score = 0.77, same as the results above
- net.train(): test f1_score ≈ 0.77 (the result differs on each run because of the randomness in the Dropout layer)
- net.eval(): test f1_score = 0.70. This time the result degrades a lot.
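To show the kind of train/eval gap I mean, here is a toy check (random data, not my real model) that a BatchNorm1d layer alone can produce very different outputs in the two modes when its running statistics have not yet converged to the data statistics:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)           # running stats start at mean=0, var=1
x = torch.randn(8, 4) * 5 + 3    # batch whose stats are far from (0, 1)

bn.train()
out_train = bn(x)  # normalizes with this batch's own mean/var
                   # (and nudges the running stats via momentum)

bn.eval()
out_eval = bn(x)   # normalizes with the stored running stats

# The two outputs differ noticeably here.
print((out_train - out_eval).abs().max())
```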
I don't know why this result differs from the 0.77 above. Is there any other difference between .train() and .eval(), or is this a PyTorch bug?