I have the same problem.
I’m trying to load Caffe weights into a PyTorch model with batchnorm layers. Each time I load the weights from the caffemodel file, the result for the same input is different, even in eval mode.
I am actually updating running_mean and running_var from the caffemodel weights (roughly as in the sketch below), so bad running statistics during inference shouldn’t be the issue.
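For reference, this is approximately what my copy step looks like. It's only a sketch: the layer names `"bn1"` / `"scale1"` and the `caffe_net` variable are placeholders for my actual setup, and it assumes the usual Caffe convention where the BatchNorm layer stores mean, variance and a moving-average factor, with gamma/beta in a separate Scale layer.

```python
import torch
import torch.nn as nn

def load_caffe_bn(bn: nn.BatchNorm2d, caffe_net, bn_name: str, scale_name: str):
    """Copy Caffe BatchNorm + Scale parameters into a PyTorch BatchNorm2d."""
    # Caffe BatchNorm blobs: [0] mean * factor, [1] variance * factor, [2] factor
    mean, var, factor = [b.data for b in caffe_net.params[bn_name]]
    scale = 0.0 if factor[0] == 0 else 1.0 / factor[0]
    bn.running_mean.copy_(torch.from_numpy(mean * scale))
    bn.running_var.copy_(torch.from_numpy(var * scale))
    # gamma / beta come from the separate Scale layer
    gamma, beta = [b.data for b in caffe_net.params[scale_name]]
    bn.weight.data.copy_(torch.from_numpy(gamma))
    bn.bias.data.copy_(torch.from_numpy(beta))
```

After copying I call `model.eval()` before running inference, so the copied running stats (and not batch statistics) should be used.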