Different batch size causes accuracy drop

Hi all,

I trained my model (mainly Conv1d + dropout + BN) with a batch size of 8.

For evaluation I call model.eval(). I saw a massive accuracy drop, so I followed this post: Performance highly degraded when eval() is activated in the test phase - #71 by Yuxuan_Xue. That solves the issue, but only when I set the batch size to 1. More specifically, my model performs well in evaluation mode when the running stats of the batch norm layers are discarded with the function below, so that the layers normalize with the statistics of the current batch:

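  from torch import nn

  def set_batch_norm_running_stats(module):
      # Drop the stored running stats so that every BatchNorm layer
      # normalizes with the current batch's statistics, even in eval mode.
      for m in module.modules():  # modules() already recurses into all submodules
          if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
              m.track_running_stats = False
              m.running_mean = None
              m.running_var = None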

Currently I have one issue: this only works with a batch size of 1. If I use a larger batch size (e.g. 2, 4, or 8), the model again shows a massive accuracy drop.
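To make the batch size dependence concrete, here is a minimal sketch of what I believe is happening. It uses a single toy BatchNorm1d layer with random input, not my real model, and the shapes and scaling are assumptions chosen only for illustration. Once the running stats are removed, the layer normalizes with the statistics of whatever batch it receives, so the output for one sample changes depending on which other samples share its batch:

```python
import torch
from torch import nn

def set_batch_norm_running_stats(module):
    # Same idea as the function above: drop running stats so BN uses batch stats.
    for m in module.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.track_running_stats = False
            m.running_mean = None
            m.running_var = None

torch.manual_seed(0)
bn = nn.BatchNorm1d(3).eval()  # toy layer standing in for the real model

# Toy input of shape (batch, channels, length); each sample has a different scale.
x = torch.randn(4, 3, 16) * torch.tensor([1.0, 2.0, 3.0, 4.0]).view(4, 1, 1)

# With running stats, sample 0 gets the same output alone or inside the batch.
same_before = torch.allclose(bn(x)[:1], bn(x[:1]), atol=1e-6)

# Without running stats, the output of sample 0 depends on the rest of the batch.
set_batch_norm_running_stats(bn)
same_after = torch.allclose(bn(x)[:1], bn(x[:1]), atol=1e-6)

print(same_before, same_after)
```

If this is the right picture, batch size 1 works for me only because each sample is then normalized by its own statistics, which probably means the stored running stats do not match my evaluation data.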

One more weird thing: I have two modules containing batch normalization. I did an ablation study (applying the set_batch_norm_running_stats function to each module separately and then evaluating). Only one module is affected. The other module is not affected and performs normally without having its running stats removed.
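One check that might show why only one module is affected is to compare each BN layer's stored running mean with the mean of the input it actually sees in evaluation. The sketch below is only an assumption about how this could be done; `bn_stat_gaps` is a hypothetical helper name, and it has to run before the running stats are removed:

```python
import torch
from torch import nn

def bn_stat_gaps(model, x):
    """Return {layer name: max |batch mean - running_mean|} for one forward pass."""
    gaps, hooks = {}, []

    def make_hook(name):
        def hook(layer, inputs):
            inp = inputs[0]
            # Average over every dimension except the channel dimension (dim 1).
            dims = tuple(d for d in range(inp.dim()) if d != 1)
            gaps[name] = (inp.mean(dim=dims) - layer.running_mean).abs().max().item()
        return hook

    for name, layer in model.named_modules():
        is_bn = isinstance(layer, (nn.BatchNorm1d, nn.BatchNorm2d))
        if is_bn and layer.running_mean is not None:
            hooks.append(layer.register_forward_pre_hook(make_hook(name)))

    model.eval()  # note: this leaves the model in eval mode
    with torch.no_grad():
        model(x)
    for h in hooks:
        h.remove()
    return gaps

# Toy usage with a stand-in model; the real model and data would go here.
torch.manual_seed(0)
toy = nn.Sequential(nn.Conv1d(2, 3, kernel_size=3), nn.BatchNorm1d(3))
gaps = bn_stat_gaps(toy, torch.randn(4, 2, 16) + 5.0)
print(gaps)
```

A layer with a large gap would be one whose running stats do not describe the evaluation inputs, which could explain why only one of my two modules needs the workaround.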

I am really confused now and would really appreciate any suggestions and discussion.