I trained my model (mainly Conv1d + dropout + BatchNorm) with a batch size of 8.
During evaluation I call model.eval() and see a massive accuracy drop, so I followed the post here: Performance highly degraded when eval() is activated in the test phase - #71 by Yuxuan_Xue, which fixes the issue when I set the batch size to 1. More specifically, my model performs well in evaluation mode when the running stats of batch norm are manually disabled:
```python
@staticmethod
def set_batch_norm_running_stats(module):
    # Force batch-norm layers to use per-batch statistics in eval() mode.
    for child in module.modules():
        if isinstance(child, (nn.BatchNorm1d, nn.BatchNorm2d)):
            child.track_running_stats = False
            child.running_mean = None
            child.running_var = None
```
Currently I hit a new issue: this only works with a batch size of 1. With any larger batch size (e.g. 2, 4, or 8), the model again shows a massive accuracy drop.
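For what it's worth, here is a minimal sketch (my own reduced example, not the actual model) of why the batch size matters after this change: with track_running_stats = False and the stats buffers cleared, BatchNorm normalizes with per-batch statistics even in eval() mode, so the output for a given sample depends on which other samples share its batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

bn = nn.BatchNorm1d(3)
bn.eval()
# Same manipulation as set_batch_norm_running_stats above.
bn.track_running_stats = False
bn.running_mean = None
bn.running_var = None

x = torch.randn(1, 3, 10)           # one sample, shape (N, C, L)
y_alone = bn(x)                     # normalized with this sample's own stats

other = torch.randn(1, 3, 10)
batch = torch.cat([x, other], dim=0)
y_batched = bn(batch)[:1]           # same sample, evaluated in a batch of 2

# Outputs differ: the normalization statistics now include the second sample.
print(torch.allclose(y_alone, y_batched))  # expect: False
```

So any accuracy that depends on this trick is inherently batch-size dependent, which matches what I observe.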
Another weird thing: I have two modules containing batch normalization. I did an ablation study (applied the
set_batch_norm_running_stats function to each module separately and evaluated). However, only one module is affected. The other module is unaffected and performs normally without the running stats being reset manually.
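To narrow down why only one module is affected, I have been using a small diagnostic along these lines (a sketch; batch_norm_stats_gap and the hook names are my own, not from the linked post). It compares each BatchNorm layer's stored running_mean against the mean actually observed on a representative batch; a large gap would suggest that layer's running stats are stale or mismatched, which could explain why only that module degrades in eval() mode.

```python
import torch
import torch.nn as nn

def batch_norm_stats_gap(model, batch):
    """Return, per BatchNorm layer, the max abs gap between the stored
    running_mean and the mean of that layer's input on `batch`."""
    activations = {}

    def make_hook(name):
        def hook(module, inputs, output):
            # Capture the tensor the BN layer actually normalizes.
            activations[name] = inputs[0].detach()
        return hook

    handles = []
    for name, m in model.named_modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            handles.append(m.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        model(batch)
    for h in handles:
        h.remove()

    gaps = {}
    for name, m in model.named_modules():
        if name in activations:
            x = activations[name]
            dims = [d for d in range(x.dim()) if d != 1]  # all but channel dim
            gaps[name] = (m.running_mean - x.mean(dims)).abs().max().item()
    return gaps

# Usage on a toy model (stand-in for one of my two modules):
model = nn.Sequential(nn.Conv1d(2, 4, 3, padding=1), nn.BatchNorm1d(4))
print(batch_norm_stats_gap(model, torch.randn(8, 2, 16)))
```

Running this on each of the two modules separately should show whether the "broken" one simply has running stats far from its actual activation statistics.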
I am really confused now and would appreciate any suggestions or discussion.