I understand that batchnorm stats are buffers. I actually ran into the same problem as in this post. In the answers you suggested disabling buffer synchronization. But that means the model on each GPU will compute its stats from its own small batch (is that right?), and I want to update batchnorm with stats computed over the batches from all GPUs.
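
To make the difference concrete, here is a minimal sketch in plain Python (no PyTorch involved). The two shards and their values are made up for illustration, and `local_stats` / `global_stats` are hypothetical helper names, not anything from the library. It only shows that per-GPU stats differ from the stats of the combined batch, and that the combined stats can be recovered from each GPU's count, mean, and variance.

```python
from statistics import fmean, pvariance

# Hypothetical activations for one channel, split across two GPUs.
shards = [[1.0, 2.0, 3.0], [10.0, 12.0, 14.0]]


def local_stats(shard):
    """Mean and biased variance that one replica sees on its own small batch."""
    return fmean(shard), pvariance(shard)


def global_stats(shards):
    """Combine per-replica (count, mean, var) into stats of the whole batch."""
    counts = [len(s) for s in shards]
    stats = [local_stats(s) for s in shards]
    total = sum(counts)
    mean = sum(n * m for n, (m, _) in zip(counts, stats)) / total
    # Per shard, E[x^2] = var + mean^2; pool it, then subtract the global mean squared.
    second_moment = sum(n * (v + m * m) for n, (m, v) in zip(counts, stats)) / total
    return mean, second_moment - mean * mean


per_gpu = [local_stats(s) for s in shards]   # what each replica would use on its own
combined = global_stats(shards)              # what I would like batchnorm to use

print("per-GPU (mean, var):", per_gpu)
print("all-GPU (mean, var):", combined)
```

With unsynchronized buffers, each replica would only ever see its own entry of `per_gpu`; what I am after is the `combined` value on every replica.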