I am working with a script where each training instance has a different number of points, so when I apply DDP with SyncBN, how are the BN stats on different GPUs synced? I have traced the source code to https://github.com/pytorch/pytorch/blob/a4a5b6fcaae26fe241d32a7c4b2091ee69b600bb/torch/nn/modules/_functions.py L33-L43:
```python
# calculate global mean & invstd
mean, invstd = torch.batch_norm_gather_stats_with_counts(
    input,
    mean_all,
    invstd_all,
    running_mean,
    running_var,
    momentum,
    eps,
    count_all.view(-1).long().tolist(),
)
```
In my case, `mean_all` and `invstd_all` should be combined as a weighted average according to the different `counts` on each GPU; is that what actually happens?
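For reference, here is a minimal sketch of what I would expect the count-weighted combination to look like (`combine_stats` is my own hypothetical name, not a PyTorch API); it recovers the per-GPU variances from the inverse standard deviations and applies the standard parallel-variance formula:

```python
import torch

def combine_stats(mean_all, invstd_all, count_all, eps):
    """Hypothetical helper (my own name, not a PyTorch API): combine
    per-GPU BN statistics into global ones, weighted by element counts.

    mean_all:   (world_size, C) per-GPU channel means
    invstd_all: (world_size, C) per-GPU 1 / sqrt(var + eps)
    count_all:  (world_size,)   number of elements each GPU reduced over
    """
    counts = count_all.view(-1, 1).to(mean_all.dtype)  # (world_size, 1)
    total = counts.sum()

    # count-weighted global mean
    mean = (counts * mean_all).sum(dim=0) / total

    # recover per-GPU (biased) variances from the inverse std devs
    var_all = 1.0 / (invstd_all * invstd_all) - eps

    # parallel variance: within-GPU variance plus the count-weighted
    # spread of the per-GPU means around the global mean
    var = (counts * (var_all + (mean_all - mean) ** 2)).sum(dim=0) / total
    return mean, torch.rsqrt(var + eps)
```

With counts like [100, 300], the GPU holding 300 points would then contribute three times as much weight to both the global mean and the global variance.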
BTW, the SyncBN in NVIDIA apex seems to simply average `mean_all` and `invstd_all`, which does not support different counts across GPUs.
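That is, if I read the apex code correctly, something equivalent to:

```python
# unweighted average across GPUs: only correct when every GPU
# reduces over the same number of elements per channel
mean = mean_all.mean(dim=0)
invstd = invstd_all.mean(dim=0)
```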
Thanks very much