How does BatchNorm keep track of running_mean?
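
For reference, BatchNorm tracks `running_mean` (and `running_var`) as an exponential moving average of the per-batch statistics, updated during forward passes in training mode and frozen in eval mode. A minimal sketch, assuming the default `momentum=0.1`:

```python
import torch
import torch.nn as nn

# BatchNorm1d keeps running_mean as an exponential moving average of
# per-batch means; the default momentum is 0.1.
bn = nn.BatchNorm1d(3, momentum=0.1)
x = torch.randn(8, 3)

bn.train()
# Expected update: (1 - momentum) * old_running_mean + momentum * batch_mean
expected = (1 - bn.momentum) * bn.running_mean + bn.momentum * x.mean(dim=0)
bn(x)  # forward pass in train mode updates the running statistics
print(torch.allclose(bn.running_mean, expected))  # True

bn.eval()
before = bn.running_mean.clone()
bn(x)  # in eval mode the running statistics are used but not updated
print(torch.allclose(bn.running_mean, before))  # True
```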

Ok, I am now playing with these complex networks (or at least my students are). It is still a work in progress, but it is available on my GitHub: https://github.com/wavefrontshaping/complexPyTorch

Thanks for the amazing support,

Sebastien

@ptrblck how do running_mean and running_var work when using PyTorch batch norm with multiple GPUs? Does each GPU calculate the update on its own, with the results somehow averaged across GPUs, or does only the master GPU set those parameters for the next batch?

The “normal” batch norm layer would use the latter approach, while SyncBatchNorm would synchronize the batch statistics between devices.
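
To illustrate, a minimal sketch of switching a model to SyncBatchNorm; the toy model is just an example, and SyncBatchNorm only actually synchronizes statistics when the model runs under DistributedDataParallel with an initialized process group:

```python
import torch.nn as nn

# A toy model with a plain BatchNorm layer (illustrative only).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# Replaces every BatchNorm*d layer with nn.SyncBatchNorm, so batch
# statistics are averaged across processes instead of kept per device.
sync_model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
print(sync_model)
```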


Thank you @ptrblck for replying.