Hi, thanks for the prompt reply!
I’m using this to replace BatchNorm2D, so I would need a running std estimate. The approaches you mentioned don’t quite fit. I’ve edited the title accordingly.
I can adapt your manual implementation given here, but there’s a performance (compute/memory) degradation compared to native BatchNorm2D.
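
To make the requirement concrete, here is a rough sketch of the kind of layer I mean. This is not the implementation linked above: the class name `RunningStdNorm2d`, the `momentum` and `eps` defaults, and the choice to divide by the std without subtracting the mean are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class RunningStdNorm2d(nn.Module):
    """Hypothetical layer: scales NCHW input by a per-channel std estimate."""

    def __init__(self, num_features, eps=1e-5, momentum=0.1):
        super().__init__()
        self.eps = eps
        self.momentum = momentum
        # Buffers follow .to(device) and are saved in state_dict, but are not trained.
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        if self.training:
            # Per-channel variance over the batch and spatial dimensions.
            var = x.var(dim=(0, 2, 3), unbiased=False)
            with torch.no_grad():
                # Exponential moving average of the batch variance.
                self.running_var.mul_(1 - self.momentum).add_(self.momentum * var.detach())
        else:
            # At eval time, use the accumulated estimate instead of batch statistics.
            var = self.running_var
        std = torch.sqrt(var + self.eps)
        return x / std.view(1, -1, 1, 1)
```

Something like this runs as several separate tensor ops in Python, whereas native BatchNorm2D goes through a fused kernel, which is probably where the compute/memory gap comes from.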