How to calculate the FLOPs of a BatchNorm layer?

Hi, I’m trying to calculate the FLOPs of a BatchNorm2d layer. Can anyone tell me how it is implemented in PyTorch? Is it always automatically fused into the previous conv layer or not?

I’m not sure a “true” FLOPs number can be calculated, as it depends on many variables (CUDA/MKL version, driver version, model params, etc.). But the profiling tools might give you a good idea of the computational cost.

Thanks for your reply! Will take a look at the tools you mentioned.

How about this one? They support batchnorm as well.

Thanks for the reply. I’ve checked their code, but it seems they only implemented it for inference, while I was trying to get a full estimate of the FLOPs during training. But I guess, just as tymokvo said, without knowing how batchnorm is implemented in cuDNN, I can’t make any reasonable estimate.
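
For a rough, implementation-independent number, one option is to simply count elementwise operations from the batchnorm formula. The sketch below does only that. The helper name `batchnorm2d_flops` and the per-element op counts are assumptions made for this example, not how PyTorch or cuDNN actually compute the layer. It covers the forward pass only and ignores per-channel work such as the square root and running-stat updates.

```python
def batchnorm2d_flops(n, c, h, w, training=False, affine=True):
    """Rough forward-pass FLOP estimate for BatchNorm2d on an (n, c, h, w) input.

    Assumption: each elementwise add/sub/mul counts as one FLOP, and
    per-channel work (sqrt, running-stat updates) is negligible.
    """
    elems = n * c * h * w
    if not training:
        # With fixed running statistics the layer reduces to
        # y = x * scale + shift: one multiply and one add per element.
        return 2 * elems
    per_elem = 0
    per_elem += 1  # batch mean: accumulate x
    per_elem += 3  # batch variance: subtract mean, square, accumulate
    per_elem += 2  # normalize: subtract mean, multiply by 1/sqrt(var + eps)
    if affine:
        per_elem += 2  # scale by gamma, shift by beta
    return per_elem * elems


# Example: a 64-channel, 56x56 feature map with batch size 1.
inference_flops = batchnorm2d_flops(1, 64, 56, 56)
training_flops = batchnorm2d_flops(1, 64, 56, 56, training=True)
```

Under these assumptions the training forward pass costs about 4x the inference one (8 vs. 2 operations per element). The backward pass would add more on top, and a real kernel may fuse or reorder these steps, so treat the result as an order-of-magnitude figure only.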