Batch Normalization

I was wondering whether batch normalization should be applied before or after the activation layer. Also, does it really have an effect in shallow CNNs?
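
To make the two orderings concrete, here is a minimal sketch in plain Python. It is only an illustration under simplifying assumptions: `batch_norm` is a hypothetical helper that normalizes one channel across a batch, it omits the learned scale and shift and the running statistics that real framework layers keep, and the input values are made up.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Normalize a list of scalars to zero mean and (roughly) unit variance.

    Simplified: no learned scale/shift, no running statistics.
    """
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

def relu(batch):
    """Element-wise ReLU activation."""
    return [max(0.0, x) for x in batch]

# Made-up pre-activations of one channel across a batch of four samples.
pre_activations = [-2.0, -1.0, 1.0, 2.0]

# Option A: conv -> BN -> ReLU
bn_then_relu = relu(batch_norm(pre_activations))

# Option B: conv -> ReLU -> BN
relu_then_bn = batch_norm(relu(pre_activations))

print("BN then ReLU:", bn_then_relu)
print("ReLU then BN:", relu_then_bn)
```

In this sketch, option A hands the next layer only non-negative values, while option B hands it zero-mean values, some of them negative. That difference in what the following layer receives is what the ordering question comes down to.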