I came across the Performance Tuning Guide and read that the bias can be disabled (bias=False) for a Conv2d that is directly followed by a BatchNorm2d.
Does the same hold for LayerNorm and GroupNorm? And what about Linear followed by LayerNorm: can I set bias=False when LayerNorm comes right after Linear? How can I check whether the bias term is actually required?
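
One way I could imagine checking this numerically is to compare the normalized output with a nonzero bias against the output with the bias zeroed out. This is only a rough sketch: the helper name `bias_is_redundant`, the layer sizes, and the tolerance are my own assumptions and not something taken from the guide, and it only tests the forward pass in training mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def bias_is_redundant(layer, norm, x, atol=1e-4):
    """Hypothetical helper: True if zeroing layer.bias leaves norm(layer(x)) unchanged."""
    norm.train()  # BatchNorm then uses batch statistics; LayerNorm/GroupNorm are unaffected
    with torch.no_grad():
        layer.bias.normal_()           # make the bias clearly nonzero
        with_bias = norm(layer(x))
        layer.bias.zero_()             # emulate bias=False
        without_bias = norm(layer(x))
    return torch.allclose(with_bias, without_bias, atol=atol)

x_img = torch.randn(8, 3, 16, 16)  # (N, C, H, W) input for the conv cases
x_vec = torch.randn(8, 32)         # (N, features) input for the linear case

conv_bn = bias_is_redundant(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), x_img)
conv_gn = bias_is_redundant(nn.Conv2d(3, 16, 3), nn.GroupNorm(4, 16), x_img)
linear_ln = bias_is_redundant(nn.Linear(32, 64), nn.LayerNorm(64), x_vec)

print("Conv2d -> BatchNorm2d :", conv_bn)
print("Conv2d -> GroupNorm   :", conv_gn)
print("Linear -> LayerNorm   :", linear_ln)
```

My understanding is that a per-channel bias only cancels when the normalization subtracts a mean computed separately for that same channel, which is the case for BatchNorm2d. LayerNorm and GroupNorm (with more than one channel per group) average across channels or features, so a bias that differs per channel would not cancel, and I would expect the check to report that the bias still changes the output there.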