I stumbled upon the Performance Tuning Guide and read that the bias can be set to False when using Conv2d directly followed by BatchNorm2d. I wondered whether the same holds for LayerNorm and GroupNorm. And what about Linear followed by LayerNorm? Can I set bias=False when LayerNorm comes right after Linear? How can I check whether the bias term is actually required?
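
One way to check this empirically might be to perturb the bias and see whether the output of the normalization layer changes at all. The snippet below is only a minimal sketch under a few assumptions: `bias_is_redundant` is a hypothetical helper name (not a PyTorch API), the layer sizes and input shapes are arbitrary, and the norm layer is put in training mode so that BatchNorm2d uses batch statistics.

```python
import torch
import torch.nn as nn


def bias_is_redundant(layer, norm, x, atol=1e-4):
    """Hypothetical helper: True if changing layer.bias leaves norm(layer(x)) unchanged."""
    norm.train()  # BatchNorm must normalize with batch statistics for this check
    with torch.no_grad():
        before = norm(layer(x))
        # Shift every bias entry by a different random amount.
        layer.bias.add_(torch.randn_like(layer.bias))
        after = norm(layer(x))
    return torch.allclose(before, after, atol=atol)


torch.manual_seed(0)
images = torch.randn(4, 3, 16, 16)   # arbitrary example input for the conv cases
features = torch.randn(4, 10)        # arbitrary example input for the linear case

conv_bn = bias_is_redundant(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), images)
conv_gn = bias_is_redundant(nn.Conv2d(3, 8, 3), nn.GroupNorm(2, 8), images)
linear_ln = bias_is_redundant(nn.Linear(10, 8), nn.LayerNorm(8), features)

print("Conv2d -> BatchNorm2d:", conv_bn)
print("Conv2d -> GroupNorm:  ", conv_gn)
print("Linear -> LayerNorm:  ", linear_ln)
```

If I understand the math correctly, only the Conv2d -> BatchNorm2d case should report the bias as redundant: BatchNorm2d subtracts a per-channel mean, which cancels a per-channel bias, whereas LayerNorm and GroupNorm (with more than one channel per group) average across features or channels that each carry a different bias value, so the bias still affects the output.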