Normalization on width dimension only

Could you try writing the normalization manually out of elementwise and reduction ops, and letting nvFuser code-gen the fused kernel for you, similar to this use case?
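For illustration, here is a minimal sketch of what "manually written" could look like in PyTorch, assuming an `(N, C, H, W)` input where width is the last dimension. The function name `width_norm` and the `eps` default are my own choices, not anything from nvFuser, and there is no learnable scale or shift. How the function is handed to nvFuser depends on your PyTorch version, so that step is left out here.

```python
import torch


def width_norm(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Normalize over the last (width) dimension only.

    Hypothetical helper: written with plain reductions and pointwise ops
    so that a fusing compiler has a chance to generate a single kernel.
    """
    # Per-row mean over W; keepdim so it broadcasts back against x.
    mean = x.mean(dim=-1, keepdim=True)
    centered = x - mean
    # Biased (population) variance over W, the same convention layer_norm uses.
    var = (centered * centered).mean(dim=-1, keepdim=True)
    return centered * torch.rsqrt(var + eps)


# Small example input with shape (N, C, H, W).
x = torch.randn(2, 3, 4, 8)
y = width_norm(x)
```

On eager PyTorch this should match `torch.nn.functional.layer_norm(x, (x.shape[-1],))` up to floating-point tolerance, which makes for a quick sanity check before trying it under a fuser.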