BatchNorm unstable in 1D ResUNet

Hi guys,

so I'm training a 1D ResUNet as a denoiser. I more or less automatically used BatchNorm and set the Conv bias to False, since everyone says this should stabilize training. My architecture is roughly Conv → BN → PReLU → ResBlock (3x (Conv → BN → PReLU)). But when I checked one of my models without BatchNorm, and with the Conv bias set to True, I saw very weird behavior: the run without BN (right plot) is much more stable in train/val loss. Do you happen to have an explanation for this? Thank you.
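For reference, here is how I read your block structure as a minimal PyTorch sketch (the channel counts and kernel sizes are placeholders I chose, not from your post):

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    # 3x (Conv1d -> BatchNorm1d -> PReLU) with a skip connection,
    # as I read the architecture description in the question
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(*[
            layer
            for _ in range(3)
            for layer in (
                nn.Conv1d(ch, ch, kernel_size=3, padding=1, bias=False),  # bias off under BN
                nn.BatchNorm1d(ch),
                nn.PReLU(),
            )
        ])

    def forward(self, x):
        return x + self.body(x)

# stem: Conv -> BN -> PReLU, then one residual block
stem = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm1d(32),
    nn.PReLU(),
    ResBlock(32),
)
```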

Cheers

If you’re using BatchNorm right after a Conv layer, the Conv bias becomes redundant: BN subtracts the per-channel batch mean, which cancels any constant bias, and BN’s own learnable shift (beta) takes over that role anyway. So removing it just saves a little memory/compute. Instability could also become an issue if you didn’t implement gradient clipping.
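To see why the bias is redundant, here is a small NumPy sketch (my own illustration, not from the thread): because batch normalization subtracts the per-channel mean, any constant bias added before it cancels out exactly.

```python
import numpy as np

def batchnorm(x, eps=1e-5):
    # normalize over the batch axis, per channel (no learnable gamma/beta)
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.standard_normal((256, 16))    # (batch, channels) pre-activations
bias = rng.standard_normal((1, 16))   # a hypothetical per-channel conv bias

# the mean subtraction removes the constant bias, so both paths match
assert np.allclose(batchnorm(x), batchnorm(x + bias))
```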

But as to why you’re seeing better results without BatchNorm, this could be due to a small batch size.

It could also be that BatchNorm is somewhat counterproductive for denoising tasks: the random noise varies from sample to sample, and BN mixes it into normalization statistics shared across the whole batch. Perhaps consider changing it to InstanceNorm instead.
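To illustrate the difference (my own sketch, not from the original post): InstanceNorm computes statistics per sample, so one sample’s output doesn’t depend on what else is in the batch, whereas with BatchNorm every sample’s noise leaks into the shared statistics.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # x: (batch, channels, length); stats per sample, per channel
    mu = x.mean(axis=2, keepdims=True)
    var = x.var(axis=2, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def batch_norm(x, eps=1e-5):
    # stats shared across the whole batch (and length), per channel
    mu = x.mean(axis=(0, 2), keepdims=True)
    var = x.var(axis=(0, 2), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 8, 128))

# InstanceNorm of sample 0 is identical whether or not the rest of the batch is present
assert np.allclose(instance_norm(x)[:1], instance_norm(x[:1]))

# BatchNorm of sample 0 changes when its batchmates change, so their noise leaks in
assert not np.allclose(batch_norm(x)[:1], batch_norm(x[:1]))
```

In PyTorch this corresponds to swapping `nn.BatchNorm1d` for `nn.InstanceNorm1d`.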

PReLU might also be interacting counterproductively with the BatchNorm layer. Since BatchNorm recenters and rescales activations, it changes how much of the input lands on the negative side of zero, which is exactly where PReLU is trying to learn its slope.
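As a small illustration of that interaction (my own NumPy sketch, with made-up numbers): recentering by BN changes which side of zero the activations fall on, i.e. how much of the distribution PReLU’s learned negative slope actually sees.

```python
import numpy as np

def batchnorm(x, eps=1e-5):
    # per-channel normalization over the batch axis
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def prelu(x, a=0.25):
    # PReLU: identity for x > 0, learned slope a for x <= 0
    return np.where(x > 0, x, a * x)

rng = np.random.default_rng(2)
x = rng.standard_normal((256, 16)) + 2.0  # pre-activations with a positive offset

# before BN almost nothing is negative; after BN roughly half is,
# so the PReLU negative branch suddenly sees a large share of the inputs
frac_neg_before = (prelu(x) < 0).mean()
frac_neg_after = (prelu(batchnorm(x)) < 0).mean()
assert frac_neg_before < 0.05 and 0.4 < frac_neg_after < 0.6
```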

Thanks for your reply.

Since I only have 1D signals, I can use a batch size of 256, so I don’t think the problem comes from the batch size.

And yes, when I enable BatchNorm I set bias=False on the conv layers, and then the train/val loss becomes very noisy, like in the left plot.