Tuning batch normalisation

Sravani_Lekkala · June 3, 2020, 2:56pm

Do we often tune eps and momentum in batch norm in practice?

googlebot · June 3, 2020, 7:59pm

eps is there for numerical stability, so it is unusual to tune it. momentum implicitly defines how many of the most recent samples are used to estimate the (population) moments, so it can be reasonable to decrease momentum if the estimates are not changing smoothly (for example, due to a small batch size).
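To make the effect of momentum concrete, here is a minimal pure-Python sketch. It assumes the PyTorch-style update rule, running = (1 - momentum) * running + momentum * batch_stat, and tracks only a scalar running mean. The helper names (running_mean_trace, late_spread), the batch size of 4, and the unit-variance Gaussian data are illustrative assumptions, not anything taken from the library.

```python
import random
import statistics


def running_mean_trace(momentum, batch_size, steps=2000, seed=0):
    """Track a scalar running mean with the assumed PyTorch-style update."""
    rng = random.Random(seed)
    running = 0.0  # running statistics start at zero in this sketch
    trace = []
    for _ in range(steps):
        # Synthetic batch: true mean 1.0, standard deviation 1.0 (assumed data)
        batch = [rng.gauss(1.0, 1.0) for _ in range(batch_size)]
        batch_mean = sum(batch) / batch_size
        # Smaller momentum puts less weight on the newest (noisy) batch
        running = (1.0 - momentum) * running + momentum * batch_mean
        trace.append(running)
    return trace


def late_spread(trace, tail=1000):
    """Standard deviation of the estimate over the last `tail` steps."""
    return statistics.pstdev(trace[-tail:])


spread_default = late_spread(running_mean_trace(momentum=0.1, batch_size=4))
spread_small = late_spread(running_mean_trace(momentum=0.01, batch_size=4))
print(f"momentum=0.1:  spread {spread_default:.4f}")
print(f"momentum=0.01: spread {spread_small:.4f}")
```

With small batches, the estimate under the smaller momentum should fluctuate noticeably less, at the cost of reacting more slowly when the true statistics change.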

Sravani_Lekkala · June 4, 2020, 6:05am

So it would be better to decrease the momentum when we are using a small batch size, to improve the accuracy?

googlebot · June 4, 2020, 3:13pm

Well, it won’t hurt to keep, say, batch_size/momentum constant when decreasing the batch size.
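As a rough sketch of that rule of thumb, the hypothetical helper below scales momentum linearly with batch size so that batch_size/momentum stays fixed. The reference values (batch size 32, momentum 0.1) are assumptions chosen for illustration; 0.1 happens to be the PyTorch default for momentum, but the reference batch size is arbitrary.

```python
def scaled_momentum(batch_size, ref_batch_size=32, ref_momentum=0.1):
    """Return a momentum that keeps batch_size / momentum constant.

    ref_batch_size and ref_momentum are assumed reference settings,
    not values prescribed by any library.
    """
    momentum = ref_momentum * batch_size / ref_batch_size
    # Momentum is a mixing weight, so cap it at 1.0 for very large batches
    return min(1.0, momentum)


for bs in (32, 16, 8, 4):
    print(bs, scaled_momentum(bs))
```

The result could then be passed as the momentum argument when constructing the layer, e.g. torch.nn.BatchNorm2d(num_features, momentum=scaled_momentum(8)).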

I’d rather say that you can sometimes fix batchnorm problems by using a smaller momentum, i.e. you can only improve subpar results, and only when batchnorm is the issue.