I’m trying to train a model with batch normalization.
However, a single sample consumes a lot of memory, so I cannot train with a batch size large enough for batch normalization to work well.
So I’m thinking of the following steps:
- Feed some samples through the network and accumulate the running mean, variance, and the other BatchNorm statistics.
- Copy those statistics into the model, then switch the batchnorm layers to eval mode (see the sketch below).
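Here is a minimal sketch of what I have in mind; the `nn.Sequential` model and the random tensors are just placeholders for illustration:

```python
import torch
import torch.nn as nn

# Placeholder model for illustration only
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# Step 1: accumulate running statistics with small batches.
# In train mode, BatchNorm updates running_mean/running_var
# via its momentum on every forward pass; no_grad is enough
# here because only the buffers need updating.
model.train()
with torch.no_grad():
    for _ in range(100):
        x = torch.randn(2, 3, 32, 32)  # small batch, placeholder data
        model(x)

# Step 2: freeze the statistics by switching only the batchnorm
# layers to eval mode while the rest of the model stays in train mode.
def set_bn_eval(m):
    if isinstance(m, nn.modules.batchnorm._BatchNorm):
        m.eval()

model.train()
model.apply(set_bn_eval)
```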
Here are my questions:
- Even if the batchnorm layers are in eval mode, does autograd still work correctly?
- Is there an efficient way to achieve this process with PyTorch or any other functions?
Any help would be appreciated.