PyTorch Forums
What is the running mean of BatchNorm if gradients are accumulated?
vision
crcrpar
(Masaki Kozuki)
May 30, 2018, 4:53am
Yes.
The accumulated gradients will match the full-batch gradients if you divide them by the number of accumulation iterations. I referred to the post below.
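A minimal sketch of this claim, assuming equal-sized micro-batches and a mean-reduced loss. It uses a plain `nn.Linear` model so that batch statistics do not interfere; with BatchNorm layers the per-micro-batch batch statistics (and running-stat updates) would make the equality only approximate.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
data = torch.randn(8, 4)
target = torch.randn(8, 1)
loss_fn = torch.nn.MSELoss()  # mean reduction

# Gradient from one full batch of 8 samples
model.zero_grad()
loss_fn(model(data), target).backward()
full_grad = model.weight.grad.clone()

# Accumulate gradients over 2 micro-batches of 4 samples each,
# then divide by the number of accumulation iterations
model.zero_grad()
for chunk_x, chunk_y in zip(data.chunk(2), target.chunk(2)):
    loss_fn(model(chunk_x), chunk_y).backward()
accum_grad = model.weight.grad / 2

print(torch.allclose(full_grad, accum_grad, atol=1e-6))  # True
```

Equivalently, many training loops divide the loss by the accumulation count before each `backward()` call instead of rescaling the gradients at the end; the resulting gradients are the same.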