A module produces different gradients for the same input?

Gradients are accumulated by default, so in the repeat case what you are seeing is the gradient of the first call plus the gradient of the second call.

Before the line # repeat without updating the weight, insert this call:

modL.zero_grad()
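A minimal sketch of the accumulation behavior described above, using a hypothetical `modL` (the name from the question, assumed here to be a small `nn.Linear`): calling `backward()` twice without zeroing sums the gradients into `.grad`, while `zero_grad()` resets them so the second call matches the first.

```python
import torch

# Hypothetical stand-in for the module from the question.
modL = torch.nn.Linear(3, 1)
x = torch.randn(2, 3)

modL(x).sum().backward()
g1 = modL.weight.grad.clone()  # gradient of the first call

# Repeat WITHOUT zeroing: the new gradient is added into .grad.
modL(x).sum().backward()
g2 = modL.weight.grad.clone()  # first + second call, i.e. 2 * g1 here

# Zero the grads, then repeat: now we get the first-call gradient again.
modL.zero_grad()
modL(x).sum().backward()
g3 = modL.weight.grad.clone()

print(torch.allclose(g2, 2 * g1))  # accumulated
print(torch.allclose(g3, g1))      # matches the first call after zero_grad()
```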