Essentially, after you have computed the gradients with backward() and updated the weights, you need to clear the gradients. Otherwise PyTorch keeps accumulating them: each backward() call adds the new gradients to the values already stored in .grad instead of overwriting them, so the next update would use the sum of the gradients from every pass so far. For more details, see the following discussion: Why do we need to set the gradients manually to zero in pytorch?
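To make the accumulation concrete, here is a minimal sketch. The tensor `w`, the toy `loss_fn`, and the small `Linear` model with random data are illustrative assumptions, not part of the original question. The first half shows `.grad` doubling when it is not cleared, and the second half shows where the clearing step could go in a training loop.

```python
import torch

# Hypothetical parameter, chosen so the gradient is easy to check by hand.
w = torch.tensor([1.0, 2.0], requires_grad=True)

def loss_fn(w):
    # The gradient of sum(3 * w) with respect to w is 3 for every element.
    return (3.0 * w).sum()

# First backward pass: .grad is populated.
loss_fn(w).backward()
grad_first = w.grad.clone()          # [3., 3.]

# Second backward pass without clearing: the new gradient is added to the old one.
loss_fn(w).backward()
grad_accumulated = w.grad.clone()    # [6., 6.]

# Clear the stored gradient in place, then run backward again.
w.grad.zero_()
loss_fn(w).backward()
grad_after_reset = w.grad.clone()    # [3., 3.]

# The same idea in a training loop (toy model and random data, for illustration only).
model = torch.nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(4, 2), torch.randn(4, 1)

for _ in range(3):
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()        # compute gradients
    optimizer.step()       # update the weights
    optimizer.zero_grad()  # clear gradients so the next pass starts fresh
```

Calling `optimizer.zero_grad()` at the start of each iteration instead of the end works just as well; what matters is that the gradients are cleared between one update and the next backward pass.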