`model.zero_grad()` only resets the `.grad` of the model's parameters to 0.
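
A minimal sketch of what that means in practice, assuming a toy `nn.Linear` model and an input tensor `x` that requires grad (both are made up for illustration). Depending on the PyTorch version, the parameter gradients end up as zeros or as `None`, so the check below accepts either:

```python
import torch
import torch.nn as nn

# Hypothetical toy model and input, only for illustration
model = nn.Linear(3, 1)
x = torch.randn(4, 3, requires_grad=True)

# Populate gradients for both the parameters and the input
model(x).sum().backward()

model.zero_grad()

# Parameter gradients are reset (zeros or None, depending on the version)
params_reset = all(
    p.grad is None or bool(torch.all(p.grad == 0))
    for p in model.parameters()
)

# The input is not a parameter of the model, so its gradient is left alone
input_grad_kept = x.grad is not None

print(params_reset, input_grad_kept)
```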

You can store the gradient in a separate variable using a closure. See this post for sample code: "Why can't I see .grad of an intermediate variable?"
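
For reference, the closure approach could look roughly like the sketch below. It is not the code from the linked post; the `grads` dict and the `save_grad` helper are names chosen here for illustration, and it relies on `Tensor.register_hook`:

```python
import torch

grads = {}  # hypothetical container for the captured gradients

def save_grad(name):
    # Returns a hook that closes over `name` and stores the incoming gradient
    def hook(grad):
        grads[name] = grad
    return hook

x = torch.ones(2, 2, requires_grad=True)
y = x * 2                        # intermediate (non-leaf) tensor
y.register_hook(save_grad("y"))  # the hook runs when y's gradient is computed

z = (y * y).sum()
z.backward()

print(grads["y"])  # gradient of z with respect to y
```

The hook is called during `backward()` with the gradient flowing into `y`, so `grads["y"]` holds it afterwards even though `y.grad` is not retained by default for non-leaf tensors.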