You do call optimizer[i].zero_grad()
(or the model-level equivalent, model.zero_grad()) at every iteration of your epoch, right? If not, you can check this discussion: Why do we need to set the gradients manually to zero in pytorch?
Otherwise, it looks good!
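For reference, here is a minimal sketch of where the zero_grad() call usually goes; names like `models`, `optimizers`, and `loader` are placeholders for whatever you have in your own loop, since your actual code isn't shown here:

```python
import torch
import torch.nn as nn

# Placeholder setup: a few small models, one optimizer per model,
# and a fake data loader. Substitute your own objects here.
models = [nn.Linear(10, 1) for _ in range(3)]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.01) for m in models]
criterion = nn.MSELoss()
loader = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(5)]

for epoch in range(2):
    for inputs, targets in loader:
        for i, model in enumerate(models):
            optimizers[i].zero_grad()  # clear gradients left over from the previous step
            loss = criterion(model(inputs), targets)
            loss.backward()            # accumulate fresh gradients
            optimizers[i].step()       # update this model's parameters
```

Without the zero_grad() call, backward() keeps adding onto the gradients from previous iterations, which is the accumulation behavior the linked discussion explains.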