Training multiple models on one GPU simultaneously

You do call optimizer[i].zero_grad() (or the equivalent model[i].zero_grad()) at every iteration of your epoch, right? If you don't, you can check this discussion: Why do we need to set the gradients manually to zero in pytorch?
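
For reference, here is a minimal sketch of what I mean, assuming the models and optimizers are kept in lists as your optimizer[i] indexing suggests (the model sizes, learning rate, and data below are just placeholders):

```python
import torch
import torch.nn as nn

# Hypothetical setup: two small models trained side by side on the same GPU.
device = torch.device("cuda")
models = [nn.Linear(10, 1).to(device) for _ in range(2)]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.01) for m in models]
criterion = nn.MSELoss()

for epoch in range(5):
    for _ in range(100):  # batches per epoch (dummy data here)
        x = torch.randn(32, 10, device=device)
        y = torch.randn(32, 1, device=device)
        for model, optimizer in zip(models, optimizers):
            optimizer.zero_grad()            # clear gradients left over from the previous step
            loss = criterion(model(x), y)
            loss.backward()                  # accumulate fresh gradients for this model only
            optimizer.step()                 # update this model's parameters
```

The key point is that each optimizer only sees its own model's parameters, so zeroing and stepping them independently inside the loop keeps the gradients of the different models from accumulating across iterations.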

Otherwise, it looks good!