TL;DR
Accumulate gradients over multiple minibatches when training a GAN with limited memory.
Hi everybody,
I would like to accumulate gradients over multiple minibatches (as described by @albanD in https://discuss.pytorch.org/t/why-do-we-need-to-set-the-gradients-manually-to-zero-in-pytorch/4903/18) when training a GAN.
Concretely, I would like to call optimizerG.step() only every 4 batches, say, and accumulate gradients for the generator as described in albanD's second example in the answer linked above.
In the DCGAN tutorial (https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html#training), netG.zero_grad() is called on every batch just before the generator update, which prevents the gradients from accumulating.
Is there any way I can deal with this?
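
For concreteness, here is roughly the loop I have in mind: a minimal sketch based on the tutorial's training loop, assuming the usual netG, netD, optimizerG, optimizerD, criterion (BCELoss), and dataloader objects from the tutorial. The accumulation_steps parameter and the train_one_epoch wrapper are names I made up. If I understand albanD's second example correctly, the generator loss should also be divided by accumulation_steps so the accumulated gradient matches the average over the larger "virtual" batch.

```python
import torch

def train_one_epoch(netG, netD, optimizerG, optimizerD, criterion, dataloader,
                    device, nz=100, accumulation_steps=4):
    # Variable names follow the DCGAN tutorial; accumulation_steps and this
    # wrapper function are my own additions.
    real_label, fake_label = 1., 0.

    for i, (real, _) in enumerate(dataloader):
        real = real.to(device)
        b_size = real.size(0)

        # (1) Update D on every batch, exactly as in the tutorial.
        netD.zero_grad()
        label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
        output = netD(real).view(-1)
        errD_real = criterion(output, label)
        errD_real.backward()

        noise = torch.randn(b_size, nz, 1, 1, device=device)
        fake = netG(noise)
        label.fill_(fake_label)
        output = netD(fake.detach()).view(-1)
        errD_fake = criterion(output, label)
        errD_fake.backward()
        optimizerD.step()

        # (2) Accumulate G's gradients; do NOT call netG.zero_grad() here.
        label.fill_(real_label)  # the generator wants D to output "real"
        output = netD(fake).view(-1)
        # Scale the loss so the accumulated gradient approximates the
        # average over the larger virtual batch.
        errG = criterion(output, label) / accumulation_steps
        errG.backward()

        # (3) Step and clear G's gradients only every accumulation_steps batches.
        if (i + 1) % accumulation_steps == 0:
            optimizerG.step()
            netG.zero_grad()
```

One thing I am unsure about is what to do with the leftover generator gradients when the number of batches in an epoch is not divisible by accumulation_steps.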