Which way is correct for optimizing a DCGAN?

In the DCGAN example, it is done as:

optimizer_D.zero_grad()
loss_real = compute_loss_real()
loss_real.backward()

loss_fake = compute_loss_fake()
loss_fake.backward()
optimizer_D.step()

I also found another implementation that does:

optimizer_D.zero_grad()
loss_real = compute_loss_real()
loss_fake = compute_loss_fake()
loss_total = (loss_real + loss_fake) / 2.0
loss_total.backward()
optimizer_D.step()

Which is the correct way? Is there any benefit to using the second way? My code gives a small gain with the second way.

The reference for the first way: https://github.com/pytorch/examples/blob/master/dcgan/main.py#L217
The reference for the second way: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/blob/master/models/pix2pix_model.py#L79
Thanks

Hi,

Gradients are accumulated until optimizer.zero_grad() is called.
So in general the two implementations behave the same way, but the scale of the gradients differs: the first way backpropagates the sum of the two losses, while the second backpropagates their average, so its gradients are half as large.
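To make the scale difference concrete, here is a minimal pure-Python sketch of a one-parameter "model" with hand-computed gradients (the loss functions and values are made up for illustration, not from either repository). Accumulating two backward passes mimics the first way; averaging the losses first mimics the second:

```python
# Toy 1-parameter model with parameter w and two scalar losses:
#   loss_real(w) = w**2     -> d/dw = 2*w
#   loss_fake(w) = 3.0 * w  -> d/dw = 3.0

def grad_real(w):
    return 2.0 * w

def grad_fake(w):
    return 3.0

w = 1.5

# First way: two backward() calls accumulate into the same .grad buffer.
grad = 0.0
grad += grad_real(w)   # like loss_real.backward()
grad += grad_fake(w)   # like loss_fake.backward()
accumulated = grad     # 2*1.5 + 3.0 = 6.0

# Second way: average the losses, then a single backward().
averaged = (grad_real(w) + grad_fake(w)) / 2.0  # 6.0 / 2.0 = 3.0

print(accumulated, averaged)  # 6.0 3.0 -- the second is exactly half the first
```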

You mean they may converge to different solutions, because you said "the scale of gradients would be different"?

Yes.

In the 2nd way, each parameter update step will be smaller, which might help stabilize training, but I'm not sure.
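One way to see why the smaller step might matter: with plain SGD, halving the loss is exactly equivalent to halving the learning rate, as the toy numbers below illustrate (values are assumed for illustration). With adaptive optimizers like Adam the effect is murkier, since the gradient scale is largely normalized away by the second-moment estimate.

```python
# Vanilla SGD update: w -= lr * grad.
# Scaling the gradient by 0.5 (second way) gives the same update
# as keeping the gradient and halving the learning rate.
lr = 0.1
grad = 6.0   # gradient of the summed loss (first way)
w0 = 1.5

w_second_way = w0 - lr * (grad / 2.0)   # halved gradient, original lr
w_halved_lr  = w0 - (lr / 2.0) * grad   # original gradient, halved lr

print(w_second_way, w_halved_lr)  # identical updates
```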