Backprop through generator to update latent


I have a GAN-generator setup where I want to compute the loss of a generated image w.r.t. the true image and then backpropagate that loss to update the input vector. The relevant code would look something like this:

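for epoch in range(100):
    for true_img, latent in dataset:  # latent is assumed to have requires_grad=True
        gen_img = generator(latent)
        loss = loss_func(true_img, gen_img)
        loss.backward()
        # gradient step on the latent only; the generator weights are left untouched
        with torch.no_grad():
            latent.add_(latent.grad, alpha=-learning_rate)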

My question is: does it matter whether I zero out the gradients of the generator? My guess is that gradients are accumulated, so at every iteration the accumulated gradients in the generator are used in the chain rule to compute the derivative w.r.t. the latent, and the latent would therefore be updated incorrectly.

Please correct me if this guess is incorrect.


I don’t think that’s the case. The gradients accumulated in the generator parameters’ .grad attributes won’t be included in the next gradient calculation unless you manually add them to the computation graph. To verify this, you could compare a run with and without gradient accumulation using the same inputs, making sure the model is in eval() mode to disable potentially non-deterministic outputs from e.g. dropout layers.
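The comparison suggested above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the tiny MLP standing in for the generator, the tensor shapes, the MSE loss, and the helper name latent_grad are all hypothetical and not taken from the original setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real generator: a tiny MLP (assumption).
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
generator.eval()  # disable dropout-style non-determinism, as suggested above
loss_func = nn.MSELoss()  # assumed loss, for illustration only

true_img = torch.randn(1, 4)
latent_init = torch.randn(1, 8)


def latent_grad(zero_generator_grads, steps=3):
    """Call backward `steps` times and return the latent gradient of the last step."""
    generator.zero_grad()  # start both experiments from a clean state
    grad = None
    for _ in range(steps):
        if zero_generator_grads:
            generator.zero_grad()
        # Fresh leaf tensor with identical values in every step and every run.
        latent = latent_init.clone().requires_grad_(True)
        loss = loss_func(true_img, generator(latent))
        loss.backward()
        grad = latent.grad.clone()
    return grad


grad_zeroed = latent_grad(zero_generator_grads=True)
grad_accumulated = latent_grad(zero_generator_grads=False)

# The generator's .grad attributes did accumulate in the second run ...
generator_grad_total = sum(p.grad.abs().sum().item() for p in generator.parameters())
# ... but the gradient w.r.t. the latent is the same in both runs.
same = torch.allclose(grad_zeroed, grad_accumulated)
print(same, generator_grad_total > 0)
```

In this sketch the latent gradient matches in both runs, because backward() uses the parameter values and not the contents of their .grad attributes. Zeroing the generator gradients would still matter if an optimizer were stepping the generator itself.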

@ptrblck Thanks for the response! So if I understand correctly, the gradients in the generator’s computation graph are cleared automatically before every iteration?

I also came across this (link):

PyTorch uses a dynamic graph. That means that the computational graph is built up dynamically, immediately after we declare variables. This graph is thus rebuilt after each iteration of training.

Is this the reason? I am planning to run the two experiments soon, but just for the sake of understanding, it would be good to know how exactly the computation graphs and gradients work :)
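For what it’s worth, the distinction between the graph and the stored gradients can be seen in a small sketch like the one below. The names w and x are purely illustrative: w plays the role of a generator parameter and x the role of an input. Each forward pass builds a new graph, while the .grad attribute of a leaf tensor is separate storage that keeps accumulating until it is reset explicitly.

```python
import torch

w = torch.tensor([2.0], requires_grad=True)  # illustrative "parameter" (assumption)
x = torch.tensor([3.0])                      # illustrative "input" (assumption)

# Two forward passes build two separate graphs.
out1 = (w * x).sum()
out2 = (w * x).sum()
distinct_graphs = out1.grad_fn is not out2.grad_fn

out1.backward()
first = w.grad.clone()        # d(w * x)/dw = x = 3

out2.backward()
accumulated = w.grad.clone()  # 3 + 3 = 6: .grad is summed, not overwritten

# Nothing resets .grad automatically; it has to be done by hand
# (or via optimizer.zero_grad() / module.zero_grad()).
w.grad = None

print(distinct_graphs, first.item(), accumulated.item())  # True 3.0 6.0
```

So the graph is indeed rebuilt on every forward pass, but the stored gradients are not cleared by that. They just don’t feed into the next backward pass, since backward only reads the tensor values recorded in the graph.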