Hi guys,
I have an encoder (E) and a generator (G). One of the layers (Z) in between is stochastic.
The losses for the two modules are different. How can I update their weights? Do you think `retain_graph=True`
is relevant here?
Do you think the following approach will work (rough sketch after the list)?

1. Have two optimizers, one each for E and G.
2. Compute `Z` and `Loss_G`. Use `Loss_G.backward()` to set `G.grad`. At this point I do not yet update the G parameters.
3. Set `E.grad = 0` using `E_optimizer.zero_grad()`. Use `E_loss.backward()` to get the correct `E.grad`.
4. Then update both E and G weights simultaneously.
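
Roughly, this is the flow I have in mind. A minimal sketch, where the module definitions, sizes, and both losses are just placeholders for my real ones:

```python
import torch
import torch.nn as nn

# Placeholder stand-ins for my actual modules; sizes and losses are made up.
E = nn.Linear(10, 4)   # encoder
G = nn.Linear(4, 10)   # generator

E_optimizer = torch.optim.Adam(E.parameters(), lr=1e-3)
G_optimizer = torch.optim.Adam(G.parameters(), lr=1e-3)

x = torch.randn(8, 10)

# Forward pass; Z is stochastic (reparameterization as a placeholder).
mu = E(x)
Z = mu + torch.randn_like(mu)
x_hat = G(Z)

# Step 2: backprop the generator loss. This fills G.grad (and also E.grad,
# which is why I zero E.grad again in step 3). No optimizer step yet.
G_optimizer.zero_grad()
Loss_G = ((x_hat - x) ** 2).mean()            # placeholder generator loss
Loss_G.backward(retain_graph=True)            # keep the graph for the second backward

# Step 3: reset E.grad, then backprop the encoder loss to get the correct E.grad.
E_optimizer.zero_grad()
E_loss = ((x_hat - x) ** 2).mean() + mu.pow(2).mean()   # placeholder encoder loss
E_loss.backward()   # problem: this also accumulates into G.grad again

# Step 4: update both simultaneously.
E_optimizer.step()
G_optimizer.step()
```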
In step 3, how can I ensure that `G.grad` is not touched while doing `E_loss.backward()`? Can we exclude nodes like that during a particular backward pass?
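
One idea I had (continuing the sketch above) is to avoid `.backward()` for the encoder loss and instead use `torch.autograd.grad` restricted to E's parameters, so that nothing else's `.grad` is written. Not sure if this is the idiomatic way:

```python
# Continuing the sketch above: instead of calling E_loss.backward() in step 3,
# compute gradients only w.r.t. E's parameters, so G.grad (already set by
# Loss_G.backward()) is left untouched.
E_params = list(E.parameters())
E_grads = torch.autograd.grad(E_loss, E_params)

# Assign them manually so E_optimizer.step() still works as usual.
for p, g in zip(E_params, E_grads):
    p.grad = g
```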
Thanks