I’m working on training an Invertible Neural Network (see https://arxiv.org/abs/1808.04730 / https://github.com/VLL-HD/FrEIA), which is essentially a neural network that can be run in both the forward and the reverse direction. For each training batch, I run the network in the forward direction and calculate several losses, then run it in reverse and calculate several more losses.
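For context, here's a toy sketch (not FrEIA's actual API — `ToyCoupling` and its shift network are made up for illustration) of what "invertible" means here: the same module maps `x -> y` in the forward direction and recovers `x` from `y` exactly in reverse:

```python
import torch

class ToyCoupling(torch.nn.Module):
    """A minimal additive coupling layer: invertible by construction."""
    def __init__(self, dim):
        super().__init__()
        # Predicts a shift for the second half from the first half.
        self.net = torch.nn.Linear(dim // 2, dim // 2)

    def forward(self, x, reverse=False):
        a, b = x.chunk(2, dim=-1)
        if not reverse:
            # Forward: add the predicted shift.
            return torch.cat([a, b + self.net(a)], dim=-1)
        # Reverse: subtract the same shift, exactly undoing the forward pass.
        return torch.cat([a, b - self.net(a)], dim=-1)

m = ToyCoupling(4)
x = torch.randn(8, 4)
x_rec = m(m(x), reverse=True)
print(torch.allclose(x, x_rec, atol=1e-6))  # → True
```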
I’m posting to ask when I should call .backward() on my losses.
I want to perform only one parameter update per batch, i.e. not a separate update after each of the forward and reverse steps. Simplified, my current training step looks like:
```python
def training_step(x, y):
    y_hat = f(x)
    forward_loss1 = floss1(y_hat, y)
    forward_loss2 = floss2(y_hat, y)
    l_forward = forward_loss1 + forward_loss2
    l_forward.backward()
    ...
    x_hat = f(y, reverse=True)
    reverse_loss1 = rloss1(x_hat, x)
    reverse_loss2 = rloss2(x_hat, x)
    l_rev = reverse_loss1 + reverse_loss2
    l_rev.backward()
    optimizer.step()
```
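For what it's worth, the two patterns in question can be compared directly on a tiny example (hypothetical parameter `w` and toy losses, not the INN itself): calling .backward() on two losses separately accumulates gradients in `.grad`, which ends up identical to one .backward() on their sum:

```python
import torch

torch.manual_seed(0)
w = torch.randn(3, requires_grad=True)
x = torch.randn(3)

# Pattern A: two separate backward calls — gradients accumulate in w.grad.
loss_a = (w * x).sum()
loss_b = (w ** 2).sum()
loss_a.backward()
loss_b.backward()
grad_separate = w.grad.clone()

# Pattern B: one backward call on the summed loss.
w.grad = None
((w * x).sum() + (w ** 2).sum()).backward()
grad_combined = w.grad.clone()

print(torch.allclose(grad_separate, grad_combined))  # → True
```

So in either case, a single optimizer.step() afterwards applies the combined update (assuming optimizer.zero_grad() is called once per batch, before the forward pass).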
- Is this a correct way to train? Or would it be better to compute `l_total = forward_loss1 + forward_loss2 + reverse_loss1 + reverse_loss2` and then call `l_total.backward()` once?
- Does running the network a second time (in reverse) clear the gradients saved by the earlier `l_forward.backward()` call?
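On that last point, a quick check (using a stand-in `torch.nn.Linear` rather than the INN) suggests that a forward pass by itself leaves `.grad` untouched; only `zero_grad()` (or setting `.grad` to `None`) clears it:

```python
import torch

lin = torch.nn.Linear(2, 2)
x = torch.randn(4, 2)

# First pass: populate gradients.
lin(x).sum().backward()
saved = lin.weight.grad.clone()

# Second forward pass, no backward — .grad should be unchanged.
_ = lin(x)
print(torch.equal(lin.weight.grad, saved))  # → True
```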
Thanks so much!