I have two classification models being trained back to back. When I call backward on the second model's loss, I get an error asking me to pass retain_graph=True to the previous backward call, which does not make sense to me.
The training scheme is below:

model1.train()
model2.train()
for i, data in enumerate(dataset):
    for _ in range(100):
        x = model1(data[0])
        loss1 = loss_fn1(x, data[1])
        loss1.backward()
        opt1.step()
        opt1.zero_grad()

        y = model2(data[2], x)
        loss2 = loss_fn2(y, data[3])
        loss2.backward()
        opt2.step()
        opt2.zero_grad()

The loss2.backward() call is where the error occurs. How can I overcome this?

y depends on x, the output of model1.
When you call loss2.backward(), Autograd computes the gradients by walking the computation graph.
Since model2 and model1 are connected through x, Autograd also tries to compute the gradients for the parameters of model1 again.
To save memory, the intermediate activations in model1 (which are needed to compute those gradients) are freed after the loss1.backward() call, which is why the second backward pass fails.
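A minimal sketch of this behavior, using toy tensors instead of real models (here x just plays the role of model1's output):

```python
import torch

a = torch.randn(3, requires_grad=True)
x = a * a                 # stands in for model1's output
loss1 = x.sum()
loss1.backward()          # frees the intermediate buffers of this graph

loss2 = (x * x).sum()     # still connected to `a` through `x`
try:
    loss2.backward()      # walks the already-freed part of the graph
except RuntimeError as err:
    print("RuntimeError:", err)
```

The second backward raises the RuntimeError that suggests retain_graph=True, exactly as in your training loop.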

If you don’t want the gradients w.r.t. loss2 in model1, you should detach x:

y = model2(data[2], x.detach())
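Applied to your loop, the fix could look like the sketch below. The models and data are hypothetical stand-ins (plain nn.Linear layers, with data[2] and x concatenated into one input, since your real model2 signature is unknown):

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real models and data.
model1 = nn.Linear(4, 3)
model2 = nn.Linear(3 + 2, 1)
opt1 = torch.optim.SGD(model1.parameters(), lr=0.1)
opt2 = torch.optim.SGD(model2.parameters(), lr=0.1)
loss_fn1 = nn.MSELoss()
loss_fn2 = nn.MSELoss()
data = (torch.randn(8, 4), torch.randn(8, 3),
        torch.randn(8, 2), torch.randn(8, 1))

model1.train()
model2.train()
for _ in range(5):
    x = model1(data[0])
    loss1 = loss_fn1(x, data[1])
    loss1.backward()
    opt1.step()
    opt1.zero_grad()

    # detach() cuts the graph here, so loss2.backward() never
    # reaches back into model1's (already freed) graph.
    y = model2(torch.cat([data[2], x.detach()], dim=1))
    loss2 = loss_fn2(y, data[3])
    loss2.backward()
    opt2.step()
    opt2.zero_grad()
```

With detach(), model1 is updated only by loss1 and model2 only by loss2.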

Otherwise, set retain_graph=True in the loss1.backward() call so that the intermediate activations are kept alive for the second backward pass.
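If you do want loss2's gradients to flow into model1, the retain_graph route could look like this sketch (same hypothetical nn.Linear stand-ins as above; both backward calls run before the optimizer steps, because stepping opt1 in between can modify parameters in place that the retained graph still needs):

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real models and data.
model1 = nn.Linear(4, 3)
model2 = nn.Linear(3 + 2, 1)
opt1 = torch.optim.SGD(model1.parameters(), lr=0.1)
opt2 = torch.optim.SGD(model2.parameters(), lr=0.1)
loss_fn1 = nn.MSELoss()
loss_fn2 = nn.MSELoss()
data = (torch.randn(8, 4), torch.randn(8, 3),
        torch.randn(8, 2), torch.randn(8, 1))

x = model1(data[0])
loss1 = loss_fn1(x, data[1])
loss1.backward(retain_graph=True)   # keep the intermediates alive

y = model2(torch.cat([data[2], x], dim=1))
loss2 = loss_fn2(y, data[3])
loss2.backward()                    # also accumulates gradients into model1

opt1.step()                         # applies grads from loss1 + loss2
opt1.zero_grad()
opt2.step()
opt2.zero_grad()
```

Note that model1's parameters are then updated by the sum of both gradients, which may or may not be what you want.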