I did something like this:
```python
data = data.to(device)
label = label.to(device)

# forward + backward for model1
out1 = model1(data, label)
loss1 = model1.loss
loss1.backward()

# forward + backward for model2 on the same tensors
out2 = model2(data, label)
loss2 = model2.loss
loss2.backward()
```
model1 and model2 are identical models. With the code above, model2 did not learn correctly at all.
But if I copy the batch to the GPU twice and train model1 and model2 on the two copies respectively, both models train fine.
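For clarity, here is a minimal self-contained sketch of that workaround. The model (`TinyNet`) and tensor names are hypothetical stand-ins for my actual setup; the point is only that each model gets its own copy of the batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TinyNet(nn.Module):
    """Illustrative model that stores its loss as an attribute, like mine."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x, y):
        out = self.fc(x)
        self.loss = nn.functional.cross_entropy(out, y)
        return out

device = "cuda" if torch.cuda.is_available() else "cpu"
model1, model2 = TinyNet().to(device), TinyNet().to(device)

batch = torch.randn(8, 4)
target = torch.randint(0, 2, (8,))

# Workaround: two independent device copies (copy=True forces a copy
# even when source and destination are the same device).
data1, label1 = batch.to(device, copy=True), target.to(device, copy=True)
data2, label2 = batch.to(device, copy=True), target.to(device, copy=True)

out1 = model1(data1, label1)
model1.loss.backward()

out2 = model2(data2, label2)
model2.loss.backward()

# Both models now have gradients computed from independent input tensors.
assert all(p.grad is not None for p in model1.parameters())
assert all(p.grad is not None for p in model2.parameters())
```

With separate copies, nothing one model does to its inputs (e.g. an in-place modification inside `forward`) can leak into the other model's forward pass.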
I thought that once the backward pass for loss1 finished, its computational graph would be freed and would have no effect on subsequent computations.
Can anyone explain why sharing the same device tensors breaks model2's training?