Hello, everyone,
Since I have to deal with some variable-length inputs, I know the following code works (input1 and input2 have the same feature dimension but different lengths):
loss1 = LossFunction(model(input1), target1)
loss1.backward()
loss2 = LossFunction(model(input2), target2)
loss2.backward()
However, I want to do some “batch-like” operations and reduce the number of backward calls, so the code looks like the following:
loss1 = LossFunction(model(input1), target1)
loss2 = LossFunction(model(input2), target2)
loss = loss1 + loss2
loss.backward()
Although no error is reported, I am worried this may be problematic. My question is whether the intermediate outputs saved in the forward pass for input1 will be overwritten by the ones for input2, which would mean the gradients are computed only against input2 even though the loss is computed over both input1 and input2.
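For reference, here is a minimal script I could use to compare the two schemes, assuming a toy nn.Linear model and MSE loss in place of my real model and loss function:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins (assumptions) for the real model, loss, and variable-length inputs.
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()

input1, target1 = torch.randn(3, 4), torch.randn(3, 1)  # "length" 3
input2, target2 = torch.randn(5, 4), torch.randn(5, 1)  # "length" 5

# Scheme A: two separate backward calls; gradients accumulate in .grad.
model.zero_grad()
loss_fn(model(input1), target1).backward()
loss_fn(model(input2), target2).backward()
grads_a = [p.grad.clone() for p in model.parameters()]

# Scheme B: sum the losses, then a single backward call.
model.zero_grad()
loss = loss_fn(model(input1), target1) + loss_fn(model(input2), target2)
loss.backward()
grads_b = [p.grad.clone() for p in model.parameters()]

# Each forward pass builds its own autograd graph, so the activations saved
# for input1 are not overwritten by input2; the gradients should match.
print(all(torch.allclose(a, b) for a, b in zip(grads_a, grads_b)))
```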
Thanks,
Shuai