My setup can be described as below:
optim_a(model_a.parameters())
optim_b(model_b.parameters())
optim_a.zero_grad()
optim_b.zero_grad()
input_a -> model_a -> output_a -> loss_a = cross_entropy(output_a, label_a)
output_a -> model_b -> output_b -> loss_b = cross_entropy(output_b, label_b)
loss = loss_a + loss_b
loss.backward()
optim_a.step()
optim_b.step()
Does loss_b have an influence on model_a's parameters because I use the output of model_a as the input of model_b? Or is the gradient information lost when I pass the output of the previous model as the input of the next model?
I'm a newbie to PyTorch, looking forward to your help!
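For reference, here is a minimal runnable sketch of the setup above, using hypothetical tiny `nn.Linear` stand-ins for model_a and model_b. It compares model_a's gradients with and without loss_b in the sum, which is one way to check whether loss_b reaches model_a through output_a:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny stand-ins for model_a and model_b
model_a = nn.Linear(4, 3)
model_b = nn.Linear(3, 2)

optim_a = torch.optim.SGD(model_a.parameters(), lr=0.1)
optim_b = torch.optim.SGD(model_b.parameters(), lr=0.1)

input_a = torch.randn(5, 4)
label_a = torch.randint(0, 3, (5,))
label_b = torch.randint(0, 2, (5,))

# Joint backward: loss = loss_a + loss_b
optim_a.zero_grad()
optim_b.zero_grad()
output_a = model_a(input_a)
loss_a = nn.functional.cross_entropy(output_a, label_a)
output_b = model_b(output_a)  # output_a NOT detached: the graph stays connected
loss_b = nn.functional.cross_entropy(output_b, label_b)
(loss_a + loss_b).backward()
grad_joint = model_a.weight.grad.clone()

# Backward through loss_a only, for comparison
optim_a.zero_grad()
optim_b.zero_grad()
output_a = model_a(input_a)
loss_a = nn.functional.cross_entropy(output_a, label_a)
loss_a.backward()
grad_a_only = model_a.weight.grad.clone()

# If loss_b contributed to model_a's gradients, the two differ
print(torch.allclose(grad_joint, grad_a_only))
```

If the two gradients differ, loss_b did flow back through model_b into model_a; calling `output_a.detach()` before feeding model_b would cut that path.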