Thanks for the response, but I do not think this is quite what I am looking for. I am more interested in how loss3 (in my case) is backpropagated to the two models. The third output comes from a function (not a model) that takes the outputs of the two models as input.

What should I do here:

1. Final Loss = (Loss1 + Loss3) + (Loss2 + Loss3)
2. Final Loss = (Loss1 + Loss3) + Loss2
3. Final Loss = (Loss2 + Loss3) + Loss1
4. Final Loss = Loss1 + Loss2 + Loss3 (using optim of Loss1 or Loss2)

Can your problem be thought of from a metric learning perspective? E.g., let's assume there is a text and a corresponding image. We have a CNN model to encode the images and a language model (LM) to encode the texts, and our goal is, say, to maximize the models' agreement in their final layers.

In that case, assume a Euclidean distance (L2) loss between the two models' outputs. The CNN model will see the language model's output as a constant target and update its weights as if it were trying to get close to the LM output. The reverse is true for the LM as well.

This generalizes to any other loss: instead of providing a fixed target, we generate the target from another model. The baseline model has no idea where that target is coming from.
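
To make the "constant target" point concrete, here is a minimal PyTorch sketch. The two `nn.Linear` layers are hypothetical stand-ins for the CNN and the language model, and the shapes and the `l2_loss` helper are made up for illustration. It checks that the gradient the image encoder receives from the joint L2 loss matches the gradient it would receive if the text encoder's output were detached and treated as a fixed target.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the image encoder (CNN) and the language model.
image_encoder = nn.Linear(8, 4)
text_encoder = nn.Linear(6, 4)

images = torch.randn(5, 8)  # made-up batch of image features
texts = torch.randn(5, 6)   # made-up batch of text features


def l2_loss(a, b):
    # Mean squared Euclidean distance between paired embeddings.
    return ((a - b) ** 2).sum(dim=1).mean()


# Joint loss: the graph reaches both encoders.
joint = l2_loss(image_encoder(images), text_encoder(texts))
grad_joint = torch.autograd.grad(joint, image_encoder.weight)[0]

# Same loss, but the text embedding is detached, i.e. a constant target.
fixed = l2_loss(image_encoder(images), text_encoder(texts).detach())
grad_fixed = torch.autograd.grad(fixed, image_encoder.weight)[0]

# From the image encoder's point of view the two cases are the same.
same_gradient = torch.allclose(grad_joint, grad_fixed)
print(same_gradient)
```

The only difference between the two cases is whether the text encoder also receives a gradient: with the joint loss it does, with `detach()` it does not.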

Thanks for your response. In my case the target is not generated by a model. The targets for the supervised part of the code are fixed and known.

For the unsupervised part (where we do not have a target to estimate the loss directly, hence unsupervised), we plug the outputs (from model 1 and model 2) into a formula. If output1 and output2 are correct, the formula should give me the correct output3, for which I have a target to estimate the loss.

Effectively, I am estimating an output1 that satisfies the loss for output3. But according to the formula, output3 depends on both output1 and output2. I hope that clarifies things.
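
For reference, a minimal PyTorch sketch of this setup, under loud assumptions: the two `nn.Linear` models, the random data, and the `formula` function (a plain elementwise product) are all placeholders, not the actual models or formula. It shows that a loss computed on output3 sends gradients to both models through the formula, and that a single `backward()` on the summed loss gives model 1 the gradient of loss1 plus the gradient of loss3.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder models; the real ones can be anything differentiable.
model1 = nn.Linear(3, 1)
model2 = nn.Linear(3, 1)

# One optimizer over both parameter sets (an assumption for this sketch).
optimizer = torch.optim.SGD(
    list(model1.parameters()) + list(model2.parameters()), lr=0.01
)

x = torch.randn(16, 3)
target1 = torch.randn(16, 1)  # known target for output1
target2 = torch.randn(16, 1)  # known target for output2
target3 = torch.randn(16, 1)  # known target for output3


def formula(out1, out2):
    # Hypothetical stand-in for the real formula.
    return out1 * out2


criterion = nn.MSELoss()

out1 = model1(x)
out2 = model2(x)
out3 = formula(out1, out2)  # a function of both model outputs

loss1 = criterion(out1, target1)
loss2 = criterion(out2, target2)
loss3 = criterion(out3, target3)

# Gradients of the individual losses, computed only for inspection.
grad_loss1_m1 = torch.autograd.grad(loss1, model1.weight, retain_graph=True)[0]
grad_loss3_m1 = torch.autograd.grad(loss3, model1.weight, retain_graph=True)[0]
grad_loss3_m2 = torch.autograd.grad(loss3, model2.weight, retain_graph=True)[0]

# One backward pass on the sum; autograd routes each term to the
# parameters it depends on.
total = loss1 + loss2 + loss3
optimizer.zero_grad()
total.backward()
optimizer.step()

# model1 receives the loss1 gradient plus the loss3 gradient.
print(torch.allclose(model1.weight.grad, grad_loss1_m1 + grad_loss3_m1, atol=1e-6))
```

Algebraically, options 2 and 3 from the list are the same sum as option 4, while option 1 equals Loss1 + Loss2 + 2·Loss3, which simply doubles the weight of Loss3. If each model has its own optimizer, calling `step()` on both after the single `backward()` should have the same effect as the shared optimizer used here.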