Loss propagation of outputs of two models

abbasi7 · April 5, 2022, 1:52am

Hello

I am training an unsupervised network. I have two models and two outputs in the supervised part of the network.

output1=model1
Loss1
output2=model2
Loss2

For the unsupervised part, I take the outputs of the two models above, plug them into an equation to estimate output3, and compute its loss, Loss3.

output3= f(output1,output2)
Loss3

Now my question is how to backpropagate loss3, which is based on the above two models.

thanks
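
As a worked sketch of what backpropagating Loss3 means here, write θ1 and θ2 for the parameters of model1 and model2 (notation introduced for this note, not taken from the post) and assume f is differentiable. The chain rule then gives Loss3 one gradient path into each model:

```latex
\frac{\partial\,\mathrm{Loss3}}{\partial \theta_1}
  = \frac{\partial\,\mathrm{Loss3}}{\partial\,\mathrm{output3}}
    \cdot \frac{\partial f}{\partial\,\mathrm{output1}}
    \cdot \frac{\partial\,\mathrm{output1}}{\partial \theta_1},
\qquad
\frac{\partial\,\mathrm{Loss3}}{\partial \theta_2}
  = \frac{\partial\,\mathrm{Loss3}}{\partial\,\mathrm{output3}}
    \cdot \frac{\partial f}{\partial\,\mathrm{output2}}
    \cdot \frac{\partial\,\mathrm{output2}}{\partial \theta_2}
```

An automatic differentiation framework evaluates both paths in a single backward pass over Loss3, provided output1 and output2 are still attached to the computation graph when f is applied.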

mxahan · April 5, 2022, 4:58am

Is your case similar to this?

abbasi7 · April 5, 2022, 5:04pm

Thanks for the response, but I do not think that is quite what I am looking for. I am more interested in how Loss3 (in my case) is backpropagated to the two models. The third output comes from a function (not a model) that takes in the outputs of the two models.

What should I do here:

1. Final Loss = (Loss1 + Loss3) + (Loss2 + Loss3)
2. Final Loss = (Loss1 + Loss3) + Loss2
3. Final Loss = (Loss2 + Loss3) + Loss1
4. Final Loss = Loss1 + Loss2 + Loss3 (using optim of Loss1 or Loss2)

thanks
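
One possible reading of the options above, sketched under the assumption that the thread is about PyTorch: add the three losses once and call `backward()` once. Autograd then routes the Loss3 gradient into both models through output1 and output2, so Loss3 does not have to be counted once per model. The layer sizes, the MSE losses, the combining function `f` (a plain sum here), the random data, and the single SGD optimizer over both parameter sets are all placeholders, not details from the posts.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the two models in the thread.
model1 = nn.Linear(4, 1)
model2 = nn.Linear(4, 1)

def f(out1, out2):
    # Placeholder for the poster's formula; any differentiable expression works.
    return out1 + out2

criterion = nn.MSELoss()
# One optimizer over both parameter sets (an assumption; two optimizers also work).
optimizer = torch.optim.SGD(
    list(model1.parameters()) + list(model2.parameters()), lr=0.01
)

# Random placeholder batch and targets.
x = torch.randn(8, 4)
target1 = torch.randn(8, 1)
target2 = torch.randn(8, 1)
target3 = torch.randn(8, 1)

output1 = model1(x)
output2 = model2(x)
output3 = f(output1, output2)  # still attached to both models' graphs

loss1 = criterion(output1, target1)
loss2 = criterion(output2, target2)
loss3 = criterion(output3, target3)
total_loss = loss1 + loss2 + loss3

optimizer.zero_grad()
total_loss.backward()  # loss3 contributes gradients to model1 and model2
grad1 = model1.weight.grad.clone()
grad2 = model2.weight.grad.clone()
optimizer.step()
```

In this sketch options 2, 3, and 4 are the same sum written in a different order, so they give the same gradients. Option 1 counts Loss3 twice, which amounts to weighting it by 2 relative to Loss1 and Loss2.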

mxahan · April 5, 2022, 6:29pm

Oh, thanks for the clarification.

Can your problem be thought of from a metric learning perspective? For example, let’s assume there is a text and a corresponding image. We have a CNN model to encode the images and a language model (LM) for the texts, and our goal may be to maximize the models’ agreement in the final layers.

In that case, assume a Euclidean distance (L2) loss between the two models’ outputs. The CNN model will see the LM output as a constant target and update its weights as if it is trying to get close to the LM outputs. The reverse is true for the LM too.

This generalizes to any other loss: instead of providing a fixed target, we are generating the target from another model. The baseline model has no idea where the output is coming from.
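
The “constant target” view can be made explicit in code, again assuming PyTorch. Detaching one model’s output turns it into a fixed target for the other model, so the distance loss sends no gradient into the detached side. The two linear encoders, their sizes, and the random inputs are hypothetical stand-ins for the CNN and the language model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-ins for the image encoder (CNN) and the text encoder (LM).
image_encoder = nn.Linear(16, 8)
text_encoder = nn.Linear(32, 8)

# Random placeholder inputs.
images = torch.randn(4, 16)
texts = torch.randn(4, 32)

image_emb = image_encoder(images)
text_emb = text_encoder(texts)

# The text embedding is detached, so it acts as a constant target:
# only the image encoder receives gradients from this loss.
loss_image = F.mse_loss(image_emb, text_emb.detach())
loss_image.backward()

image_has_grad = image_encoder.weight.grad is not None
text_has_grad = text_encoder.weight.grad is not None
```

Dropping the `.detach()` call lets the same loss send gradients into both encoders in one backward pass, with each encoder’s gradient computed as if the other output were held fixed.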

abbasi7 · April 5, 2022, 7:49pm

Thanks for your response. In my case the target is not generated from the model. The targets for the supervised part of the code are fixed and known.

For the unsupervised part (where we do not have a target to estimate the loss directly, hence unsupervised), we plug the outputs of model 1 and model 2 into a formula. If output1 and output2 are correct, the formula should give the correct output3, for which I do have a target to estimate the loss.

What I am actually estimating is the output1 that satisfies the loss for output3. But according to the formula, output3 depends on both output1 and output2. I hope that clarifies it.

mxahan · April 5, 2022, 8:07pm

It’s a bit confusing for me. Can you please provide more details about the optimizers, loss functions, and models?