Joint Model Training

I have two PyTorch models that are trained separately. My goal is to train them jointly, instead of fully training the first model, saving its output, and then loading it into the second model.

Each model has its own loss. The first model's loss is somewhat complicated because it is an average of three different losses (three joint tasks; feel free to ignore this detail). So I am not sure how I should approach this: is it better to drop the first model's losses and only backpropagate the second's? If not, how can one backpropagate from two losses in one model?
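For context, here is a minimal sketch of what I understand joint backpropagation from two losses to look like: weight and sum the losses, then call `backward()` once so gradients flow through both models. The model shapes, weights, and targets below are placeholders, not my actual setup:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder models standing in for my two networks
model1 = nn.Linear(10, 5)
model2 = nn.Linear(5, 1)

# One optimizer over the parameters of both models
optimizer = torch.optim.Adam(
    list(model1.parameters()) + list(model2.parameters()), lr=1e-3
)

x = torch.randn(8, 10)
target1 = torch.randn(8, 5)   # supervision for the first model
target2 = torch.randn(8, 1)   # supervision for the second model

out1 = model1(x)
out2 = model2(out1)           # the second model consumes the first's output

loss1 = F.mse_loss(out1, target1)
loss2 = F.mse_loss(out2, target2)

# Weighted sum; a single backward() propagates through both models,
# since out1 stays in the computation graph of out2
total_loss = 0.5 * loss1 + 0.5 * loss2
optimizer.zero_grad()
total_loss.backward()
optimizer.step()
```

With this setup the first model receives gradients from both its own loss and the second model's loss, so dropping `loss1` entirely would just mean setting its weight to zero.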

I am also aware that I could use learning-rate control to speed up the first model's learning while slowing down the second's at the beginning and then switch, as well as forcing ground truth (teacher forcing) in the early epochs. I'd appreciate your help!
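As for the learning-rate idea, my understanding is that PyTorch parameter groups would let me give each model its own rate and then adjust the rates over the epochs. A sketch of what I mean (again with placeholder models and rates):

```python
import torch
import torch.nn as nn

model1 = nn.Linear(10, 5)
model2 = nn.Linear(5, 1)

# Separate parameter groups: fast first model, slow second model at the start
optimizer = torch.optim.Adam([
    {"params": model1.parameters(), "lr": 1e-3},
    {"params": model2.parameters(), "lr": 1e-5},
])

# Later in training, swap the emphasis by editing the groups in place
optimizer.param_groups[0]["lr"] = 1e-5
optimizer.param_groups[1]["lr"] = 1e-3
```

The schedule itself (when to swap, whether to ramp gradually) is exactly the part I'm unsure about.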