Ignoring Channels in Loss for Transfer Learning

I would like to transfer what a model learns on one vision problem to another, similar problem (or perhaps I should say combine the two problems). For the first problem, I have segmentations for the liver and spleen (Dataset 1); for the second, I have segmentations only for the kidneys (Dataset 2).

I was planning on having 4 output channels: Liver, Spleen, Right_Kidney, Left_Kidney. The idea is to first train a model on Dataset 1 while ignoring the kidney channels in the loss, and then continue training the same model on Dataset 2, where the kidney labels are available. (Knowing where the liver and spleen are should help locate the kidneys spatially, so the added data should ideally improve kidney segmentation.)
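To make concrete what I mean by ignoring channels, here is a minimal sketch. It assumes independent sigmoid outputs per channel with a binary cross-entropy loss, which is my assumption and not something I have settled on. The names `masked_bce`, `DATASET1_MASK` and `DATASET2_MASK` are placeholders I made up, and NumPy is used only to illustrate the arithmetic; a real training loop would do the same thing in the framework's own tensors.

```python
import numpy as np

# Channel order used throughout this post.
CHANNELS = ["Liver", "Spleen", "Right_Kidney", "Left_Kidney"]

# 1 = channel has labels in that dataset, 0 = ignore it in the loss.
DATASET1_MASK = [1, 1, 0, 0]  # liver and spleen only
DATASET2_MASK = [0, 0, 1, 1]  # kidneys only


def masked_bce(probs, targets, channel_mask, eps=1e-7):
    """Mean binary cross-entropy over the labelled channels only.

    probs, targets: arrays of shape (batch, channels, height, width)
    channel_mask:   sequence of length `channels` with entries 0 or 1
    """
    # Clip to avoid log(0).
    probs = np.clip(probs, eps, 1.0 - eps)
    # Per-pixel, per-channel binary cross-entropy.
    bce = -(targets * np.log(probs) + (1.0 - targets) * np.log(1.0 - probs))
    # Broadcast the channel mask over batch and spatial dimensions.
    mask = np.asarray(channel_mask, dtype=bce.dtype).reshape(1, -1, 1, 1)
    mask = np.broadcast_to(mask, bce.shape)
    # Average only over elements that belong to labelled channels.
    return (bce * mask).sum() / max(mask.sum(), 1.0)
```

With this, a batch from Dataset 1 would call `masked_bce(probs, targets, DATASET1_MASK)`, so whatever the network predicts on the kidney channels contributes nothing to the loss (and, in an autodiff framework, nothing to the gradient). A batch from Dataset 2 would use `DATASET2_MASK` instead.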

If anyone has a reference on combining problems like this, or on selectively ignoring channels in the loss function, please let me know!