I am implementing segmentation as the pipeline below. How can I concatenate two U-Nets into the same pipeline and train them together (without training the two models separately)? Training and inference use the same pipeline.
For the loss function: I have loss_1 for Unet1, loss_2 for Unet2, and the combined loss = loss_1 + loss_2.
I’m not sure that I understand your use case, but perhaps you are asking for
something like the following:
Let me assume that the input to your first U-Net is a three-channel (RGB) image,
and that its output is a single “probability” per pixel, so one channel. You want
to take the output of the first U-Net and use it, together with another RGB image
(or perhaps the same RGB image), as input to the second U-Net. Lastly, let me
assume that you want the “predictions” output by the second U-Net to be a single
value per pixel.
Then your first U-Net should have three input channels and one output channel.
Your second U-Net, however, should have four input channels (and one output
channel). If your second U-Net is based on a three-input-channel U-Net, you will
have to modify its first convolutional layer to use four input channels, rather than
three.
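If you do need that modification, here is a minimal sketch of swapping out the first convolution. (The `UNet` constructor and the `first_conv` attribute name are hypothetical placeholders; your U-Net implementation will expose its first layer under some other name.)

```python
import torch.nn as nn

# Hypothetical placeholders: `UNet` and its `first_conv` attribute stand in
# for whatever your implementation actually provides.
unet2 = UNet(in_channels=3, out_channels=1)   # stock three-channel U-Net

old = unet2.first_conv                        # original nn.Conv2d(3, ..., ...)
unet2.first_conv = nn.Conv2d(
    in_channels=4,                            # 3 RGB channels + 1 probability channel
    out_channels=old.out_channels,            # keep everything else the same
    kernel_size=old.kernel_size,
    stride=old.stride,
    padding=old.padding,
    bias=old.bias is not None,
)
```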
Let output1 have shape [nBatch, 1, height, width] and be the output of
the first U-Net. Let the second RGB image, RGB2, have shape [nBatch, 3, height, width].
Then torch.cat((output1, RGB2), dim=1) will have four channels (thus
shape [nBatch, 4, height, width]) and be suitable as input to the second
four-input-channel U-Net.
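For concreteness, here is a sketch of how you might wrap the two U-Nets in a single nn.Module, so that training and inference share one pipeline. (`TwoStagePipeline` is just an illustrative name; `unet1` and `unet2` are your two models.)

```python
import torch
import torch.nn as nn

class TwoStagePipeline(nn.Module):
    """Chains two U-Nets; the second sees the first's output plus an RGB image."""

    def __init__(self, unet1, unet2):
        super().__init__()
        self.unet1 = unet1   # three input channels, one output channel
        self.unet2 = unet2   # four input channels, one output channel

    def forward(self, rgb1, rgb2):
        output1 = self.unet1(rgb1)                   # [nBatch, 1, height, width]
        input2 = torch.cat((output1, rgb2), dim=1)   # [nBatch, 4, height, width]
        output2 = self.unet2(input2)                 # [nBatch, 1, height, width]
        return output1, output2
```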
You can certainly backpropagate using a combined loss that is the sum of two
terms, one that depends only on the output of the first U-Net and a second that
depends on the output of the second U-Net. Doing so will populate the gradients
of the parameters of the two U-Nets in the way I believe you want. You can then
run an optimizer that contains both U-Nets’ parameters in its parameter list.
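As a sketch of what that training step could look like (the dataloader, targets, and the loss_1 / loss_2 criteria are placeholders for your own):

```python
import torch

model = TwoStagePipeline(unet1, unet2)

# One optimizer over the parameters of both U-Nets: they are submodules
# of `model`, so model.parameters() includes both parameter sets.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for rgb1, rgb2, target1, target2 in dataloader:    # placeholder dataset
    optimizer.zero_grad()
    output1, output2 = model(rgb1, rgb2)
    loss = loss_1(output1, target1) + loss_2(output2, target2)   # combined loss
    loss.backward()    # gradients flow back through both U-Nets
    optimizer.step()
```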