Fusing Additional Inputs/Features for Transfer Learning

Great to see that your data loading pipeline seems to work.

I’m a bit skeptical about your forward implementation.
Currently you are reusing self.cnn.fc: it is applied once to the image batch inside x1 = self.cnn(image), and then a second time to the concatenated tensor in x = self.cnn.fc(x).
Is this your intended workflow (it might make sense), or would you rather concatenate the penultimate resnet18 output for the image tensor with your additional data?

In the latter case, replace the last linear layer with self.cnn.fc = nn.Identity() and define the new classification block as self.fc = nn.Sequential(...).
If you use this approach, you won’t need to pad your additional feature tensor to the image size, and can just concatenate it in the feature dimension (dim=1), as is already done in your code.

Take care with the scaling of the additional features, as we’ve seen issues with this approach in the past when the value ranges are quite different.
E.g. if x1 has values in [0, 1], while x2 has values in [0, 100], the last block might just “focus” on the larger values and treat the smaller ones as noise.
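One simple remedy is to standardize the additional features per feature dimension before concatenating them (in practice the statistics would be computed from the training set, not per batch; the shapes here are just illustrative):

```python
import torch

# hypothetical extra features in a [0, 100] range
x2 = torch.rand(8, 4) * 100.0

# standardize each feature column to zero mean / unit std
mean = x2.mean(dim=0, keepdim=True)
std = x2.std(dim=0, keepdim=True)
x2_scaled = (x2 - mean) / (std + 1e-8)  # eps avoids division by zero
```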

Since your DataLoader now returns three tensors per batch, use:

for ii, (data1, data2, target) in enumerate(train_loader):

and feed both data tensors to the model via model(data1, data2).
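Putting it together, the training loop would look roughly like this (TinyFusionNet and the random tensors are hypothetical stand-ins for your resnet18-based model and real dataset):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class TinyFusionNet(nn.Module):
    # hypothetical stand-in for the resnet18-based fusion model
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))
        self.fc = nn.Linear(16 + 4, 2)

    def forward(self, image, extra):
        return self.fc(torch.cat((self.cnn(image), extra), dim=1))

model = TinyFusionNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# dummy data standing in for the real dataset
images = torch.randn(16, 3, 8, 8)
extras = torch.randn(16, 4)
targets = torch.randint(0, 2, (16,))
train_loader = DataLoader(TensorDataset(images, extras, targets), batch_size=4)

for ii, (data1, data2, target) in enumerate(train_loader):
    optimizer.zero_grad()
    output = model(data1, data2)  # feed both data tensors to the model
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()
```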