Combining two input images and tabular data

Hi everyone,

I’m a beginner with PyTorch and doing my first DL project.

I have created my own dataset, which is made of a collection of:

  • one image
  • another image
  • x-coordinate location
  • y-coordinate location

I want to “combine” those 4 data points into one, so that I can feed it to my neural net. In the end I want to predict x, y locations as well, so this is a multi-output regression problem, if I’m correct.

What would be the approach to combine the 4 data points into one input that I can feed to a CNN?


To “combine” values you could use either torch.cat or torch.stack, which will concatenate the subtensors along an existing dimension or along a new one, respectively.
I’m not sure if this is exactly what you are looking for.
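
Here is a minimal sketch of the difference between the two calls. The shapes (batch of 8, 3-channel 64x64 images) and the names img_a, img_b, x, y are assumptions made up for illustration, not taken from your dataset:

```python
import torch

# Hypothetical shapes: two RGB images of 64x64 and one (x, y) pair per sample.
batch_size = 8
img_a = torch.randn(batch_size, 3, 64, 64)
img_b = torch.randn(batch_size, 3, 64, 64)
x = torch.rand(batch_size)
y = torch.rand(batch_size)

# torch.cat joins tensors along an EXISTING dimension.
# Concatenating on the channel dimension gives one 6-channel image per sample.
images_cat = torch.cat([img_a, img_b], dim=1)
print(images_cat.shape)  # torch.Size([8, 6, 64, 64])

# torch.stack inserts a NEW dimension and places the tensors along it.
# Here each sample keeps the two images as separate entries.
images_stacked = torch.stack([img_a, img_b], dim=1)
print(images_stacked.shape)  # torch.Size([8, 2, 3, 64, 64])

# The two scalar coordinates can be stacked into one [batch_size, 2] tensor.
coords = torch.stack([x, y], dim=1)
print(coords.shape)  # torch.Size([8, 2])
```

The channel-concatenated tensor could be passed to an nn.Conv2d with in_channels=6, while the coordinate tensor has a different shape and would need to be handled separately (for example as the regression target). Whether that fits depends on how you plan to use the coordinates.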