Multi-source, multi-target data loading and batching


I would like to form batches from different tensors representing different modalities, i.e. some images with NCHW ordering and some sentences represented as a tensor of shape (n_sentences, n_max_length). The sentences will at some point be looked up in an embedding layer. Both tensors will be used during the forward pass as different entry points.

The question is: can I use DataLoader with a custom collate_fn for this, or is DataLoader meant solely for a single source of information?

Do you want to load the images simultaneously with the sentences?
If so, you could implement your own Dataset, which returns one image and the corresponding sentence for each iteration.
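A minimal sketch of such a paired Dataset, in case it helps. The class name ImageSentenceDataset, the toy shapes, and the random data are all assumptions made for illustration, not something from your setup:

```python
import torch
from torch.utils.data import Dataset


class ImageSentenceDataset(Dataset):
    """Hypothetical paired dataset: one image and its sentence per index."""

    def __init__(self, images, sentences):
        # images: list of float tensors shaped (C, H, W)
        # sentences: list of 1-D integer tensors of token ids; lengths may differ
        assert len(images) == len(sentences)
        self.images = images
        self.sentences = sentences

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Return both modalities of the same sample together.
        return self.images[idx], self.sentences[idx]


# Toy data standing in for real images and tokenized sentences (assumed).
images = [torch.rand(3, 8, 8) for _ in range(4)]
sentences = [torch.randint(1, 100, (n,)) for n in (5, 3, 7, 2)]
dataset = ImageSentenceDataset(images, sentences)
image, sentence = dataset[1]
```

Each item is still a single unpadded sample here; turning several of them into one batch is a separate step.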

Right now, yes, at the same time. I already implemented a Dataset which returns a namedtuple containing the image and the sentence, but will this be enough for DataLoader to cope with it? In my first try, it seemed to pack the sentences into object-typed tensors, since they were not padded and had different lengths. But padding has to happen at the point where the batches are formed, which is also handled by DataLoader. Thanks.

OK, it's all about collate_fn(), which did not know about namedtuples. I changed the sample to a dict: the image tensor is OK, but the integer sentence arrays come out messed up/transposed. I think it would be better to just write a collate_fn suited to the dataset.
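For reference, a rough sketch of what such a dataset-specific collate_fn could look like. It assumes each sample is a dict with an "image" tensor of shape (C, H, W) and a "sentence" given as a 1-D integer tensor; the function name collate_image_sentence, the "length" key, and the padding index 0 are my own choices for illustration. The transposition is probably the default collate treating a plain Python list as a sequence and collating it position by position, so passing the sentences as tensors and padding them explicitly avoids that:

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader


def collate_image_sentence(batch, pad_idx=0):
    """Collate a list of {"image", "sentence"} dicts into one batch dict."""
    # All images share a shape, so stacking gives (N, C, H, W).
    images = torch.stack([sample["image"] for sample in batch])
    sentences = [sample["sentence"] for sample in batch]
    # Keep the true lengths so the padding can be masked or packed later.
    lengths = torch.tensor([len(s) for s in sentences])
    # batch_first=True yields (n_sentences, n_max_length) instead of the
    # time-major layout that pad_sequence produces by default.
    padded = pad_sequence(sentences, batch_first=True, padding_value=pad_idx)
    return {"image": images, "sentence": padded, "length": lengths}


# Toy samples standing in for the real dataset output (assumed shapes).
# Token ids start at 1 so that 0 is free to act as the padding index.
samples = [
    {"image": torch.rand(3, 8, 8), "sentence": torch.randint(1, 100, (n,))}
    for n in (5, 3, 7, 2)
]
loader = DataLoader(samples, batch_size=4, collate_fn=collate_image_sentence)
batch = next(iter(loader))
```

The padded sentence tensor can then go into an embedding layer (for example with padding_idx set to the same index), while the image tensor goes into the other entry point of the forward pass.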