Building a custom dataset: how to return IDs as well?

So the format of a custom dataset should be like the following:

import torch
from torch.utils import data

class Dataset(data.Dataset):
    'Characterizes a dataset for PyTorch'
    def __init__(self, list_IDs, labels):
        'Initialization'
        self.labels = labels
        self.list_IDs = list_IDs

    def __len__(self):
        'Denotes the total number of samples'
        return len(self.list_IDs)

    def __getitem__(self, index):
        'Generates one sample of data'
        # Select sample
        ID = self.list_IDs[index]

        # Load data and get label
        X = torch.load('data/' + ID + '.pt')
        y = self.labels[ID]

        return X, y

I would like to have the ID information in the output in addition to X and y, so I changed the return to return X, y, ID, but now when I do

data_loader = data.DataLoader(dataset, args.batch_size,
                                  num_workers=args.num_workers,
                                  shuffle=True )

batch_iterator = iter(data_loader)
images, targets, ids = next(batch_iterator)

I receive an error. Does anyone know why?

All data returned by a dataset needs to be a tensor if you want to use the default collate_fn of the DataLoader. You have two options: write a custom collate function and pass it to the DataLoader, or wrap your ID inside a tensor (which is simpler, I guess) and unwrap it outside the DataLoader.
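
A minimal sketch of the custom-collate option (the function name collate_keep_ids is made up here, and default_collate is assumed to be importable from torch.utils.data, which holds for recent PyTorch versions; in older versions it lives in torch.utils.data.dataloader): batch the tensor parts with the default collate and pass the string IDs through as a plain list.

```python
import torch
from torch.utils.data import default_collate  # torch.utils.data.dataloader in older versions

def collate_keep_ids(batch):
    # batch is a list of (X, y, ID) samples from __getitem__;
    # stack the tensor/number parts with the default collate
    # and keep the string IDs as an ordinary Python list
    xs, ys, ids = zip(*batch)
    return default_collate(list(xs)), default_collate(list(ys)), list(ids)
```

You would then pass it via data.DataLoader(dataset, batch_size, collate_fn=collate_keep_ids), and the third element of each batch is a list of strings rather than a tensor.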

How can we wrap a string in a tensor? :thinking:

Ah sorry, I assumed your ID would be an integer. You cannot wrap a string in a tensor. I can think of some ways to achieve something like that, but they would not be very PyTorch-like. If you are interested in these ways, you can PM me.

It's a very weird way, but you can convert the string into integers through an ASCII table and convert them back to a string by calling a function.

Got this from the internet:

>>> s = 'hi'
>>> [ord(c) for c in s]
[104, 105]
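
Wrapped into a pair of helper functions (hypothetical names), the round trip looks like this; chr() inverts ord():

```python
def encode_id(s):
    # string -> list of Unicode code points (ints a tensor can hold)
    return [ord(c) for c in s]

def decode_id(codes):
    # list of code points -> original string
    return ''.join(chr(int(c)) for c in codes)
```

A batch of such lists can be stacked into a tensor as long as all IDs have the same length (otherwise you would need to pad them first).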

That's what I thought about too. I also thought about wrapping the loader itself, but one would have to define a new iterator for that. I proposed another method, and if it works (currently waiting for verification), I will post it here later on.

@isalirezag reported this to work great.

Sorry for coming late, but what I want to ask is: can collate_fn now return a dict in which some of the values are strings?

I think another way to do this without building a custom collate function would be to return not the ID but the index itself within the __getitem__ implementation (the index is numerical, so the default collate function can batch it). Something like:

class Dataset(data.Dataset):
    'Characterizes a dataset for PyTorch'
    def __init__(self, list_IDs, labels, return_idx: bool = False):
        'Initialization'
        self.labels = labels
        self.list_IDs = list_IDs
        self.return_idx = return_idx

    def __len__(self):
        'Denotes the total number of samples'
        return len(self.list_IDs)

    def __getitem__(self, index):
        'Generates one sample of data'
        # Select sample
        ID = self.list_IDs[index]

        # Load data and get label
        X = torch.load('data/' + ID + '.pt')
        y = self.labels[ID]

        if self.return_idx:
            return X, y, index
        return X, y

Then you look up list_IDs externally using the batch indices:

data_loader = data.DataLoader(dataset, args.batch_size,
                                  num_workers=args.num_workers,
                                  shuffle=True )

list_IDs = dataset.list_IDs

batch_iterator = iter(data_loader)
images, targets, idx = next(batch_iterator)

# fancy indexing with an index array only works if list_IDs is a NumPy array;
# for a plain Python list, look the IDs up one by one:
ids = [list_IDs[i] for i in idx.tolist()]