@hm2092 If it were just the COCO images, then you are correct that the default collate function would work fine. The custom one is needed to handle the captions. The __getitem__ method of the built-in CocoCaptions dataset class returns the image and a list of caption strings, so my collate function converts the strings to word indices and pads them to equal length so that torch.stack will work.
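For anyone following along, here is a minimal sketch of that kind of collate function. It is not my exact code: the vocab dictionary, the PAD_IDX and UNK_IDX values, and the choice to keep only the first caption per image are all assumptions made for illustration. It also assumes the images have already been transformed to tensors of the same size.

```python
import torch

PAD_IDX = 0  # hypothetical index reserved for padding
UNK_IDX = 1  # hypothetical index for words missing from the vocabulary


def caption_collate(batch, vocab):
    """Collate (image, captions) pairs into two stacked tensors.

    batch: list of (image_tensor, list_of_caption_strings) pairs
    vocab: dict mapping a lowercase word to a non-negative integer index
    """
    images, encoded = [], []
    for image, captions in batch:
        images.append(image)
        # Assumption: use only the first caption of each image.
        words = captions[0].lower().split()
        ids = [vocab.get(word, UNK_IDX) for word in words]
        encoded.append(torch.tensor(ids, dtype=torch.long))

    # Right-pad every caption to the longest one so torch.stack accepts them.
    max_len = max(len(seq) for seq in encoded)
    padded = [
        torch.cat([seq, torch.full((max_len - len(seq),), PAD_IDX, dtype=torch.long)])
        for seq in encoded
    ]
    return torch.stack(images), torch.stack(padded)
```

It can be handed to a DataLoader by binding the vocabulary, for example collate_fn=lambda batch: caption_collate(batch, vocab).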
This is with PyTorch 1.4.0 installed through conda, so I may try downgrading to 1.2 or 1.3.
Edit: Found out my issue was passing an out-of-bounds index (-1 in my case) to nn.Embedding. Previous discussion from other users around embedding-related errors can be found on this thread.
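In case it helps someone else debug the same thing, here is a small sketch of a bounds check that can be run on the index tensor before the embedding lookup. The helper name check_embedding_indices is hypothetical, not part of PyTorch; it only relies on the num_embeddings attribute of nn.Embedding. A check like this can be easier to read than the error raised from inside the lookup itself, particularly when running on a GPU.

```python
import torch
import torch.nn as nn


def check_embedding_indices(indices, embedding):
    """Raise a readable error if any index is outside [0, num_embeddings)."""
    low, high = int(indices.min()), int(indices.max())
    if low < 0 or high >= embedding.num_embeddings:
        raise IndexError(
            f"indices span [{low}, {high}] but the embedding accepts "
            f"[0, {embedding.num_embeddings - 1}]"
        )


embedding = nn.Embedding(num_embeddings=10, embedding_dim=4)

good = torch.tensor([[1, 2, 0]])
check_embedding_indices(good, embedding)  # passes silently
vectors = embedding(good)                 # shape (1, 3, 4)

bad = torch.tensor([[1, -1, 0]])          # -1 is out of bounds, as in my case
# check_embedding_indices(bad, embedding) would raise IndexError here
```

If -1 is being used as a padding marker, one option is to reserve a real non-negative index for padding instead and pass it as padding_idx when constructing nn.Embedding.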