Empty label when loading mscoco data using dataloader

mderakhshani · April 17, 2017, 7:49pm

Hi
I loaded the MSCOCO detection data with the command below:

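import torch
import torchvision.datasets as dset
import torchvision.transforms as trans

# Note: trans.Scale was renamed trans.Resize in later torchvision releases.
det = dset.CocoDetection(root='./train2014',
                         annFile='./annotations/instances_train2014.json',
                         transform=trans.Compose([trans.Scale([448, 448]),
                                                  trans.ToTensor(),
                                                  trans.Normalize((.5, .5, .5), (.5, .5, .5))]))
trainLoader = torch.utils.data.DataLoader(det, batch_size=16, num_workers=2)
trainItr = iter(trainLoader)
images, labels = next(trainItr)  # built-in next() works on Python 2 and 3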

Everything seems fine except for the labels. When I fetch a batch with trainItr.next(), the labels variable contains no annotation values, only the field names repeated once per batch element. Here is the printed value of labels:

[[('image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id', 'image_id'),
  ('iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd', 'iscrowd'),
  ('category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id', 'category_id'),
  ('segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation', 'segmentation'),
  ('area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area', 'area'),
  ('id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id', 'id'),
  ('bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox', 'bbox')]]

How can I solve this?

fmassa · April 17, 2017, 8:04pm

Normally the COCO dataset class uses the official loaders provided with the COCO dataset, so if there is a problem with it, it may be that your data is not exactly in the format the dataset expects.
Also, to make debugging easier, you don’t need to call next on the dataloader; you can just index the dataset directly:

images, labels = det[0] # idx of your img, here 0

mderakhshani · April 17, 2017, 8:07pm

Thanks for your response. The command images, labels = det[0] works for me, but I would like to use torch.utils.data.DataLoader. In any case, I just wanted to report this problem to the PyTorch developers.

fmassa · April 17, 2017, 8:11pm

You need to write your own collate_fn in this case, that specifies how the list of targets will be joined together, as the default one doesn’t handle the case you need.
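
A minimal sketch of such a collate_fn, assuming each dataset item is an (image, annotations) pair in which annotations is a list of dicts whose length varies from image to image. The name detection_collate and the stand-in data are hypothetical and only there so the example runs without the COCO files; strings play the role of image tensors.

```python
def detection_collate(batch):
    """Collate (image, annotations) samples without merging the annotations.

    `batch` is a list of (image, annotations) pairs, where `annotations`
    is a list of dicts whose length differs from image to image.
    """
    # Regroup the pairs: one sequence of images, one of annotation lists.
    images, targets = zip(*batch)
    return list(images), list(targets)


# Stand-in batch (hypothetical values): the second image has two annotations.
fake_batch = [
    ("img0", [{"category_id": 18, "bbox": [0, 0, 10, 10]}]),
    ("img1", [{"category_id": 1, "bbox": [5, 5, 20, 20]},
              {"category_id": 3, "bbox": [1, 2, 3, 4]}]),
]
images, targets = detection_collate(fake_batch)
print(len(images), [len(t) for t in targets])
```

It would be passed to the loader as torch.utils.data.DataLoader(det, batch_size=16, num_workers=2, collate_fn=detection_collate). Since every image is resized to 448x448 by the transform, the returned image list could additionally be combined with torch.stack(images, 0) if a single batch tensor is wanted; the annotations stay as one list per image.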

mderakhshani · April 17, 2017, 8:12pm

Okay. I appreciate it. Thank you!

fmassa · April 17, 2017, 8:12pm

For reference, here is how default_collate is implemented in the PyTorch dataloader.
You will need to adapt it to handle your specific case.
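
A plausible explanation for the output in the question, assuming the default collate of that time handled every non-tensor iterable by transposing it with zip: the annotation dicts are iterated like any other sequence, which yields their keys, and the values are lost. The function transpose_collate below is a simplified, hypothetical stand-in written in plain Python, not the library code, and newer PyTorch versions may behave differently.

```python
def transpose_collate(batch):
    """Simplified, hypothetical stand-in for a zip-based default collate."""
    first = batch[0]
    if isinstance(first, str):
        return tuple(batch)  # strings are returned as they are
    if hasattr(first, "__iter__"):
        # Lists, tuples and dicts all land here; iterating a dict yields
        # its keys, and zip stops at the shortest sequence in the batch.
        return [transpose_collate(samples) for samples in zip(*batch)]
    return list(batch)  # numbers etc. (real tensors would be stacked)


# Annotation lists of two images; the second image has two annotations.
targets = [
    [{"image_id": 1, "bbox": [0, 0, 10, 10]}],
    [{"image_id": 2, "bbox": [5, 5, 20, 20]},
     {"image_id": 2, "bbox": [1, 2, 3, 4]}],
]
collated = transpose_collate(targets)
print(collated)
```

Under these assumptions the result has the same shape as the printed labels: one outer entry, because zip truncates to the image with the fewest annotations, holding one tuple of repeated key names per field.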