Dataloader error after using torch.utils.data.ConcatDataset

I am using torch.utils.data.ConcatDataset to merge seven datasets, and I get the error below from dataloader.py.

The error happens at this line:

for imgs, target, word_str, weight in test_loader:

...

def default_collate(batch):
...
...
    numel = sum([x.numel() for x in batch])
    AttributeError: 'numpy.ndarray' object has no attribute 'numel'

Clearly, the code expects tensors. All my images are tensors, but the multi-hot encodings are NumPy arrays. However, the same code works fine when I merge only six of the datasets.

When I load the seventh dataset alone, it also works fine; no sampler is used in that case.

Any ideas what is happening?

Note: the test_loader is built with a sampler:

import numpy as np
import torch

def get_sampled_loader(cf, test_set):
    # Draw a random subset of indices, capped at cf.no_of_sampled_data.
    no_of_samples = len(test_set)
    sample_idx = np.random.permutation(no_of_samples)[:cf.no_of_sampled_data]
    if len(sample_idx) == 0:
        raise ValueError('get_sampled_loader(): sample_idx size is 0')
    my_sampler = torch.utils.data.sampler.SubsetRandomSampler(sample_idx)
    # shuffle must stay False when a sampler is passed.
    test_loader = torch.utils.data.DataLoader(test_set, batch_size=cf.batch_size_test,
                                              shuffle=False, num_workers=cf.num_workers,
                                              sampler=my_sampler)
    return test_loader

Update: Removing the sampler did not resolve the error.

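The targets of the first six datasets turned out to be tensors, while the targets of the seventh dataset are NumPy arrays. default_collate picks how to stack a batch from the type of the batch's first element, so a batch that starts with a tensor target and also contains a NumPy target ends up calling numel() on an ndarray. This is also why each group works on its own: a batch made up entirely of tensors, or entirely of NumPy arrays, collates fine. The problem was resolved after converting the targets of the seventh dataset to tensors.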
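For anyone hitting the same error, here is a minimal sketch of one way to do the conversion without touching the dataset class itself. The names ToyDataset and TensorFields are hypothetical and not from my real code; the toy dataset only imitates the (imgs, target, word_str, weight) layout with made-up shapes. The wrapper converts any NumPy array in a sample to a tensor before the sample reaches the collate function.

```python
import numpy as np
import torch
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical stand-in for a real dataset (shapes are made up)."""

    def __init__(self, n, numpy_target):
        self.n = n
        self.numpy_target = numpy_target

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        img = torch.zeros(3, 4, 4)
        target = np.ones(5, dtype=np.float32)  # multi-hot style target
        if not self.numpy_target:
            target = torch.from_numpy(target)
        return img, target, "word", 1.0


class TensorFields(Dataset):
    """Wraps a dataset and converts NumPy arrays in each sample to tensors."""

    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        return tuple(
            torch.from_numpy(x) if isinstance(x, np.ndarray) else x
            for x in self.base[idx]
        )


tensor_set = ToyDataset(4, numpy_target=False)  # like the six tensor datasets
numpy_set = ToyDataset(4, numpy_target=True)    # like the seventh dataset

# Wrap only the dataset whose targets are NumPy arrays, then merge.
merged = ConcatDataset([tensor_set, TensorFields(numpy_set)])
loader = DataLoader(merged, batch_size=8, shuffle=False)

# One batch now mixes samples from both datasets and collates cleanly.
imgs, target, word_str, weight = next(iter(loader))
```

Doing the conversion inside the seventh dataset's own __getitem__ would work just as well; the wrapper is only handy when the dataset class cannot easily be changed.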
