TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'imageio.core.util.Array'>

Godyaochang · December 1, 2019, 5:00pm

Hello Everyone!

I am rather new to PyTorch and am trying to port a project I previously wrote in TensorFlow to PyTorch.
While testing my code so far, I get the following error message:

Traceback (most recent call last):
  File "train.py", line 302, in <module>
    main()
  File "train.py", line 265, in main
    train_log = train(args, train_loader, model, criterion, optimizer, epoch)
  File "train.py", line 119, in train
    for i, (input, target) in tqdm(enumerate(train_loader), total=len(train_loader)):
  File "D:\Anaconda1\envs\pytorch\lib\site-packages\tqdm\std.py", line 1081, in __iter__
    for obj in iterable:
  File "D:\Anaconda1\envs\pytorch\lib\site-packages\torch\utils\data\dataloader.py", line 346, in __next__
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "D:\Anaconda1\envs\pytorch\lib\site-packages\torch\utils\data\_utils\fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "D:\Anaconda1\envs\pytorch\lib\site-packages\torch\utils\data\_utils\collate.py", line 79, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "D:\Anaconda1\envs\pytorch\lib\site-packages\torch\utils\data\_utils\collate.py", line 79, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "D:\Anaconda1\envs\pytorch\lib\site-packages\torch\utils\data\_utils\collate.py", line 81, in default_collate
    raise TypeError(default_collate_err_msg_format.format(elem_type))
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'imageio.core.util.Array'>

The error occurs when the DataLoader iterates over the following Dataset class I defined for training:

class Dataset(torch.utils.data.Dataset):

    transform = transforms.Compose([
        # you can add other transformations in this list
        transforms.ToTensor()])

    def __init__(self, args, img_paths, mask_paths, aug=False):
        self.args = args
        self.img_paths = img_paths
        self.mask_paths = mask_paths
        self.aug = aug

    def __len__(self):
        return len(self.img_paths)

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]
        mask_path = self.mask_paths[idx]
        image = imread(img_path)
        mask = imread(mask_path)

        image = image.astype('float32') / 255
        mask = mask.astype('float32') / 255

        if self.aug:
            if random.uniform(0, 1) > 0.5:
                image = image[:, ::-1, :].copy()
                mask = mask[:, ::-1].copy()
            if random.uniform(0, 1) > 0.5:
                image = image[::-1, :, :].copy()
                mask = mask[::-1, :].copy()

        image = image.transpose((2, 0, 1))
        mask = mask[:, :, np.newaxis]
        mask = mask.transpose((2, 0, 1))

        return image, mask

Can someone help me understand why this error occurs, and why the DataLoader applied to my data seems to return an object rather than a tensor?

I am sorry if the question seems obvious or dumb, but as I said, I am new to PyTorch and I am trying to improve!

Thanks a lot in advance!

albanD · December 1, 2019, 11:28pm

From a quick look at your code, it seems that your image and mask still have the type returned by imread, which is an imageio Array.
As the error message states, the DataLoader's default collate function cannot handle that type.
You will need to convert them to either a NumPy array or a torch Tensor before returning them from your Dataset's __getitem__ method.
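
A minimal sketch of that conversion, using NumPy only so it runs without imageio or any image files. `FakeImageArray` and `to_plain_arrays` are hypothetical names made up for this illustration: the first stands in for `imageio.core.util.Array` (which is a subclass of `numpy.ndarray`), the second mirrors the preprocessing steps from the `__getitem__` above. The point it tries to show is that `astype`, division, slicing and `transpose` all keep the subclass, so the type has to be stripped explicitly, here with `np.asarray`.

```python
import numpy as np


class FakeImageArray(np.ndarray):
    """Hypothetical stand-in for imageio.core.util.Array (an ndarray subclass)."""


def to_plain_arrays(image, mask):
    """Apply the same preprocessing as __getitem__, then drop the subclass."""
    image = image.astype("float32") / 255
    mask = mask.astype("float32") / 255

    image = image.transpose((2, 0, 1))                   # HWC -> CHW
    mask = mask[:, :, np.newaxis].transpose((2, 0, 1))   # HW -> 1HW

    # Up to this point both values are still FakeImageArray instances.
    # np.asarray returns a base-class numpy.ndarray view of the same data.
    return np.asarray(image), np.asarray(mask)


# Dummy data in place of imread(img_path) / imread(mask_path).
raw_image = np.zeros((4, 6, 3), dtype=np.uint8).view(FakeImageArray)
raw_mask = np.full((4, 6), 255, dtype=np.uint8).view(FakeImageArray)

image, mask = to_plain_arrays(raw_image, raw_mask)
print(type(raw_image).__name__, "->", type(image).__name__)
```

In the real Dataset, the equivalent change would presumably be to return `np.asarray(image), np.asarray(mask)` at the end of `__getitem__`. If you would rather return Tensors, `torch.from_numpy` applied to those plain arrays should work as well.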