TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'NoneType'>

torch.manual_seed(101)
num_epochs=30
for epoch in tqdm(range(num_epochs)):
    for step,i in enumerate(train_loader):
        img,label_one_hot,label = i
        img = Variable(img).cuda()
        label_one_hot= Variable(label_one_hot.float()).cuda()
        pred = cnn_model(img)
        loss_value = loss_fn(pred,label_one_hot)
        optimizer.zero_grad()
        loss_value.backward()
        optimizer.step()
        print('epoch:', epoch+1, 'step:', step+1, 'loss:', loss_value.item())

This is my training loop.

class Mydataset(Dataset):
    def __init__(self,path,is_train=True,transform=transform):
        self.path = path
        if is_train: self.img = os.listdir(self.path)
        self.transform = transform
    def __getitem__(self,idx):
        img_path = self.img[idx]
        files = os.listdir(self.path)
        for file in files:
            if (file.endswith(".jpg") and file[:-4] == length_of_captcha):
                img = cv2.imread(img_path,cv2.IMREAD_GRAYSCALE)
                kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(5,12))
                img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
                label = Path(self.path/img_path).name[:-4]
                label_ = []
            else:
                continue
            for i in label:
                label_ += encode(i)
            if self.transform is not None:
                img = self.transform(img)
            return img,np.array(label_),label
    def __len__(self):
        return (len(self.img))

I am trying to do character recognition on 6-digit captcha images after some preprocessing, and I am getting the error message in the title.
This is my dataset class. Creating train_loader with it succeeds, but I still encounter this error and I don't know what is going wrong. Can anyone help me?

Based on the error message, I guess that some of the returned values are empty.
Could you add a check to __getitem__ right before the return statement and verify that img, label_, and label all hold valid values?
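
One way such a check could look is sketched below. The helper name check_sample is made up for illustration and is not part of PyTorch; it only assumes that __getitem__ returns a tuple and that a missing value shows up as None.

```python
def check_sample(sample, idx):
    """Raise a descriptive error if a sample, or any field in it, is None.

    check_sample is a hypothetical helper, not a PyTorch function.
    """
    if sample is None:
        raise ValueError(f"sample {idx} is None: __getitem__ returned nothing")
    for pos, item in enumerate(sample):
        if item is None:
            raise ValueError(f"sample {idx}: field {pos} is None")
    return sample


# Intended use inside a dataset, just before returning (sketch only):
#     return check_sample((img, np.array(label_), label), idx)
print(check_sample(("img", [1, 0], "abc"), 0))
```

Raising inside __getitem__ points at the offending index directly, whereas the default_collate error only says that a None was found somewhere in the batch.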

Hello there, I am new to PyTorch and get the same error from my dataloader. I am not sure whether I should open a new discussion or post it here.
Anyway, @ptrblck, by checking whether the variables have valid values, do you mean printing their shapes?
Many thanks in advance for any help!

The error message says that default_collate found a NoneType in the batch, so you would need to check why some samples are loaded as None.
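
A rough sketch of how one could hunt for such samples without going through the DataLoader follows. ToyDataset and find_none_samples are hypothetical names used only for this illustration. The toy class shows one common way a None sample arises in plain Python: __getitem__ returns from inside a loop, and when no iteration reaches the return, the function falls off the end and returns None implicitly. Whether that is the cause in a given dataset is an assumption to verify.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset whose __getitem__ can fall through."""

    def __init__(self, names):
        self.names = names

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        name = self.names[idx]
        for suffix in (".jpg",):
            if name.endswith(suffix):
                return name, len(name)
        # No return is reached for other names, so Python returns None.


def find_none_samples(dataset):
    """Return the indices whose sample is None or contains a None field."""
    bad = []
    for idx in range(len(dataset)):
        sample = dataset[idx]
        if sample is None or any(item is None for item in sample):
            bad.append(idx)
    return bad


print(find_none_samples(ToyDataset(["a.jpg", "b.png", "c.jpg"])))  # prints [1]
```

Indexing the dataset directly like this sidesteps batching, so each bad index can be inspected on its own.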

OK, I found the bug. Thanks for your help.