A problem with DataLoader

In my program, I read images from a folder (68 images), add noise to them, and feed them through a DataLoader into a neural network. I found that with batch_size=1 everything works, but with batch_size=2 I get an error. I don't know how to deal with it. Could someone help me?

import os
import torch
import pandas as pd
from skimage import io, transform, util
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms, utils

class imDataset(Dataset):
    """ my images dataset."""
    def __init__(self, csv_file, sigma=0.01, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            sigma: the variance of the Gaussian noise added to each image
            transform (callable, optional): Optional transform to be applied on a sample.
        """
        self.imfnames = pd.read_csv(csv_file)
        self.transform = transform
        self.sigma = sigma

    def __len__(self):
        return len(self.imfnames)

    def __getitem__(self, idx):
        img_name = self.imfnames.iloc[idx, 0]  # .ix is deprecated; use .iloc
        origin_img = io.imread(img_name).astype(np.float32)/255.0
        if self.transform:
            origin_img = self.transform(origin_img)
        # random_noise returns a float64 array; cast back to float32 for PyTorch
        noise_img = util.random_noise(origin_img, clip=True, var=self.sigma).astype(np.float32)

        # add a channel dimension: (H, W) -> (1, H, W)
        origin_img = torch.unsqueeze(torch.from_numpy(origin_img), 0)
        noise_img = torch.unsqueeze(torch.from_numpy(noise_img), 0)

        return origin_img, noise_img


denoise_dataset = imDataset(csv_file='./ori.csv', sigma=0.03)

trainloader = DataLoader(denoise_dataset, batch_size=1, shuffle=True)

# Iterate over the DataLoader for a few epochs to check batching
for epoch in range(3):
    print('outer')
    i = 1
    for oo, dde in trainloader:
        print("in, this is", i)
        i = i+1

The error is:

outer
in, this is 1
in, this is 2
Traceback (most recent call last):
  File "/home/matliu/PycharmProjects/pytortest/temp1.py", line 90, in <module>
    for oo, dde in trainloader:
  File "/home/matliu/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 190, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/home/matliu/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 110, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/matliu/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 110, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/matliu/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 92, in default_collate
    return torch.stack(batch, 0, out=out)
  File "/home/matliu/anaconda3/lib/python3.6/site-packages/torch/functional.py", line 58, in stack
    return torch.cat(inputs, dim)
RuntimeError: inconsistent tensor sizes at /b/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:2559

I realize that if all the images are the same size, everything is OK; otherwise it fails.


It's related to batching, most probably because your images are not all the same size. The default collate function stacks the samples of a batch into a single tensor with torch.stack, which only works when every sample has exactly the same shape.
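For example (the shapes below are made up, just to illustrate the failure mode):

import torch

a = torch.zeros(1, 100, 120)   # one grayscale image, shape (1, 100, 120)
b = torch.zeros(1, 150, 130)   # another image with a different size

try:
    torch.stack([a, b], 0)     # this is what default_collate does for each batch
except RuntimeError as e:
    print(e)                   # reports the size mismatch ("inconsistent tensor sizes")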

Create transforms that rescale or crop the images to a constant size. You can check out the tutorial on transforms here: http://pytorch.org/tutorials/beginner/data_loading_tutorial.html
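For example, something along these lines (a minimal sketch; the Rescale class and the 128x128 target size are just placeholders, adjust them to your data):

from skimage import transform
from torch.utils.data import DataLoader

class Rescale:
    """Rescale an (H x W) ndarray image to a fixed output size."""
    def __init__(self, output_size):
        self.output_size = output_size  # (height, width)

    def __call__(self, image):
        # resize returns a float image; keep float32 to match the rest of the pipeline
        return transform.resize(image, self.output_size, mode='reflect').astype('float32')

# hypothetical usage with the imDataset defined above
denoise_dataset = imDataset(csv_file='./ori.csv', sigma=0.03,
                            transform=Rescale((128, 128)))
trainloader = DataLoader(denoise_dataset, batch_size=2, shuffle=True)

Once every sample has the same shape, batch_size=2 (or larger) should work.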

Thank you very much! It is an excellent example.