next(iter(dataloader)) error, kindly help

Hi,

After creating the dataloader, I try to iterate over it ( image, labels = next(iter(dataloader)) ) to check its contents and get the following error:

TypeError: pic should be PIL Image or ndarray. Got <class 'torch.Tensor'>

The training transform is as follows:
train_transform = transforms.Compose(
    [
        transforms.Grayscale(3),
        transforms.Resize((img_size, img_size)),
        transforms.ToTensor(),
        transforms.Normalize((0.1307, 0.1307, 0.1307), (0.3081, 0.3081, 0.3081))
    ]
)

I created the image dataset img_dataset from the images and a CSV file (image file name and label), using the train_transform above.
I then split the data by calling train_sampler = SubsetRandomSampler(…)
and dataloader = DataLoader(img_dataset, batch_size=64, sampler=train_sampler)

I understand that train_transform converts the loaded images to tensors, which is what I need before feeding them to the model. However, when I iterate over the dataloader, the error says the input should be a PIL Image or ndarray. I have searched online for a while, but all the sample code I found looks similar to mine. Why do I get this error in my case, and how can I solve it?

Kindly advise.

My guess is that you already have the image as a tensor and are trying to apply the ToTensor() operation to it again.

Can you post how you create your dataset?

ToTensor() is working fine. The image is in JPG format.

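import os
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader, SubsetRandomSampler
from torchvision import transforms
from torchvision.io import read_image

torch.manual_seed(0)
train_transform = transforms.Compose(
    [
        transforms.Grayscale(3),
        transforms.Resize((img_size, img_size)),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (1.0, 1.0, 1.0))
    ]
)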

class myImageDataset(Dataset):
    def __init__(self, ann_file, img_dir, transform=None):
        self.img_labels = pd.read_csv(ann_file)
        self.img_dir = img_dir
        self.transform = transform

    def __len__(self):
        return len(self.img_labels)  # upper bound: the max number of instances

    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        img = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            img = self.transform(img)
        return img, label

img_dataset = myImageDataset(ann_file="train.csv",
                             img_dir="train_images/", transform=train_transform)

dataset_size = len(img_dataset)
indices = list(range(dataset_size))
split = int(np.floor(validation_split*dataset_size))
if shuffle_dataset:
    np.random.seed(random_seed)
    np.random.shuffle(indices)

train_ind = indices[split:]
val_ind = indices[:split]

print(train_ind[45])
train_sampler = SubsetRandomSampler(train_ind)
valid_sampler = SubsetRandomSampler(val_ind)
train_loader = DataLoader(img_dataset, batch_size=batch_size, sampler=train_sampler)
valid_loader = DataLoader(img_dataset, batch_size=batch_size, sampler=valid_sampler)

img_, labels = next(iter(train_loader))  # <-- this is where the error pops up

If you are using torchvision.io.read_image, the output is already a tensor. Calling ToTensor on it in the transform inside the __getitem__ method then raises this error, because ToTensor only accepts a PIL Image or an ndarray.

https://pytorch.org/vision/master/generated/torchvision.io.read_image.html

Could you print the type of the img variable before you apply the transform, just to check whether it is a PIL Image or an ndarray?

Thanks, Matrias_Vasquez.