ValueError: only one element tensors can be converted to Python scalars bug

I am building dataloaders for a dataset of 10560 images; 1000 of them are for test, and the rest is split into train and val in an 85:15 ratio.
The batch size is 100.
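
For reference, the split is created roughly like this (a sketch only; my actual split code is not shown, and all_samples is just a placeholder name):

import random

# Placeholder list standing in for my real (image1_name, image2_name, label) triplets.
all_samples = [(f"img_{i}_a.png", f"img_{i}_b.png", i % 10) for i in range(10560)]

random.shuffle(all_samples)
testlist = all_samples[:1000]          # 1000 test samples
rest = all_samples[1000:]              # 9560 samples for train + val
split = int(0.85 * len(rest))          # 8126 train samples
trainlist, vallist = rest[:split], rest[split:]
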
The code is:

import os

import torch
from torch.utils.data import Dataset, DataLoader
from skimage.io import imread        # imread/gray2rgb assumed to come from scikit-image
from skimage.color import gray2rgb


class CustomDataset(Dataset):
    def __init__(self, samples, transform=None):
        self.transform = transform
        self.samples = samples

    def __getitem__(self, index):
        s1, s2, t = self.samples[index]
        # current_path and string are defined earlier in my script
        s1 = imread(os.path.join(current_path, string, s1))
        s2 = imread(os.path.join(current_path, string, s2))
        s1 = gray2rgb(s1)
        s2 = gray2rgb(s2)

        if self.transform:
            s1 = self.transform(s1)
            s2 = self.transform(s2)

        t = torch.tensor(t)
        return (s1, s2, t)

    def __len__(self):
        return len(self.samples)


train_dataset = CustomDataset(trainlist, transform=transform)
test_dataset = CustomDataset(testlist, transform=transform)

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=4, drop_last=False)
test_dataloader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=4, drop_last=False)

And I am just trying

for e, (inputs1, inputs2, labels) in enumerate(train_dataloader):
    print(e)

to test the code. But I am getting this error:

ValueError: only one element tensors can be converted to Python scalars
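
For reference, a minimal standalone snippet (not my actual data) that raises the same error, in case it helps show what typically triggers it:

import torch

# torch.tensor() tries to convert each list element to a Python scalar,
# which only works for single-element tensors:
labels = [torch.tensor([1, 2]), torch.tensor([3, 4])]
torch.tensor(labels)  # ValueError: only one element tensors can be converted to Python scalars
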

I should mention that before this I was working with a similar but smaller dataset using the same code, and I wasn't getting this error there.
Can anyone help?

Could you add random tensor initializations to create an executable code snippet that reproduces the issue, so that we can debug it?


So, there are 9560 images that I am using for train/val in an 85:15 ratio. As you can see above, I am iterating over the train dataloader, so that is 85% of 9560, i.e. 8126 images.
Their shapes are [1, 400, 400] (binary images), so you can probably use torch.rand here.
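
If it helps, fake samples with these shapes could look like this (a sketch; the label value 0 is just a placeholder):

import torch

# Fake (s1, s2, t) triplets matching the described [1, 400, 400] shape;
# using a small count here instead of the full 8126.
samples = [(torch.rand(1, 400, 400), torch.rand(1, 400, 400), 0)
           for _ in range(100)]
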
The transform is:

from torchvision import transforms

transform = transforms.Compose([
        transforms.ToPILImage(),
        transforms.RandomHorizontalFlip(),
        transforms.RandomVerticalFlip(),
        # transforms.Resize(resize),
        transforms.CenterCrop(800),
        transforms.RandomResizedCrop(400),
        transforms.ToTensor(),
        transforms.Normalize(mean=train_mean, std=train_std)
    ])

The original images were larger than 800 pixels and had different sizes, so I used CenterCrop and RandomResizedCrop here (the CenterCrop ensures I take the image region from roughly the middle of the image).
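
As a quick sanity check of the crop pipeline on a fake large image (sizes assumed, normalization omitted):

import torch
from torchvision import transforms

# Fake 3-channel 900 x 900 image standing in for one of the original large images.
img = torch.rand(3, 900, 900)

crop_pipeline = transforms.Compose([
    transforms.ToPILImage(),
    transforms.CenterCrop(800),         # central 800 x 800 region
    transforms.RandomResizedCrop(400),  # random crop, resized to 400 x 400
    transforms.ToTensor(),
])

print(crop_pipeline(img).shape)  # torch.Size([3, 400, 400])
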

Thanks for the update. I cannot reproduce the issue using the provided code:

import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms


class CustomDataset(Dataset):
    def __init__(self, samples, transform=None):
        self.transform = transform
        self.samples = samples

    def __getitem__(self, index):
        s1, s2, t = self.samples[index]

        if self.transform:
            s1 = self.transform(s1)
            s2 = self.transform(s2)

        return (s1, s2, t)

    def __len__(self):
        return len(self.samples)


samples = [[torch.randn(3, 400, 400), torch.randn(3, 400, 400), torch.randint(0, 10, (1,))]
           for _ in range(1000)]

transform = transforms.Compose([
        transforms.ToPILImage(),
        transforms.RandomHorizontalFlip(),
        transforms.RandomVerticalFlip(),
        transforms.CenterCrop(800),
        transforms.RandomResizedCrop(400),
        transforms.ToTensor(),
        transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
    ])

train_dataset = CustomDataset(samples, transform=transform)
train_dataloader = DataLoader(train_dataset, batch_size=100, shuffle=True, num_workers=4, drop_last=False)

for x1, x2, t in train_dataloader:
    print(x1.shape, x2.shape, t.shape)

Maybe you can try samples = torch.rand(8126, 3, 900, 900).
Apologies, as the total size becomes very large!
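
A sketch of what that fake data could look like as (s1, s2, t) triplets (note that at the full count this needs tens of GB of memory, so a much smaller n is advisable in practice):

import torch

n = 8126  # full size; something like n = 100 keeps memory manageable
samples = [(torch.rand(3, 900, 900),
            torch.rand(3, 900, 900),
            torch.randint(0, 10, (1,)))
           for _ in range(n)]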