I am building dataloaders for a dataset of 10,560 images; 1,000 of them are for testing, and the larger part is split into train and val in an 85:15 ratio.
The batch size is 100.
The code is:
class CustomDataset(Dataset):
    def __init__(self, samples, transform=None):
        self.transform = transform
        self.samples = samples

    def __getitem__(self, index):
        s1, s2, t = self.samples[index]
        s1 = imread(os.path.join(current_path, string, s1))
        s2 = imread(os.path.join(current_path, string, s2))
        s1 = gray2rgb(s1)
        s2 = gray2rgb(s2)
        if self.transform:
            s1 = self.transform(s1)
            s2 = self.transform(s2)
        t = torch.tensor(t)
        return (s1, s2, t)

    def __len__(self):
        return len(self.samples)
train_dataset = CustomDataset(trainlist, transform=transform)
test_dataset = CustomDataset(testlist, transform=transform)
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=4, drop_last=False)
test_dataloader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=4, drop_last=False)
And I am just trying
for e, (inputs1, inputs2, labels) in enumerate(train_dataloader):
    print(e)
to test the code. But I am getting this error:
ValueError: only one element tensors can be converted to Python scalars
I must say, before this I was working with a similar but smaller dataset using the same code, and I wasn't getting this error there.
Can anyone help?
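For context while debugging: one common way to hit this exact ValueError is in the default collate step, when torch.tensor is called on a batch of labels that are themselves multi-element tensors. This is only a guess at the cause here (the snippet below is an illustrative sketch, not the original code):

```python
import torch

# Scalar labels collate fine: torch.tensor on a list of Python ints works.
ok = torch.tensor([1, 2, 3])
print(ok.shape)  # torch.Size([3])

# Labels that are multi-element tensors reproduce the error: torch.tensor
# tries to convert each element to a Python scalar and fails.
labels = [torch.randn(2), torch.randn(2)]
try:
    torch.tensor(labels)
except ValueError as e:
    print(e)  # "only one element tensors can be converted to Python scalars"
```

If that is the cause, it would explain why a smaller dataset with differently shaped labels did not trigger it.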
Could you add random tensor initializations to create an executable code snippet that reproduces the issue, so we can debug it?
So, there are 9560 images that I am using for train/val in an 85:15 ratio. As we can see, I am trying to run the train dataloader, so it is 85% of 9560, i.e. 8126 images.
Their shapes are [1, 400, 400] (binary images), so you can probably use torch.rand here.
The transform is:
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    # transforms.Resize(resize),
    transforms.CenterCrop(800),
    transforms.RandomResizedCrop(400),
    transforms.ToTensor(),
    transforms.Normalize(mean=train_mean, std=train_std)
])
The original images were larger than 800 pixels and had different sizes, so I used CenterCrop and RandomResizedCrop here (I had to center-crop first to ensure I was taking a region from roughly the middle of the image).
Thanks for the update. I cannot reproduce the issue using the provided code:
class CustomDataset(Dataset):
    def __init__(self, samples, transform=None):
        self.transform = transform
        self.samples = samples

    def __getitem__(self, index):
        s1, s2, t = self.samples[index]
        if self.transform:
            s1 = self.transform(s1)
            s2 = self.transform(s2)
        return (s1, s2, t)

    def __len__(self):
        return len(self.samples)

samples = [[torch.randn(3, 400, 400), torch.randn(3, 400, 400), torch.randint(0, 10, (1,))]
           for _ in range(1000)]

transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.CenterCrop(800),
    transforms.RandomResizedCrop(400),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
])

train_dataset = CustomDataset(samples, transform=transform)
train_dataloader = DataLoader(train_dataset, batch_size=100, shuffle=True, num_workers=4, drop_last=False)

for x1, x2, t in train_dataloader:
    print(x1.shape, x2.shape, t.shape)
Maybe you can try samples = torch.rand(8126, 3, 900, 900)
Apologies, as the total size becomes very large!