Hi everyone, I have an issue with the behaviour of torch.utils.data.DataLoader for a minimal custom torch.utils.data.Dataset.
Here is the dataset:

class MyData(Dataset):
    'Characterizes a dataset for PyTorch'

    def __init__(self, data, targets):
        'Initialization'
        self.targets = targets
        self.data = data

    def __len__(self):
        'Denotes the total number of samples'
        return len(self.targets)

    def __getitem__(self, index):
        'Generates one sample of data'
        # Select sample
        X = self.data[index]
        # Load data and get label
        y = self.targets[index]
        return X, y
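As a sanity check (using a synthetic numpy array in place of CIFAR-10, so this runs without downloading anything), __getitem__ returns each sample exactly as stored, so an HWC array stays HWC:

```python
import numpy as np
from torch.utils.data import Dataset

class MyData(Dataset):
    def __init__(self, data, targets):
        self.targets = targets
        self.data = data

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, index):
        return self.data[index], self.targets[index]

# Synthetic stand-in for CIFAR-10's raw .data: 10 HWC uint8 images
data = np.zeros((10, 32, 32, 3), dtype=np.uint8)
dataset = MyData(data, list(range(10)))
X, y = dataset[0]
print(X.shape)  # (32, 32, 3) -- the HWC layout is passed through unchanged
```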
But now, if I use it within a DataLoader, it doesn’t move the channel dimension to the second position of the batch tensor. Here is a minimal example:
train_data = datasets.CIFAR10('../data/cifar', train=True, transform=transforms.ToTensor(), download=False)
test_data = datasets.CIFAR10('../data/cifar', train=False, transform=transforms.ToTensor(), download=False)
source_data = MyData(train_data.data, train_data.targets)
source_loader = torch.utils.data.DataLoader(source_data, batch_size=1024)
c,d = next(iter(source_loader))
c.shape
This returns the shape
torch.Size([1024, 32, 32, 3])
whereas, if I build the DataLoader directly from the CIFAR-10 dataset, for instance
train_loader = DataLoader(train_data, batch_size=1024, shuffle=True)
a,b = next(iter(train_loader))
a.shape
will return
torch.Size([1024, 3, 32, 32])
I am using
- torch '1.9.0'
- torchvision '0.10.0'
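For now I can force the expected NCHW layout by permuting the batch myself. This is only a workaround sketch, not an explanation: unlike transforms.ToTensor, permute just reorders dimensions and does not convert the uint8 values to float32 or rescale them to [0, 1].

```python
import numpy as np
import torch

# Stand-in batch with the shape the custom loader produces: NHWC
batch = torch.from_numpy(np.zeros((1024, 32, 32, 3), dtype=np.uint8))

# Reorder to NCHW; note this does NOT rescale or cast
# the values the way transforms.ToTensor would
batch_chw = batch.permute(0, 3, 1, 2)
print(batch_chw.shape)  # torch.Size([1024, 3, 32, 32])
```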