I’m following the PyTorch CNN example that uses the CIFAR10 dataset here:
Training a Classifier — PyTorch Tutorials 2.0.1+cu117 documentation
The shape of the CIFAR10 training data seems really odd.
Here is the size of the images: torch.Size([4, 3, 32, 32])
Here is the size of the labels: torch.Size([4])
I understand that the images are 32x32 pixels with 3 channels. What I don’t understand is the 4 in the image dimensions. Why is the input size torch.Size([4, 3, 32, 32]) instead of torch.Size([3, 32, 32])?
Lastly, why isn’t the label simply the index of the correct class? Why does it contain 4 class numbers?
Thanks.
P.S. You can print the CIFAR10 data sizes with this code:
import torch
import torchvision
import torchvision.transforms
trans = [torchvision.transforms.ToTensor(),
         torchvision.transforms.Normalize(3 * [0.5], 3 * [0.5])]
trans = torchvision.transforms.Compose(trans)
train = torchvision.datasets.CIFAR10(root = "training_data",
                                     download = True,
                                     train = True,
                                     transform = trans)
loader_train = torch.utils.data.DataLoader(train,
                                           batch_size = 4,
                                           shuffle = True)
for e in loader_train:
    print(len(e), e[0].shape, e[1].shape)
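
For comparison, here is a small sketch that avoids the download by using random tensors shaped like CIFAR10 samples. The stand-in dataset and the names `fake_train`, `single_image`, and `batch_images` are my own assumptions and are not from the tutorial. It prints the shape of one sample taken directly from the dataset, then the shapes the loader returns for a few values of batch_size. If the leading dimension follows batch_size, that would suggest the 4 is simply the number of images per batch and the labels tensor holds one class index per image.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for CIFAR10 (assumption): 16 random "images" shaped like CIFAR10
# samples, with random class indices in the range 0..9.
images = torch.randn(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))
fake_train = TensorDataset(images, labels)

# Indexing the dataset directly returns a single (image, label) pair.
single_image, single_label = fake_train[0]
print(single_image.shape, single_label.shape)

# The loader stacks batch_size samples along a new leading dimension.
for batch_size in (1, 4, 8):
    loader = DataLoader(fake_train, batch_size=batch_size, shuffle=True)
    batch_images, batch_labels = next(iter(loader))
    print(batch_size, batch_images.shape, batch_labels.shape)
```

The same check should work on the real data by replacing `fake_train` with the `train` dataset from the code above.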