I’m following the PyTorch CNN example that uses the CIFAR10 dataset here:
The shape of the CIFAR10 training data seems really odd.
Here is the size of the images: torch.Size([4, 3, 32, 32])
Here is the size of the labels: torch.Size([4])
I understand the images are 32x32 pixels with 3 channels. What I don't understand is the 4 in the image dimensions. Why are the input image sizes torch.Size([4, 3, 32, 32]) instead of torch.Size([3, 32, 32])?
Lastly, why isn’t the label simply the index of the correct class? Why is it 4 class numbers?
P.S. You can print the CIFAR10 data sizes with this code:
import torch
import torchvision
import torchvision.transforms

# Convert images to tensors and normalize each of the 3 channels
trans = [torchvision.transforms.ToTensor(), torchvision.transforms.Normalize(3 * [0.5], 3 * [0.5])]
trans = torchvision.transforms.Compose(trans)
train = torchvision.datasets.CIFAR10(root = "training_data", download = True, train = True, transform = trans)
loader_train = torch.utils.data.DataLoader(train, batch_size = 4, shuffle = True)
# Each element from the loader is a list: [images, labels]
for e in loader_train:
    print(len(e), e[0].shape, e[1].shape)
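To check whether the 4 might just come from the batch_size I pass to DataLoader, here is a smaller sketch that avoids the CIFAR10 download. It is only an illustration under assumptions: fake_images, fake_labels and fake_train are hypothetical stand-ins filled with random values, not real CIFAR10 data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for CIFAR10: 8 random "images" of shape [3, 32, 32]
# with integer class labels in 0..9 (random values, not real data).
fake_images = torch.randn(8, 3, 32, 32)
fake_labels = torch.randint(0, 10, (8,))
fake_train = TensorDataset(fake_images, fake_labels)

# A single sample taken straight from the dataset has no leading dimension.
image, label = fake_train[0]
print(image.shape, label.shape)

# The DataLoader stacks batch_size samples along a new first dimension.
shapes = {}
for batch_size in (1, 4):
    loader = DataLoader(fake_train, batch_size=batch_size)
    images, labels = next(iter(loader))
    shapes[batch_size] = (images.shape, labels.shape)
    print(batch_size, images.shape, labels.shape)
```

If the leading dimension follows batch_size here, then the 4 in my output would be the number of images per batch, and the 4 label values would be one class index per image.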