Simple question about CIFAR10 dataset in torchvision.datasets module

I’m following the PyTorch CNN example that uses the CIFAR10 dataset here:

Training a Classifier — PyTorch Tutorials 2.0.1+cu117 documentation

The shape of the CIFAR10 training data seems really odd.

Here is the size of the images: torch.Size([4, 3, 32, 32])
Here is the size of the labels: torch.Size([4])

I understand the images are 32x32 pixels with 3 channels. What I don’t understand is the 4 in the image dimensions. Why are the input image sizes torch.Size([4, 3, 32, 32]) instead of torch.Size([3, 32, 32])?

Lastly, why isn’t the label simply the index of the correct class? Why is it 4 class numbers?


P.S. You can print the CIFAR10 data sizes with this code:

import torch
import torchvision
import torchvision.transforms

trans = [torchvision.transforms.ToTensor(),
         torchvision.transforms.Normalize(3 * [0.5], 3 * [0.5])]
trans = torchvision.transforms.Compose(trans)
train = torchvision.datasets.CIFAR10(root      = "training_data",
                                     download  = True,
                                     train     = True,
                                     transform = trans)
loader_train = torch.utils.data.DataLoader(train,
                                           batch_size = 4,
                                           shuffle    = True)

for e in loader_train:
    print(len(e), e[0].shape, e[1].shape)

Hello Seberino,
this is nothing specific to CIFAR10; it would look very similar with any other dataset. When you use a DataLoader, the batch dimension is usually the first (0-th) dimension that appears. In your case you picked a batch size of 4; in other words, your DataLoader loads 4 images at once.
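
To see the batch dimension appear, here is a minimal sketch. It assumes a synthetic stand-in for CIFAR10 (random tensors wrapped in a TensorDataset, so nothing needs to be downloaded); the variable names are illustrative only. A single sample has shape [3, 32, 32], and the DataLoader stacks batch_size of them along a new dimension 0.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for CIFAR10 (an assumption, to avoid the download):
# 12 random "images" of shape [3, 32, 32] with integer class labels in 0..9.
images = torch.randn(12, 3, 32, 32)
labels = torch.randint(0, 10, (12,))
dataset = TensorDataset(images, labels)

# A single sample from the dataset has no batch dimension.
single_image, single_label = dataset[0]
print(single_image.shape)   # torch.Size([3, 32, 32])

# The DataLoader stacks batch_size samples along a new dimension 0.
loader = DataLoader(dataset, batch_size=4, shuffle=True)
batch_images, batch_labels = next(iter(loader))
print(batch_images.shape)   # torch.Size([4, 3, 32, 32])
print(batch_labels.shape)   # torch.Size([4])
```

With batch_size=1 the image shape would be [1, 3, 32, 32]; the batch dimension is still there, it just has length 1.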


Oh thanks! That totally makes sense! So the first input size dimension is the
batch size.

And why aren’t the labels just single numbers? It appears the labels
are the “top 4” answers? That seems odd.


Yes, exactly, it is the batch size. The loader returns 4 labels to match your batch size: you load 4 images at once, so there is one label per image, i.e. 4 labels. Each entry is the class index of the corresponding image, not a "top 4" ranking. If you want to be sure, please check the printed output of your code.
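
As a small illustration of one label per image, the sketch below maps a batch of labels to class names. The label values are made up for the example; the class names are the ones listed in the tutorial.

```python
import torch

# Class names as listed in the "Training a Classifier" tutorial.
classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

# Hypothetical labels for one batch of 4 images (values chosen for illustration).
batch_labels = torch.tensor([3, 8, 0, 6])

# Entry i is the class index of image i in the batch.
names = [classes[int(i)] for i in batch_labels]
print(batch_labels.shape)  # torch.Size([4])
print(names)               # ['cat', 'ship', 'plane', 'frog']
```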

Thanks again. All my confusion was because of batches!
You cleared everything up!