Index 1 is out of bounds for dimension 0 with size 1

I would like to plot the number of images in each class of the CIFAR10 dataset on a graph. I have tried implementing this in PyTorch but couldn’t get any satisfying results. Can anyone please help me plot the CIFAR10 class frequencies?

Here is an example of an implementation I have tried, but I keep getting an error when comparing the labels with the class indexes:

def classes_freq():
    for images, labels in trainloader_classes:
          #print(labels.shape)
      
          images = images.to(device)
          labels = labels.to(device)
          count = [0]*10
          for k in range(len(trainloader_classes)):
            for i,e in enumerate(classes):
             if (labels[k].item() == i):
                count[i]+=1
    return count

classes_freq()


import matplotlib.pyplot as plt

def plot(y_units):

  x_units = ['plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

  plt.bar(x_units, y_units,width=0.8, color=['red','blue'])

  plt.xlabel('x - axis')
  plt.ylabel('y - axis')
  plt.title('classes frequency')

  plt.show()

#plot([10,20,23,50,60,70,80,90,10,45])

Error:

IndexError: index 1 is out of bounds for dimension 0 with size 1

You can read about the CIFAR10 class frequencies on the official website (they are already well known).

It’s much better to iterate over the torch.utils.data.Dataset directly instead of the DataLoader (for this task you don’t need the batching, parallelization, and other utilities that class provides).

The full stack trace would help us locate the error. In the meantime, here is my own, much more performant solution:

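import matplotlib.pyplot as plt
import numpy as np
import torch
import torchvision

# Assumption: `transform` is whatever you already use; ToTensor() is a minimal stand-in.
transform = torchvision.transforms.ToTensor()

train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                             download=True, transform=transform)

# Each dataset item is an (image, label) pair; keep only the label.
labels = torch.tensor([instance[1] for instance in train_dataset])
# class_freq[i] is the number of samples whose label is i.
class_freq = labels.bincount()

classes = ['plane', 'car', 'bird', 'cat','deer', 'dog', 
           'frog', 'horse', 'ship', 'truck']

indexes = np.arange(len(classes))
width = 0.3
plt.bar(indexes, class_freq.numpy(), align='edge', width=width)
# Centre each tick label under its bar.
plt.xticks(indexes + width * 0.5, classes)
plt.show()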

It’s handy to use torch.bincount to count the frequencies in one step.
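To make it concrete, here is a tiny sketch of what torch.bincount returns. The label values are made up for illustration, and minlength=10 is my own addition so that classes with no samples still get a zero entry:

```python
import torch

# Made-up labels for illustration: one 0, two 2s, three 5s.
labels = torch.tensor([0, 2, 2, 5, 5, 5])

# freq[i] is how many times the value i occurs in labels.
# minlength=10 pads the result so all ten CIFAR10 classes have an entry.
freq = torch.bincount(labels, minlength=10)

print(freq.tolist())  # [1, 0, 2, 0, 0, 3, 0, 0, 0, 0]
```

Without minlength the result only extends up to the largest label present, which is fine for the full CIFAR10 training set but can surprise you on a small subset.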

A clean and much better solution! Thank you.
But I really want to understand why the error occurred, and why exactly the index k is out of bounds. Is it because the first loop does not move on to the next image in trainloader_classes?

Oh I see it.

It seems that this line is the issue: if (labels[k].item() == i):. You are trying to index a tensor (labels) of shape [1] with index 1 (it would work fine if the index was 0).
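A minimal reproduction, assuming the loader yields one label per batch (batch size 1, which is my guess from the shape in the error message; the label value 3 is arbitrary):

```python
import torch

# A batch of labels as a batch-size-1 loader would yield it: shape [1].
labels = torch.tensor([3])

first = labels[0].item()  # index 0 is the only valid index

try:
    labels[1]             # there is no second element in this batch
    raised = False
except IndexError as err:
    raised = True
    print(err)
```

Since len(trainloader_classes) is the number of batches and not the number of labels in the current batch, k goes past the end of labels as soon as it reaches 1.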

Yes, it works with index 0. I guess it’s because the loop for k in range(len(trainloader_classes)) keeps indexing into the labels of a single image, when it is supposed to move on to the next label in the next iteration.
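For completeness, here is a sketch of how the original loop could be repaired while still using a DataLoader. Two things change: count is created once before the loop instead of being reset on every batch, and the inner loop walks over the labels actually present in the current batch. The fake tensors and the names fake_images, fake_labels and loader are stand-ins I made up so the snippet runs without downloading CIFAR10:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data (an assumption for illustration): 12 tiny fake images, labels in 0..9.
fake_images = torch.zeros(12, 3, 4, 4)
fake_labels = torch.tensor([0, 1, 1, 2, 2, 2, 3, 9, 9, 9, 9, 5])
loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=4)

def classes_freq(loader, num_classes=10):
    count = [0] * num_classes      # created once, not once per batch
    for _, labels in loader:
        for label in labels:       # only the labels in this batch
            count[label.item()] += 1
    return count

counts = classes_freq(loader)
print(counts)  # [1, 2, 3, 1, 0, 1, 0, 0, 0, 4]
```

The returned list can be passed straight to the plot function from the question, and the same loop works for any batch size, including 1.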