Creating a CSV to index data in torchvision.datasets.MNIST

I want to select images from a torchvision dataset (MNIST in this case) by first picking a class at random and then picking a random image from that class. To make this fast, I want to use a CSV file that stores the following for each image: (class_index, image_index, class_name).
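To make the intended layout concrete, here is a minimal sketch of such an index built with the standard library only. The `labels` list is a hypothetical stand-in for something like `datasets['train'].targets.tolist()`, and the helper names `build_rows`, `write_csv` and `sample_image_index` are made up for illustration.

```python
import csv
import io
import random
from collections import defaultdict

# Hypothetical stand-in for datasets['train'].targets.tolist()
labels = [5, 0, 4, 1, 9, 2, 1, 3, 1, 4]


def build_rows(labels):
    """Return one (class_index, image_index, class_name) row per image."""
    class_names = sorted(set(labels))
    class_index = {name: i for i, name in enumerate(class_names)}
    rows = [(class_index[label], image_index, str(label))
            for image_index, label in enumerate(labels)]
    rows.sort()  # order by class first, then by image position
    return rows


def write_csv(rows, handle):
    """Write a header line followed by one line per image."""
    writer = csv.writer(handle)
    writer.writerow(["class_index", "image_index", "class_name"])
    writer.writerows(rows)


def sample_image_index(rows, rng=random):
    """Pick a class uniformly at random, then an image from that class."""
    by_class = defaultdict(list)
    for class_index, image_index, _ in rows:
        by_class[class_index].append(image_index)
    chosen_class = rng.choice(sorted(by_class))
    return chosen_class, rng.choice(by_class[chosen_class])


rows = build_rows(labels)
buffer = io.StringIO()  # an in-memory file; a real path would work the same way
write_csv(rows, buffer)
chosen_class, image_index = sample_image_index(rows)
```

Grouping the image indices per class once means each random draw only needs two `choice` calls instead of a scan over all rows.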

The problem arises when I try to create this CSV. I am using the following code:

datasets = {x: MNIST(os.path.join(dataset_path, x), train=x=='train',
                transform=data_transforms[x], target_transform=None, download=True)
                for x in ['train', 'val']}
classes = datasets['train'].targets.unique()
if not os.path.isfile(dataset_name + ".csv"):
    df = pd.DataFrame()
    for klasse in classes:
        indices = datasets['train'].targets==klasse
        for idx in indices:
            df = df.append({'id': idx, 'name': klasse}, ignore_index = True)
            print("idx: ", idx, ", klasse: ", klasse, ", datasets['train'].data[idx]: ", datasets['train'].data[idx])
            # Is this idx linked to the image?? check this.
    df = df.sort_values(by = ['name', 'id']).reset_index(drop = True)
    df['class'] = pd.factorize(df['name'])[0]
    df.to_csv(dataset_name + ".csv", index = False)
    print("csv created")

After 3 hours this is still running, and it repeatedly prints output like the following:

idx: tensor(0, dtype=torch.uint8) , klasse: tensor(0) , datasets['train'].data[idx]: tensor([], size=(0, 60000, 28, 28), dtype=torch.uint8)
idx: tensor(1, dtype=torch.uint8) , klasse: tensor(0) , datasets['train'].data[idx]: tensor([[[[0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      ...,
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0]],

     [[0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      ...,
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0]],

     [[0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      ...,
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0]],

     ...,

     [[0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      ...,
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0]],

     [[0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      ...,
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0]],

     [[0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      ...,
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0]]]], dtype=torch.uint8)

idx: tensor(0, dtype=torch.uint8) , klasse: tensor(0) , datasets['train'].data[idx]: tensor([], size=(0, 60000, 28, 28), dtype=torch.uint8)

I don't understand this output. I was expecting idx to be an identifier for the image and klasse to be the class (the latter is probably correct). Why is datasets['train'].data[idx] sometimes empty?
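A plausible reading of the output, offered as a sketch rather than a verified diagnosis: `datasets['train'].targets==klasse` yields a mask with one 0/1 entry per image (60000 in total), so the inner loop only ever sees the values 0 and 1 instead of image positions. Indexing `data` with a single mask element then appears to be treated as masking, which would match the `size=(0, 60000, 28, 28)` result for 0 and the whole dataset for 1. The long runtime is consistent with 10 × 60000 calls to `df.append`, each of which copies the frame, plus printing a large tensor on every iteration. The lists below are hypothetical stand-ins for the tensors, used only to show the difference between a mask and the positions it selects.

```python
# Hypothetical labels standing in for datasets['train'].targets.tolist()
targets = [5, 0, 4, 1, 9, 2, 1, 3, 1, 4]
klasse = 1

# Analogue of `targets == klasse`: one 0/1 flag per image, not a list of positions.
mask = [int(t == klasse) for t in targets]

# What the inner loop in the post iterates over: only the values 0 and 1.
values_seen_by_loop = sorted(set(mask))

# What was probably intended: the positions at which the flag is set.
positions = [i for i, flag in enumerate(mask) if flag]

print(mask)
print(values_seen_by_loop)
print(positions)
```

With tensors, the positions can be obtained with something like `(datasets['train'].targets == klasse).nonzero().flatten()`, and collecting the rows in a plain list before building the DataFrame once avoids the repeated copies.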