KeyError in DataLoader

Traceback (most recent call last):
  File "EuroSAT_train.py", line 119, in <module>
    main()
  File "EuroSAT_train.py", line 76, in main
    for step, (x_spt, y_spt, x_qry, y_qry) in enumerate(db):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/voyager-volume/MAML-Pytorch/EuroSAT.py", line 156, in __getitem__
    for sublist in self.support_x_batch[index] for item in sublist]).astype(np.int32)
  File "/voyager-volume/MAML-Pytorch/EuroSAT.py", line 156, in <listcomp>
    for sublist in self.support_x_batch[index] for item in sublist]).astype(np.int32)
KeyError: '5594'

The code:

    euro = EuroSAT('/voyager-volume/MAML-Pytorch/EuroSAT/', mode='train', n_way=args.n_way, k_shot=args.k_spt,
                   k_query=args.k_qry,
                   batchsz=10000, resize=args.imgsz)
    def get_subset(indices, start, end):
        return indices[start:start+end]
    indices = torch.randperm(len(euro))
    train_indices = get_subset(indices, 0, len(euro))
    train_sampler = SubsetRandomSampler(train_indices)
    euro_test = EuroSAT('/voyager-volume/MAML-Pytorch/EuroSAT/', mode='test', n_way=args.n_way, k_shot=args.k_spt,
                             k_query=args.k_qry,
                             batchsz=100, resize=args.imgsz)

    for epoch in range(args.epoch//10000):
        # fetch meta_batchsz num of episode each time
        db = DataLoader(euro, args.task_num, sampler=train_sampler, shuffle=False, num_workers=1, pin_memory=True)

        for step, (x_spt, y_spt, x_qry, y_qry) in enumerate(db):


            x_spt, y_spt, x_qry, y_qry = x_spt.to(device), y_spt.to(device), x_qry.to(device), y_qry.to(device)

            accs = maml(x_spt, y_spt, x_qry, y_qry)

            if step % 30 == 0:
                print('step:', step, '\ttraining acc:', accs)

            if step % 500 == 0:  # evaluation
                db_test = DataLoader(euro_test, 1, shuffle=True, num_workers=1, pin_memory=True)
                accs_all_test = []

                for x_spt, y_spt, x_qry, y_qry in db_test:
                    x_spt, y_spt, x_qry, y_qry = x_spt.squeeze(0).to(device), y_spt.squeeze(0).to(device), \
                                                 x_qry.squeeze(0).to(device), y_qry.squeeze(0).to(device)

                    accs = maml.finetunning(x_spt, y_spt, x_qry, y_qry)
                    accs_all_test.append(accs)

                # [b, update_step+1]
                accs = np.array(accs_all_test).mean(axis=0).astype(np.float16)
                print('Test acc:', accs)

Can anyone please tell me why I am facing this issue? Obliged…

Based on the stack trace, the Dataset fails to load:

for sublist in self.support_x_batch[index] for item in sublist]).astype(np.int32)

for an index of 5594 (note the raised key is the string '5594').
Make sure you are defining the length of the Dataset properly and that 5594 is indeed a valid index.
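One quick way to check this is to iterate over every index that `len(dataset)` claims to exist and record which ones fail. A minimal, self-contained sketch of the failure mode, where `__len__` advertises more samples than the internal lookup table holds (all class and attribute names here are hypothetical, not the real EuroSAT implementation):

```python
class ToyDataset:
    """Hypothetical map-style dataset with a stale __len__."""

    def __init__(self, n_real, n_claimed):
        # lookup table keyed by stringified index, like a filename -> label map
        self.img2label = {str(i): i % 10 for i in range(n_real)}
        self.n_claimed = n_claimed

    def __len__(self):
        return self.n_claimed  # wrong whenever n_claimed > n_real

    def __getitem__(self, index):
        # raises KeyError (with a *string* key) for any stale index,
        # just like the traceback above
        return self.img2label[str(index)]

def find_bad_indices(ds):
    """Try every index len(ds) advertises; return the ones that raise."""
    bad = []
    for i in range(len(ds)):
        try:
            ds[i]
        except KeyError:
            bad.append(i)
    return bad

ds = ToyDataset(n_real=5594, n_claimed=5600)
print(find_bad_indices(ds))  # -> [5594, 5595, 5596, 5597, 5598, 5599]
```

Running the same kind of sweep over your real `EuroSAT` instance (before handing it to the `DataLoader`) should tell you whether `__len__` and the keys built in `support_x_batch` actually agree.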


While the code seems logically structured, there are several potential issues that could lead to a KeyError in the DataLoader or other unintended behavior:

  1. Division in range(args.epoch//10000): If args.epoch is less than 10000, integer division yields 0 and the outer loop never executes. Ensure args.epoch is at least 10000 (and ideally a multiple of it) to get the desired number of epochs.
  2. DataLoader Initialization Inside Loop: The DataLoader for the test set (db_test) is re-created inside the loop every 500 steps. This is inefficient; if the test dataset (euro_test) does not change, initialize db_test once, outside of the loop.
  3. Potential Memory Issue with pin_memory=True: Setting pin_memory=True enables faster host-to-GPU transfers on CUDA systems, but it also consumes pinned host memory. Ensure you have sufficient memory resources, especially when working with large datasets.
  4. Batch Size in Test DataLoader: The batch size for db_test is set to 1. This is common for evaluation, but if your maml.finetunning method expects a different batch size, this could cause issues.
  5. Squeezing Tensor Dimensions: The use of squeeze(0) assumes that the first dimension of the tensors is 1 (the single-batch case). Verify this assumption holds, since removing that dimension can cause shape mismatches in your model.
  6. Error Handling in the Model Forward Pass: The calls accs = maml(x_spt, y_spt, x_qry, y_qry) and accs = maml.finetunning(x_spt, y_spt, x_qry, y_qry) include no error handling, so any exception inside the forward pass or the finetuning step will terminate the script.
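To make point 1 concrete, here is a tiny stdlib-only check of the integer-division behavior (`epoch_arg` is a stand-in for the hypothetical value of args.epoch):

```python
# With args.epoch below 10000, integer division gives 0 and range(0) is
# empty, so the body of the outer training loop never runs at all.
epoch_arg = 6000
print(list(range(epoch_arg // 10000)))   # -> []  (no epochs executed)

epoch_arg = 30000
print(list(range(epoch_arg // 10000)))   # -> [0, 1, 2]  (three outer epochs)
```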

@Vidsha_Rupani when you are using ChatGPT or other tools to post replies, please specify that clearly at the beginning of your comment 🙂