KeyError when enumerating over dataloader

vishnurapps · August 19, 2020, 10:21am

You are a life saver @ptrblck

EricWiener · October 12, 2020, 5:15pm

This worked great. Thanks! In case it helps anyone else, I was getting the error "empty range for randrange() (0,-72, -72)" after doing this. It turned out that the dataset was in much worse shape than I thought. I was using a batch size of 16, but one of the classes only had 7 images in it. Additionally, I was cropping to 224x224, but some of the images in the dataset were 50x50. Both issues had to be fixed to solve the problem.

Vahid_Chahkandi · November 24, 2020, 8:53pm

I got the same error, but there is no index and the error happens exactly when the for loop is executed. Here is the traceback:

File "/media/deeplab/f6321bd3-2eb4-461a-9abc-d10e94252592/Vahid Chahkandi/dl4mt-nonauto-master/run.py", line 611, in <module>
    names=["test." + xx for xx in names], maxsteps=None)
  File "/media/deeplab/f6321bd3-2eb4-461a-9abc-d10e94252592/Vahid Chahkandi/dl4mt-nonauto-master/decode.py", line 173, in decode_model
    for iters, dev_batch in enumerate(dev):
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
    data = self._next_data()
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 989, in _next_data
    return self._process_data(data)
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1014, in _process_data
    data.reraise()
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 185, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 82, in default_collate
    raise RuntimeError('each element in list of batch should be of equal size')
RuntimeError: each element in list of batch should be of equal size


Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/utils/data/_utils/pin_memory.py", line 25, in _pin_memory_loop
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/usr/lib/python3.7/multiprocessing/queues.py", line 113, in get
    return _ForkingPickler.loads(res)
  File "/home/deeplab/dl4mt-nonauto-master/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.7/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.7/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.7/multiprocessing/connection.py", line 498, in Client
    answer_challenge(c, authkey)
  File "/usr/lib/python3.7/multiprocessing/connection.py", line 746, in answer_challenge
    response = connection.recv_bytes(256)        # reject large message
  File "/usr/lib/python3.7/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.7/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

start decoding:   0%|          | 0/200 [00:00<?, ?it/s]

Process finished with exit code 1

I would appreciate any advice.

ptrblck · November 26, 2020, 9:14am

Based on the error message:

RuntimeError: each element in list of batch should be of equal size

it seems your Dataset.__getitem__ method returns tensors of variable shape, which cannot be stacked to a single batch in the default collate function. You would have to either resize the tensors to the same shape (or pad etc.) or use a custom collate_fn to return a list of these tensors instead of a batched tensor.
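
Both options can be sketched as follows. This is a minimal example, assuming each sample is a (1-D tensor, integer label) pair; `ToyDataset`, `pad_collate`, and `list_collate` are made-up names for illustration and not part of the code posted above.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset whose samples have different lengths."""

    def __init__(self):
        self.samples = [torch.arange(n, dtype=torch.float32) for n in (3, 5, 2, 4)]
        self.labels = [0, 1, 0, 1]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx], self.labels[idx]


def pad_collate(batch):
    """Option 1: pad every sequence to the longest one in the batch."""
    seqs, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in seqs])
    padded = pad_sequence(list(seqs), batch_first=True, padding_value=0.0)
    return padded, lengths, torch.tensor(labels)


def list_collate(batch):
    """Option 2: skip stacking and return plain Python lists."""
    seqs, labels = zip(*batch)
    return list(seqs), list(labels)


padded_loader = DataLoader(ToyDataset(), batch_size=2, collate_fn=pad_collate)
list_loader = DataLoader(ToyDataset(), batch_size=2, collate_fn=list_collate)

padded, lengths, labels = next(iter(padded_loader))
print(padded.shape, lengths.tolist())  # torch.Size([2, 5]) [3, 5]

seqs, labs = next(iter(list_loader))
print([s.shape[0] for s in seqs])  # [3, 5]
```

Padding keeps the batch as one tensor, but the model then has to ignore the padded positions (the returned lengths help with that). Returning lists avoids padding, at the cost of handling each sample separately downstream.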

shakeel608 · April 1, 2021, 9:20am

@ptrblck
The length of my train set is 3757 and it works fine with the DataLoader,
but when I iterate over the train_set object directly:

for idx, (data, label) in enumerate(train_set):
    print(idx, data.shape, label)

It raises a KeyError at the last index (KeyError: 3757). I am confused here.

# train_set is a Dataset object here, not a DataLoader object

ptrblck · April 1, 2021, 4:30pm

Could you try to load this sample directly via:

data, label = train_set[3757]

and see where exactly it is failing?
Based on the description I guess the __len__ method returns a wrong length of the dataset or the internal pd.DataFrame or dict is failing, as this key is indeed not set.
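
One likely explanation, offered as an assumption since the Dataset code was not posted: a map-style Dataset that defines only __len__ and __getitem__ has no __iter__, so a bare for loop falls back to Python's legacy sequence protocol. That protocol calls __getitem__(0), __getitem__(1), and so on, and stops only when an IndexError is raised; __len__ is never consulted. If the lookup raises a KeyError instead (as a dict or pandas lookup would), the loop runs one index past the end and the KeyError surfaces. The DataLoader does not hit this because its sampler only draws indices below len(dataset). The sketch below uses plain classes (`DictBackedDataset` and `SafeDataset`, both hypothetical) standing in for such a dataset.

```python
class DictBackedDataset:
    """Hypothetical stand-in for a map-style dataset backed by a dict."""

    def __init__(self, n):
        self.data = {i: i * 10 for i in range(n)}

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # A dict lookup raises KeyError, not IndexError, for a missing key.
        return self.data[idx]


class SafeDataset(DictBackedDataset):
    def __getitem__(self, idx):
        # Raising IndexError tells a bare for loop where the data ends.
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        return self.data[idx]


def iterate(dataset):
    """Collect all items with a bare for loop and report any KeyError."""
    seen = []
    try:
        for item in dataset:
            seen.append(item)
    except KeyError as err:
        return seen, f"KeyError: {err}"
    return seen, None


print(iterate(DictBackedDataset(3)))  # ([0, 10, 20], 'KeyError: 3')
print(iterate(SafeDataset(3)))        # ([0, 10, 20], None)
```

Looping over range(len(train_set)) and indexing explicitly is another way to stay within bounds without changing the Dataset.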

borteley · May 4, 2021, 1:51pm

Hi ptrblck,

I’m new to PyTorch and I’ve run into the KeyError issue below. Please help me solve it.

I have created a custom dataset class to access the categorical columns (for embedding) and the numerical columns separately, but I get a KeyError when I try to iterate through the elements of the data. Please see my code below:

class LoanDataset(Dataset):

    def __init__(self, X, Y, emb_cols):
        X = X.copy()
        self.X1 = X.loc[:, emb_cols].copy().values.astype(np.int64)  # categorical columns
        self.X2 = X.drop(columns=emb_cols).copy().values.astype(np.float32)  # numerical columns
        self.y = Y

    def __len__(self):
        return len(self.y)

    def __getitem__(self, idx):
        return self.X1[idx], self.X2[idx], self.y[idx]

Next, I apply this to my x_train and x_test data like so:

train_ds = LoanDataset(x_train, y_train, emb_cols)

test_ds = LoanDataset(x_test, y_test, emb_cols)

batch_size = 64
traindl = DataLoader(train_ds, batch_size=batch_size,shuffle=True)
testdl = DataLoader(test_ds, batch_size=batch_size,shuffle=True)

I try to view the elements of train_ds with the code below and I get KeyError: 1

i = 1
for X1, X2, y in train_ds:
    print('batch_num:', i)
    i = i + 1
    print(X1, X2, y)

So I tried to view train_ds[1] and I get key error 1 as well. PLEASE HELP. Would appreciate your support, Thanks

ptrblck · May 4, 2021, 6:47pm

Could you check the len of each stored array (you might also want to transform the numpy arrays to PyTorch tensors) and compare it to the passed idx?
Based on your code snippet it seems you are trying to index numpy arrays created from the pd.DataFrames, so I would expect to see an index error instead of a key error (which would be raised by pandas).

borteley · May 4, 2021, 7:37pm

Hi ptrblck, the len of X1 = len(X2) = len(y) = 35046 which is also the same as the number of rows in my training samples ie. x_train and y_train. x_train and y_train are dataframes.

I just ran the code again to view the batches and it ran fine until batch_num 8, where another KeyError occurred.

ptrblck · May 4, 2021, 7:46pm

Could you post the complete stack trace with the error message, please?

borteley · May 4, 2021, 7:50pm

batch_num: 1
[1 2 2 1 2 2 2 2 2] [ 2.3251567e+05 1.1239830e+00 -1.6585710e+00 -3.4774102e-02
1.9383546e-02 -2.5244788e-04 -7.7740820e-03 -1.4346160e-03
-3.4194162e-01 -2.5698951e-01 -2.9680127e-01 -3.0806255e-01
-3.1413612e-01 -2.9338205e-01] 1
batch_num: 2
[0 2 1 1 1 1 1 0 1] [-1.1748432e+05 1.0158546e+01 -1.6365118e+00 -2.9956434e-02
2.0324662e-02 1.2480208e-03 -7.6914816e-03 -1.4271387e-03
-2.9436108e-01 -2.2278552e-01 -2.6821393e-01 -2.9018584e-01
-2.8849858e-01 -2.8082252e-01] 1
batch_num: 3
[0 1 2 1 2 2 2 2 2] [ 5.2515676e+04 -3.0254190e+00 -1.6585710e+00 -3.4774102e-02
1.9383546e-02 -2.5244788e-04 -7.7740820e-03 -1.4346160e-03
-3.4194162e-01 -2.5698951e-01 -2.9680127e-01 -3.0806255e-01
-3.1413612e-01 -2.9338205e-01] 0
batch_num: 4
[0 3 2 0 0 0 0 0 2] [-1.1748432e+05 9.7223682e+00 -8.5289121e-01 4.8334992e-01
1.6759698e-01 -1.1039849e-01 4.9512643e-02 -1.6527575e-01
-2.2119057e-01 -1.7018579e-01 -1.8320794e-01 -3.0806255e-01
-3.1413612e-01 -2.9338205e-01] 0
batch_num: 5
[1 2 2 0 0 0 0 0 0] [-1.1748432e+05 -1.0277632e+01 -3.8125306e-01 3.9748722e-01
3.1777948e-02 1.8403894e-01 -3.8792774e-02 5.6891102e-02
-1.8496527e-01 -1.7018579e-01 -2.5533971e-01 -2.5699621e-01
-2.5522807e-01 -2.5119311e-01] 0
batch_num: 6
[0 1 2 0 0 0 0 2 1] [ 2.4251567e+05 -5.4867134e+00 -7.5418085e-01 -2.7809170e-01
-2.1490552e-01 4.1536082e-02 1.1691498e-01 -9.6467867e-02
3.1654753e-02 -2.0604056e-01 2.5987735e-01 -3.0806255e-01
9.2520958e-01 -2.9338205e-01] 0
batch_num: 7
[0 2 1 2 2 2 2 2 0] [-1.7484322e+04 1.6565609e+01 -1.2126069e+00 -9.2393309e-02
1.2998304e-02 -1.6868382e-03 -5.3660311e-03 -7.8853276e-03
-3.0428034e-01 -2.2991614e-01 -2.2594361e-01 -3.0806255e-01
-2.7739024e-01 -2.2320399e-01] 0
batch_num: 8
[1 1 2 0 0 0 2 1 1] [ 1.3251567e+05 -6.8479071e+00 -1.0590357e+00 6.7428279e-01
-5.2545774e-01 -2.0227760e-01 2.7863738e-01 1.3272280e-01
-2.5872529e-01 -2.3145868e-01 -2.9520598e-01 -2.4666722e-01
-2.0267007e-01 1.3281687e+00] 0

KeyError Traceback (most recent call last)
~/.virtualenv/lib64/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2890 try:
→ 2891 return self._engine.get_loc(casted_key)
2892 except KeyError as err:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 8

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
in
1 i = 1
2
----> 3 for X1, X2,Y in train_ds:
4 print('batch_num:', i)
5 i = i+1

in __getitem__(self, idx)
10
11 def __getitem__(self, idx):
---> 12 return self.X1[idx], self.X2[idx], self.Y[idx]

~/.virtualenv/lib64/python3.6/site-packages/pandas/core/series.py in __getitem__(self, key)
880
881 elif key_is_scalar:
→ 882 return self._get_value(key)
883
884 if (

~/.virtualenv/lib64/python3.6/site-packages/pandas/core/series.py in _get_value(self, label, takeable)
989
990 # Similar to Index.get_value, but we do not fall back to positional
→ 991 loc = self.index.get_loc(label)
992 return self.index._get_values_for_loc(self, loc, label)
993

~/.virtualenv/lib64/python3.6/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2891 return self._engine.get_loc(casted_key)
2892 except KeyError as err:
→ 2893 raise KeyError(key) from err
2894
2895 if tolerance is not None:

KeyError: 8

borteley · May 4, 2021, 8:40pm

ptrblck, I found out that y is a pandas.Series whereas X1 and X2 are NumPy ndarrays. The for loop runs without errors after I converted y to an ndarray. Thanks for the help!
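
The mechanism behind this can be reproduced in a few lines. This is a minimal sketch, assuming y is a pandas Series whose index no longer runs from 0 to n-1 (for example after a shuffled train/test split or after filtering rows); the values are made up for illustration.

```python
import pandas as pd

# Labels whose index has gaps, as happens after a split or row filtering.
y = pd.Series([1, 0, 1, 0], index=[0, 2, 5, 9])

# With an integer index, y[idx] is a label lookup, not a positional one,
# so position 1 fails because no row carries the label 1.
try:
    y[1]
    error = None
except KeyError as err:
    error = err

print(type(error).__name__)  # KeyError

# Positional alternatives that behave like NumPy indexing:
y_array = y.to_numpy()
print(y_array[1])  # 0
print(y.iloc[1])   # 0
```

Converting once in __init__ (for example with to_numpy()) keeps __getitem__ purely positional, which matches how the DataLoader's sampler generates indices.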

hiramustafa77 · December 10, 2021, 10:50am

Hi, I got the index number, but I still could not figure out the issue with that image because the index number keeps changing: the first time it was 195, later 101. This is the full error I am getting:
195 #this is the printed index number
Traceback (most recent call last):
File "train.py", line 57, in
for i, (images, density, att) in enumerate(train_loader):
File "/home/hiramustafa/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
data = self._next_data()
File "/home/hiramustafa/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/home/hiramustafa/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/hiramustafa/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/hiramustafa/Downloads/Thesis_new/dataset.py", line 32, in __getitem__
label = h5py.File(self.label_list[index], 'r')
IndexError: list index out of range

ptrblck · December 10, 2021, 7:54pm

Are you using shuffle=True? If so, disable it and check if the index error is raised with the same value. Alternatively, iterate the Dataset instead of the DataLoader to narrow down which sample fails to load.
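
A minimal sketch of the second suggestion: walk the Dataset by index and record which samples raise. `find_failing_indices` and `FlakyDataset` are hypothetical names used only for illustration; any object with __len__ and __getitem__ could be passed in place of the toy class.

```python
def find_failing_indices(dataset):
    """Return (index, exception) pairs for every sample that fails to load."""
    failures = []
    for idx in range(len(dataset)):
        try:
            dataset[idx]
        except Exception as err:  # broad on purpose: we only want to locate bad samples
            failures.append((idx, err))
    return failures


class FlakyDataset:
    """Toy dataset with more images than labels, so later indices fail."""

    def __init__(self):
        self.image_list = ["a.png", "b.png", "c.png", "d.png"]
        self.label_list = ["a.h5", "b.h5"]

    def __len__(self):
        return len(self.image_list)

    def __getitem__(self, index):
        return self.image_list[index], self.label_list[index]


for idx, err in find_failing_indices(FlakyDataset()):
    print(idx, type(err).__name__, err)
# 2 IndexError list index out of range
# 3 IndexError list index out of range
```

Because the indices are visited in order, the failing values are reproducible from run to run, unlike with a shuffled DataLoader.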

doniervask · May 2, 2022, 6:09am

This annoying error means that pandas cannot find the requested key, such as your column name, in your DataFrame. Before doing anything with the DataFrame, use print(df.columns) to check whether the column exists.

print(df.columns)

I was getting a similar error in one of my scripts. It turned out that the particular index was missing from my DataFrame because I had dropped two empty rows. If this is the case, you can call df.reset_index(inplace=True) and the error should be resolved (pass drop=True as well if you do not want the old index kept as a column).
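
A minimal sketch of that situation with made-up data: dropping a row leaves a gap in the index, and reset_index renumbers it.

```python
import pandas as pd

df = pd.DataFrame({"feature": [0.1, None, 0.3, 0.4], "label": [1, 0, 1, 0]})

# Dropping a row keeps the original labels, so the index now has a gap.
df = df.dropna()
print(list(df.index))  # [0, 2, 3]

# drop=True discards the old index instead of adding it as an "index" column.
clean = df.reset_index(drop=True)
print(list(clean.index))      # [0, 1, 2]
print(clean.loc[1, "label"])  # 1
```

Before the reset, df.loc[1] would raise a KeyError because no row carries the label 1.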

Maria_Shaukat · December 7, 2022, 1:04am

I was having the same error. It turned out that the issue was with the label (i.e. y) array.
I was using the labels to build a custom dataset, which was then passed to a DataLoader. This labels array actually came from a pandas DataFrame and still carried the original pandas index.

Simply calling labels.to_numpy() and then using the result in the custom dataset resolved the issue.

ammar_siddiqui · January 10, 2023, 8:49am

Hi, I am getting the KeyError on different indexes of the dataset.
KeyError Traceback (most recent call last)
File c:\Users\SAMSUNG.envs\WAS\lib\site-packages\pandas\core\indexes\base.py:3803, in Index.get_loc(self, key, method, tolerance)
3802 try:
-> 3803 return self._engine.get_loc(casted_key)
3804 except KeyError as err:
File c:\Users\SAMSUNG.envs\WAS\lib\site-packages\pandas\_libs\index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()
File c:\Users\SAMSUNG.envs\WAS\lib\site-packages\pandas\_libs\index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()
File pandas\_libs\hashtable_class_helper.pxi:5745, in pandas._libs.hashtable.PyObjectHashTable.get_item()
File pandas\_libs\hashtable_class_helper.pxi:5753, in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 690

Jaykumaran · February 20, 2025, 2:04pm

There is a high chance that the annotations for that idx are empty. When the DataLoader tries to fetch labels in parallel for the image indices and they are empty, it will throw this error.

ValueError: Caught ValueError in DataLoader worker process 0.
and

KeyError()

To prevent this, ensure there are no empty label files by performing a sanity check. Always run next(iter(valid_loader)) to verify that the DataLoader preparation is correct. Spending some time inspecting the data upfront can save a lot of debugging time later.
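
A minimal sketch of such a sanity check, assuming one text annotation file per image stored in a single directory. The layout, the `.txt` suffix, and the helper name `find_empty_label_files` are assumptions made for illustration, not something stated above.

```python
from pathlib import Path
import tempfile


def find_empty_label_files(label_dir, suffix=".txt"):
    """Return label files that are zero bytes or contain only whitespace."""
    empty = []
    for path in sorted(Path(label_dir).glob(f"*{suffix}")):
        if path.stat().st_size == 0 or not path.read_text().strip():
            empty.append(path)
    return empty


# Demo on a throwaway directory with one good and one empty label file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "img_001.txt").write_text("0 0.5 0.5 0.2 0.2\n")
    (Path(tmp) / "img_002.txt").write_text("")
    empty_names = [p.name for p in find_empty_label_files(tmp)]

print(empty_names)  # ['img_002.txt']
```

Running a check like this before building the DataLoader points at the offending files directly, rather than at whichever index a worker happened to fetch first.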