I am training on a custom COCO-format dataset, but I hit index out-of-bounds errors during training, as shown below:
Traceback (most recent call last):
File ".../engine/train_loop.py", line 134, in train
self.run_step()
File ".../engine/defaults.py", line 429, in run_step
self._trainer.run_step()
File ".../engine/train_loop.py", line 222, in run_step
data = next(self._data_loader_iter)
File ".../data/common.py", line 179, in __iter__
for d in self.dataset:
File ".../python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
data = self._next_data()
File ".../python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
return self._process_data(data)
File ".../python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
data.reraise()
File ".../python3.9/site-packages/torch/_utils.py", line 543, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File ".../python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File ".../python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File ".../python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File ".../data/common.py", line 43, in __getitem__
data = self._map_func(self._dataset[cur_idx])
File ".../data/common.py", line 107, in __getitem__
start_addr = 0 if idx == 0 else self._addr[idx - 1].item()
IndexError: index 102724527 is out of bounds for axis 0 with size 15000
IndexError: index 161312927 is out of bounds for axis 0 with size 15000
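
For context, the __getitem__ that fails looks like the serialized-dataset pattern in data/common.py: the dataset dicts are pickled into one flat byte buffer, and _addr holds the cumulative end offset of each record, so a valid idx must be smaller than len(_addr) (15000 here). The failing values are in the hundreds of millions, which look more like byte offsets into the buffer than dataset indices. Below is a minimal sketch of my understanding of that pattern (the class name is mine, not the actual one):

import pickle

import numpy as np


class SerializedList:
    """Sketch of the serialized-list pattern: records are pickled into
    one contiguous uint8 buffer; _addr[i] is the end offset of record i."""

    def __init__(self, lst):
        serialized = [pickle.dumps(d, protocol=-1) for d in lst]
        self._addr = np.cumsum(
            np.asarray([len(s) for s in serialized], dtype=np.int64)
        )
        self._lst = np.frombuffer(b"".join(serialized), dtype=np.uint8)

    def __len__(self):
        return len(self._addr)

    def __getitem__(self, idx):
        # idx is a dataset index, so it must satisfy idx < len(self._addr);
        # the values in my traceback (~1e8) would only make sense as byte
        # offsets into self._lst, not as indices into self._addr.
        start_addr = 0 if idx == 0 else self._addr[idx - 1].item()
        end_addr = self._addr[idx].item()
        return pickle.loads(self._lst[start_addr:end_addr].tobytes())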
I have traced the issue to the data index lookup in data/common.py, but I'm not sure why it occurs.
Interestingly, when I set num_workers to 1 the program runs without error, but when I set num_workers greater than 1 the error is raised during training.
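
To see which indices the workers actually request, I wrapped the dataset before it goes into the DataLoader. This is just a hypothetical debugging shim (IndexCheckingDataset is my own name, not detectron2 code):

from torch.utils.data import get_worker_info


class IndexCheckingDataset:
    """Debugging wrapper: report any out-of-range index a DataLoader
    worker requests before the real __getitem__ raises IndexError."""

    def __init__(self, dataset):
        self._dataset = dataset

    def __len__(self):
        return len(self._dataset)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self._dataset):
            info = get_worker_info()
            worker = info.id if info is not None else "main"
            print(f"worker {worker}: requested index {idx}, "
                  f"but len(dataset) is {len(self._dataset)}")
        return self._dataset[idx]

The wrapper only observes and does not change behavior, so the crash should still reproduce, but it should at least show which worker produces the bad index. Any pointers on why this only happens with num_workers > 1 would be appreciated.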