I’m training with a DataLoader and it randomly crashes with this error after three epochs:
Traceback (most recent call last):
  File "train.py", line 46, in <module>
    for batch_idx, (song, label) in enumerate(train_loader):
  File "/home/sauhaarda/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 280, in __next__
    idx, batch = self._get_batch()
  File "/home/sauhaarda/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 259, in _get_batch
    return self.data_queue.get()
  File "/usr/lib/python3.5/multiprocessing/queues.py", line 343, in get
    res = self._reader.recv_bytes()
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
  File "/home/sauhaarda/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 178, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 9777) exited unexpectedly with exit code 1.
My code is available here: