Dataloader error

Exception ignored in: <function _DataLoaderIter.__del__ at 0x7fb88e289620>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
    self._shutdown_workers()
  File "/usr/local/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
    self.worker_result_queue.get()
  File "/usr/local/lib/python3.7/multiprocessing/queues.py", line 354, in get
    return _ForkingPickler.loads(res)
  File "/usr/local/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/local/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/usr/local/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/usr/local/lib/python3.7/multiprocessing/reduction.py", line 153, in recvfds
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size))
ConnectionResetError: [Errno 104] Connection reset by peer

I got this error, which was ignored, but what does it mean?

Seems like you are running your data loading with multiple workers. Can you try to run the script again with num_workers=0?

Well, yes, that’s the case. I was wondering what the error means, since it was ignored (so training kept running). It happened only once in thousands of iterations. Besides, it says “connection reset by peer”.

I’m interested in the interpretation of the error.

Is there anything above the traceback? The message you posted only states that an error occurred in one of the workers; the actual error message should appear above it (or won’t show up until you run with num_workers=0).
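
If running with num_workers=0 is too slow, a related check is to index the dataset directly in the main process, so that any exception it raises shows up with its full traceback. This is only a rough sketch: FlakyDataset and find_failing_indices are hypothetical names made up for illustration, and the failure at index 3 is simulated. It assumes a map-style dataset, i.e. an object with __len__ and __getitem__.

```python
import traceback


class FlakyDataset:
    """Hypothetical stand-in for a map-style dataset (__len__ and __getitem__)."""

    def __len__(self):
        return 5

    def __getitem__(self, index):
        if index == 3:
            # Simulated failure; a real dataset might hit a corrupt file here.
            raise ValueError(f"cannot load sample {index}")
        return index * 2


def find_failing_indices(dataset):
    """Index every sample in the main process and collect the exceptions raised."""
    failures = {}
    for index in range(len(dataset)):
        try:
            dataset[index]
        except Exception as exc:
            # Keep the exception and print its traceback to stderr.
            failures[index] = exc
            traceback.print_exc()
    return failures


failures = find_failing_indices(FlakyDataset())
print(sorted(failures))
```

Note that this only catches errors raised by the dataset code itself. If nothing fails here, the problem is more likely in the communication between the worker processes and the main process than in any particular sample.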