[transfer learning tutorial] ConnectionResetError: [Errno 104] Connection reset by peer

When I run the Python code from the transfer learning tutorial, I get the following error after training and validation have finished successfully:

Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f5e12c249b0>>
Traceback (most recent call last):
  File "/scratch/sjn-p3/anaconda/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
  File "/scratch/sjn-p3/anaconda/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
  File "/scratch/sjn-p3/anaconda/anaconda3/lib/python3.6/multiprocessing/queues.py", line 337, in get
    return _ForkingPickler.loads(res)
  File "/scratch/sjn-p3/anaconda/anaconda3/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/scratch/sjn-p3/anaconda/anaconda3/lib/python3.6/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/scratch/sjn-p3/anaconda/anaconda3/lib/python3.6/multiprocessing/reduction.py", line 182, in recv_handle
    return recvfds(s, 1)[0]
  File "/scratch/sjn-p3/anaconda/anaconda3/lib/python3.6/multiprocessing/reduction.py", line 153, in recvfds
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(bytes_size))
ConnectionResetError: [Errno 104] Connection reset by peer

Your issue might be related to this thread.
Which PyTorch version are you using? As the PR was recently merged, you might want to install the preview or build from source.
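
In case it helps, here is a minimal sketch for reporting the versions in one go. The helper name `report_version` is just something I made up for illustration, not part of PyTorch; it only assumes the package exposes a `__version__` attribute, which `torch` does.

```python
import importlib
import sys


def report_version(module_name="torch"):
    """Return the module's __version__ string, or None if it is not installed.

    `report_version` is a hypothetical helper for this post, not a PyTorch API.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Fall back to "unknown" for modules that do not define __version__.
    return getattr(module, "__version__", "unknown")


print("Python:", sys.version.split()[0])
print("torch:", report_version("torch"))
```

If the second line prints `None`, the interpreter you are running is not the one PyTorch was installed into, which would be worth sorting out before anything else.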

Hello, I installed PyTorch using the command given on the official PyTorch website.

I have

>>> torch.__version__

and I have:

Python 3.6.4 |Anaconda custom (64-bit)| (default, Jan 16 2018, 18:10:19)