Docker, AssertionError: can only join a child process

I am running open-source code inside a Docker container. Strangely, the error below only appears inside the container; others who run the same code directly on a PC do not encounter it.

Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7fae49c835f0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/", line 1101, in __del__
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/", line 1075, in _shutdown_workers
  File "/usr/lib/python3.7/multiprocessing/", line 138, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process

The parameters used to create the container:

docker run --shm-size 64G --gpus all -d -v /xxx:/xxx --network host --name xxx_name --cpus 40 xxx_image

It seems that an underlying error is raised, the DataLoaderIter dies, and this multiprocessing error is shown instead.
Could you run the code without worker processes (num_workers=0, so loading happens in the main process) and check whether a different error is raised?
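
A minimal sketch of that check, assuming a map-style dataset (one that implements `__len__` and `__getitem__`, as PyTorch datasets usually do). The helper name `find_loading_error` is hypothetical and not part of PyTorch; it reads every sample in the main process so that a hidden exception surfaces with its own traceback. It does not exercise batching or collation, so iterating a DataLoader built with num_workers=0 remains the more complete test.

```python
def find_loading_error(dataset):
    """Read every sample in the current process.

    Returns (index, exception) for the first sample that fails to load,
    or None if all samples load without raising.
    """
    for index in range(len(dataset)):
        try:
            dataset[index]  # same call a DataLoader worker would make
        except Exception as exc:  # report the real error instead of hiding it
            return index, exc
    return None


if __name__ == "__main__":
    # Stand-in dataset for illustration only; replace with the real one.
    samples = ["a", "b", "c"]
    print(find_loading_error(samples))
```

If this returns None and the num_workers=0 run is also clean, the failure is more likely tied to the worker processes themselves (for example, resources available inside the container) than to the dataset code.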

With a single worker, there is no error.

Same here: the code worked with num_workers=0, 1, 2, but I saw a lot of these errors when num_workers>=3.
For me, increasing the shared memory size (--shm-size 256M → 1G) solved the problem; it now works fine with num_workers=12.
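
To see how much shared memory the container actually has, something like the following can be run inside it. This is a rough sketch, assuming shared memory is mounted at /dev/shm (the usual location in Linux containers); the helper name `shm_usage_mib` is made up for illustration.

```python
import os
import shutil


def shm_usage_mib(path="/dev/shm"):
    """Return (total, free) space of the filesystem at `path`, in MiB."""
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return usage.total / mib, usage.free / mib


if __name__ == "__main__":
    # /dev/shm is an assumption; skip quietly on systems without it.
    if os.path.isdir("/dev/shm"):
        total, free = shm_usage_mib()
        print(f"/dev/shm: {total:.0f} MiB total, {free:.0f} MiB free")
```

If the reported total is small compared with the batches the workers hand back to the main process, raising --shm-size in the docker run command is worth trying before adding more workers.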