Error while Multiprocessing in Dataloader

Not sure if this has been reported already, but I am getting the following AssertionError in DataLoader:

Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7fae94071d30>>
Traceback (most recent call last):
  File "/home/amit/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 677, in __del__
    self._shutdown_workers()
  File "/home/amit/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 659, in _shutdown_workers
    w.join()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 122, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process

Have you found a solution?

I am using num_workers with an IterableDataset and I run into the same problem.

Yes, same here. Setting num_workers=0 makes the error go away for me:

testset = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=0)

But I am worried that this is only usable for my local testing.
So I would like to know what the root cause is and how to actually solve it.

^^

Did you solve it? Or is it a problem with num_workers?

Well, I am getting the same error. It says "can only join a child process", and I do not know what that means.

I was having this issue. It turns out it was because there was an error in the dataset object (for me it was in the __getitem__ function). I guess the DataLoader in multiprocessing mode doesn't know how to cleanly surface the internal error message. If you have the same problem, try running with num_workers = 0, so that the data is loaded in the main process instead of in worker processes, and it should tell you what the error is. Once you've fixed the error, it should work with num_workers > 0.
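A related way to find the underlying error is to skip the DataLoader entirely and index the dataset yourself in the main process. Below is a minimal sketch of that idea. The names find_bad_index and BrokenDataset are made up for illustration (they are not part of PyTorch), and it assumes a map-style dataset that implements __len__ and __getitem__. For an IterableDataset you would loop over iter(dataset) instead.

```python
def find_bad_index(dataset, limit=None):
    """Index `dataset` directly in the main process.

    Returns (index, exception) for the first sample that raises,
    or None if every checked sample loads without an error.
    `find_bad_index` is a hypothetical helper, not a PyTorch API.
    """
    n = len(dataset) if limit is None else min(limit, len(dataset))
    for i in range(n):
        try:
            dataset[i]  # calls dataset.__getitem__(i), no worker processes involved
        except Exception as exc:
            return i, exc
    return None


class BrokenDataset:
    """Hypothetical map-style dataset whose __getitem__ fails on one sample."""

    def __init__(self, values):
        self.values = values

    def __len__(self):
        return len(self.values)

    def __getitem__(self, idx):
        # Raises ZeroDivisionError when the stored value is 0.
        return 1.0 / self.values[idx]


result = find_bad_index(BrokenDataset([1, 2, 0, 4]))
if result is not None:
    index, error = result
    print(f"sample {index} failed with {type(error).__name__}: {error}")
```

Because nothing here runs in a worker process, the exception you get back is the original one from your dataset code, not the "can only join a child process" assertion raised during cleanup.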