Can't iter(DataLoader object): BrokenPipeError

Sorry for my terrible English…

I ran the code from the 60-minute tutorial, like this:

trainset = torchvision.datasets.CIFAR10(root='./datasets', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
dataiter = iter(trainloader)

It fails with this error:

  File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/flow_/Documents/cifarten/cf10.py", line 36, in <module>
    dataiter = iter(trainloader)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 451, in __iter__
    return _DataLoaderIter(self)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 239, in __init__
    w.start()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

This is a multiprocessing issue on Windows: each subprocess imports (i.e. executes) the main module when it starts, so a script that creates its workers at the top level ends up creating subprocesses recursively.
Try protecting your code with an if __name__ == '__main__': guard.

You can also check whether this is the cause by setting num_workers=0 and running the script again.
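Here is a minimal sketch of the guard. It uses only the standard library's multiprocessing module (the same module that shows up in the traceback) instead of the tutorial code. The names worker_message, worker, main and NUM_WORKERS are made up for illustration. In your script, the DataLoader creation and the iter(trainloader) call would go where main() starts its processes.

```python
import multiprocessing as mp
import os

# Hypothetical constant for this sketch; it plays the role of num_workers=2.
NUM_WORKERS = 2


def worker_message(index):
    # Pure helper so the message can be checked without starting a process.
    return f"worker {index} started"


def worker(index):
    # Runs inside a child process. It is defined at module level so that a
    # child which re-imports this file can find it by name.
    print(worker_message(index), "in process", os.getpid())


def main():
    # Everything that starts worker processes lives here, not at the top
    # level of the script.
    workers = [mp.Process(target=worker, args=(i,)) for i in range(NUM_WORKERS)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


if __name__ == '__main__':
    # True only in the process you launched yourself. A child that re-imports
    # this file sees a different __name__ and skips this call.
    main()
```

Without the guard, each Windows child would reach the process-starting code again while importing the file, which is the recursion described above.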


I got a similar error on Ubuntu: module 'multiprocessing.util' has no attribute '_flush_std_streams'.
How can I fix it?

That function is defined in Python's multiprocessing library, not in PyTorch.
Could you check whether you can update or re-install the multiprocessing library?

Thank you. I fixed it.

That works. Thanks a lot.

Thank you for your explanation! The problem seems to disappear if I run the same code (without if __name__ == '__main__') in a Jupyter notebook on Windows.