janders111
(Jordan Andersen)
August 20, 2019, 8:06pm
1
It hangs right at the beginning. It seems to have something to do with multiprocessing/queues.py. I have already read some other posts and tried:
if __name__ == "__main__"
running in Administrator mode
reinstalling pytorch
Python 3.7 on Windows 10, latest stable PyTorch build (1.2)
from torch.utils.data.dataset import Dataset
from torch.utils.data import DataLoader


class DriveData(Dataset):
    def __init__(self):
        self.data = [1, 2, 3, 4, 5, 6]

    # Override to give PyTorch access to any image on the dataset
    def __getitem__(self, index):
        return self.data[index]

    # Override to give PyTorch size of dataset
    def __len__(self):
        return len(self.data)


def main():
    dset_train = DriveData()
    train_loader = DataLoader(dset_train, batch_size=2, shuffle=True, num_workers=1)
    for i, data in enumerate(train_loader):
        print(i)
        print(data)


if __name__ == "__main__":
    main()
Output when num_workers is 0:
0
tensor([2, 6])
1
tensor([1, 4])
2
tensor([5, 3])
No output when num_workers is >0. Just hangs.
I tried the 1.2.0 + CUDA 10.0 + Python 3.6 package and can’t reproduce this issue.
ilyes
(ilyes)
August 27, 2019, 10:33am
3
Did you copy and paste his code exactly? I tried it myself and had the same issue!
ilyes
(ilyes)
August 27, 2019, 10:34am
4
Have you fixed the problem?
peterjc123
(Pu Jiachen)
August 27, 2019, 12:43pm
5
Yes, I didn’t change anything.
peterjc123
(Pu Jiachen)
August 27, 2019, 12:57pm
6
FYI, I’m using Python 3.6 and CUDA 10.0.
ilyes
(ilyes)
August 27, 2019, 1:06pm
7
Yeah! Python 3.7 is probably causing the problem.
ilyes
(ilyes)
August 27, 2019, 4:20pm
8
I used Python 3.6.9, CUDA 10.0, and PyTorch 1.2.0, and it doesn't work!
Would you please file a bug report at https://github.com/pytorch/pytorch/issues? BTW, what is the traceback if you press Ctrl+C?
ilyes
(ilyes)
August 28, 2019, 10:28am
10
I reported the issue.
By traceback, do you mean the error text? I didn't quite follow. I am using Jupyter Notebook, BTW.
peterjc123
(Pu Jiachen)
August 28, 2019, 10:44am
11
Yes, I mean the error text you get if you kill that process in the background. BTW, is it reproducible if you run it from the command prompt?
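If Ctrl+C does not produce a usable traceback, one option is the standard-library faulthandler module, which can dump the stack of every thread once a timeout expires. The following is a minimal sketch for a script run from the command prompt, not something anyone in this thread used; the helper name run_with_hang_dump and the 30-second default are hypothetical.

```python
import faulthandler
import sys


def run_with_hang_dump(fn, timeout_s=30.0, file=None):
    """Call fn(); if it is still running after timeout_s seconds,
    write the traceback of every thread to `file` (stderr by default).

    Hypothetical helper for illustration. `file` must be backed by a real
    file descriptor, so this is meant for a plain `python script.py` run.
    """
    if file is None:
        file = sys.stderr
    # Arm a watchdog thread; exit=False keeps the process alive after the dump.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=file)
    try:
        return fn()
    finally:
        # Disarm the watchdog once fn() returns or raises.
        faulthandler.cancel_dump_traceback_later()
```

With the script from the first post, this could be called as run_with_hang_dump(main) inside the if __name__ == "__main__": block, so that a hang prints where the main process is stuck.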
ilyes
(ilyes)
August 28, 2019, 12:11pm
12
Same error when run from the command prompt. Here’s the error message:
BrokenPipeError Traceback (most recent call last)
<ipython-input-10-344640e27da1> in <module>
----> 1 final_model, hist = train_model(model, dataloaders_dict, criterion, optimizer)
<ipython-input-9-fdf91f815fa7> in train_model(model, dataloaders, criterion, optimizer, num_epochs)
23 # Iterate over data.
24 end = time.time()
---> 25 for i, (inputs, labels) in enumerate(dataloaders[phase]):
26 inputs = inputs.to(device, non_blocking=True)
27 labels = labels.to(device , non_blocking=True)
~\Anaconda3\envs\py_gpu\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
276 return _SingleProcessDataLoaderIter(self)
277 else:
--> 278 return _MultiProcessingDataLoaderIter(self)
279
280 @property
~\Anaconda3\envs\py_gpu\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
680 # before it starts, and __del__ tries to join but will get:
681 # AssertionError: can only join a started process.
--> 682 w.start()
683 self.index_queues.append(index_queue)
684 self.workers.append(w)
~\Anaconda3\envs\py_gpu\lib\multiprocessing\process.py in start(self)
110 'daemonic processes are not allowed to have children'
111 _cleanup()
--> 112 self._popen = self._Popen(self)
113 self._sentinel = self._popen.sentinel
114 # Avoid a refcycle if the target function holds an indirect
~\Anaconda3\envs\py_gpu\lib\multiprocessing\context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):
~\Anaconda3\envs\py_gpu\lib\multiprocessing\context.py in _Popen(process_obj)
320 def _Popen(process_obj):
321 from .popen_spawn_win32 import Popen
--> 322 return Popen(process_obj)
323
324 class SpawnContext(BaseContext):
~\Anaconda3\envs\py_gpu\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
87 try:
88 reduction.dump(prep_data, to_child)
---> 89 reduction.dump(process_obj, to_child)
90 finally:
91 set_spawning_popen(None)
~\Anaconda3\envs\py_gpu\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #
BrokenPipeError: [Errno 32] Broken pipe
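The traceback above ends in popen_spawn_win32.py, where the parent process pickles the worker process object and writes it to the child through a pipe. A broken pipe at that step generally suggests the child exited before it finished reading, which the traceback alone cannot explain. One cheap thing to rule out is that the dataset (or anything else handed to the worker) cannot be pickled. The sketch below is a generic check under that assumption; check_picklable is a hypothetical helper, not part of PyTorch.

```python
import pickle


def check_picklable(obj):
    """Return (True, None) if obj survives a pickle round trip,
    otherwise (False, the exception that was raised).

    Hypothetical helper: it only tests plain pickle, which approximates
    what multiprocessing does under the spawn start method.
    """
    try:
        pickle.loads(pickle.dumps(obj))
        return True, None
    except Exception as exc:  # pickling can fail with several exception types
        return False, exc
```

For example, calling check_picklable(dataset) on the dataset object before building the DataLoader shows whether it can be sent to a spawned worker at all. A passing check does not prove that the workers will start, only that pickling the dataset is not the obstacle.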
ilyes
(ilyes)
August 28, 2019, 1:23pm
13
This issue is weird! My code runs smoothly on Colab, so I created an environment locally with EXACTLY the same versions: Python 3.6.8, PyTorch 1.1.0, torchvision 0.3.0, and cudatoolkit 10.0.130. I still get the same bug!
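One difference that matching package versions does not remove is the operating system. Colab runs on Linux, where Python 3.6 starts worker processes with fork by default, while Windows only supports spawn, which is the code path shown in the traceback. The sketch below reports what the current interpreter would use; start_method_report is a hypothetical name, and the snippet only inspects the environment without changing it.

```python
import multiprocessing as mp
import sys


def start_method_report():
    """Describe how this interpreter starts child processes.

    get_all_start_methods() lists the supported methods with the
    platform default first, and does not fix the start method.
    """
    methods = mp.get_all_start_methods()
    return {
        "platform": sys.platform,
        "default": methods[0],
        "available": methods,
    }
```

Running print(start_method_report()) in both environments makes the difference visible. If the defaults differ, the Colab run is not exercising the same worker start-up code as the local Windows run.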
peterjc123
(Pu Jiachen)
August 29, 2019, 1:42am
14
What about running it with plain python instead of ipython?