BrokenPipeError when training a CNN

I am trying to train a CNN on the digitalRecognizer dataset. I am getting a BrokenPipeError while running the training loop:

    for epoch in range(5):
        for i, data in enumerate(train_loader):
            inputs, labels = data
            inputs = Variable(inputs).cuda()
            labels = Variable(labels).cuda()

            optimizer.zero_grad()
            outputs = model_pytorch(inputs)
            loss = loss_function(outputs, labels)
            loss.backward()
            optimizer.step()

            if (i + 1) % 5 == 0:
                # Print the current epoch (the original printed num_epochs here);
                # loss.item() replaces loss.data[0], which fails on 0-dim tensors.
                print('[%d, %5d] loss: %.4f' % (epoch + 1, i + 1, loss.item()))

This is the error I am getting:

BrokenPipeError Traceback (most recent call last)
in
1 for epoch in range(5):
----> 2 for i, data in enumerate(train_loader):
3 inputs, labels = data
4 inputs = Variable(inputs).cuda()
5 labels = Variable(labels).cuda()

c:\users\pramod\miniconda3\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
817
818 def __iter__(self):
--> 819 return _DataLoaderIter(self)
820
821 def __len__(self):

c:\users\pramod\miniconda3\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
558 # before it starts, and __del__ tries to join but will get:
559 # AssertionError: can only join a started process.
--> 560 w.start()
561 self.index_queues.append(index_queue)
562 self.workers.append(w)

c:\users\pramod\miniconda3\lib\multiprocessing\process.py in start(self)
110 'daemonic processes are not allowed to have children'
111 _cleanup()
--> 112 self._popen = self._Popen(self)
113 self._sentinel = self._popen.sentinel
114 # Avoid a refcycle if the target function holds an indirect

c:\users\pramod\miniconda3\lib\multiprocessing\context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):

c:\users\pramod\miniconda3\lib\multiprocessing\context.py in _Popen(process_obj)
320 def _Popen(process_obj):
321 from .popen_spawn_win32 import Popen
--> 322 return Popen(process_obj)
323
324 class SpawnContext(BaseContext):

c:\users\pramod\miniconda3\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
87 try:
88 reduction.dump(prep_data, to_child)
---> 89 reduction.dump(process_obj, to_child)
90 finally:
91 set_spawning_popen(None)

c:\users\pramod\miniconda3\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #

BrokenPipeError: [Errno 32] Broken pipe

Can anyone help?

Could you set num_workers=0 and try to run the code again?
There might be some issues in e.g. the data loading as described here.
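For reference, here is a minimal sketch of what that suggestion looks like in code. The dataset is a hypothetical stand-in (random tensors shaped like 28x28 digit images), and `make_loader` and `count_batches` are illustrative names, not part of the original script. With `num_workers=0` the batches are loaded in the main process, so no worker processes have to be spawned. On Windows, worker processes are started with the spawn method, which is why scripts that do use `num_workers > 0` usually need their entry point wrapped in an `if __name__ == "__main__":` guard.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(num_workers=0):
    # Hypothetical stand-in data: 64 random 1x28x28 "images" with labels 0..9.
    images = torch.randn(64, 1, 28, 28)
    labels = torch.randint(0, 10, (64,))
    dataset = TensorDataset(images, labels)
    # num_workers=0 loads every batch in the main process (no subprocesses).
    return DataLoader(dataset, batch_size=16, shuffle=True, num_workers=num_workers)


def count_batches(loader):
    # Iterate once over the loader, as the training loop would.
    n = 0
    for inputs, labels in loader:
        n += 1
    return n


if __name__ == "__main__":
    # Keeping the entry point guarded matters once num_workers > 0 on Windows,
    # because each spawned worker re-imports this module.
    print(count_batches(make_loader(num_workers=0)))
```

If this runs cleanly with `num_workers=0`, the problem is likely in how the workers are started or in something the dataset holds that cannot be pickled, rather than in the model or optimizer.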

Yes, I guess it worked, but then it ran into an AssertionError, something to do with the versions of CUDA and PyTorch. I am trying to find a solution for it. Thank you for the help.

Could you post the stack trace and run your code again with

CUDA_LAUNCH_BLOCKING=1 python script.py args

Also, is your code running fine on the CPU? The errors might be easier to debug on the CPU.
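One way to make the CPU check easy is to write the training step so the device is a parameter. The following is only a sketch under assumptions: `pick_device` and `train_step` are hypothetical helpers, and the linear model is a stand-in for `model_pytorch`, which is not shown in this thread.

```python
import torch
import torch.nn as nn


def pick_device():
    # Fall back to the CPU when no usable CUDA device is reported.
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def train_step(model, optimizer, loss_function, inputs, labels, device):
    # Move the batch to the chosen device instead of calling .cuda() directly.
    inputs, labels = inputs.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = loss_function(model(inputs), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    device = pick_device()
    model = nn.Linear(784, 10).to(device)  # stand-in model, not the CNN from the post
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss = train_step(model, optimizer, nn.CrossEntropyLoss(),
                      torch.randn(8, 784), torch.randint(0, 10, (8,)), device)
    print(device, loss)
```

Passing `torch.device("cpu")` explicitly runs the same step on the CPU, which helps separate data or model bugs from GPU setup problems.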

It was saying “Found too old version of NVIDIA”. I tried everything I could think of, including changing my CUDA version and PyTorch version, but nothing helped. In the end I got rid of the cuda() calls. That certainly made my code slower, but it works fine now.

Did the error point to the drivers or your card?
Could you check your driver using nvidia-smi and also post which GPU you are using?
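Alongside the `nvidia-smi` output, it can help to post what PyTorch itself reports. A small sketch, assuming a reasonably recent PyTorch install; `cuda_report` is a hypothetical helper name:

```python
import torch


def cuda_report():
    # Collect the version details that are usually needed to debug
    # driver / CUDA / PyTorch mismatches.
    report = {
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["gpu_name"] = torch.cuda.get_device_name(0)
        report["compute_capability"] = torch.cuda.get_device_capability(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(key, "=", value)
```

The GPU name and compute capability are useful because an older card can be unsupported by a given PyTorch binary even when the driver is up to date.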