OSError on Colab

(Ben Ogie) #1

I keep getting this error on Google Colab after training for a few epochs with PyTorch 0.4.1.

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-5-c9e4acd77c1d> in <module>()
     14   # Train
     15 
---> 16   for data, label in train_dataloader:
     17 
     18     if torch.cuda.is_available():

/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in __next__(self)
    334                 self.reorder_dict[idx] = batch
    335                 continue
--> 336             return self._process_next_batch(batch)
    337 
    338     next = __next__  # Python 2 compatibility

/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
    355         self._put_indices()
    356         if isinstance(batch, ExceptionWrapper):
--> 357             raise batch.exc_type(batch.exc_msg)
    358         return batch
    359 

OSError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.6/dist-packages/torchvision/datasets/folder.py", line 101, in __getitem__
    sample = self.loader(path)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/datasets/folder.py", line 147, in default_loader
    return pil_loader(path)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/datasets/folder.py", line 129, in pil_loader
    img = Image.open(f)
  File "/usr/local/lib/python3.6/dist-packages/PIL/Image.py", line 2618, in open
    prefix = fp.read(16)
OSError: [Errno 5] Input/output error

I have tried restarting my session, but the error just won't go away.

#2

I don’t know how well the DataLoader handles I/O errors like that. Does upgrading to PyTorch 1.0 help at all?
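
One way to make the loading step more tolerant of a transient `[Errno 5]` is to retry the file read a few times before giving up. The following is only a minimal sketch: `read_with_retry`, `attempts`, and `delay` are hypothetical names that appear nowhere in this thread, and it assumes the error is intermittent rather than a permanently unreadable file.

```python
import time


def read_with_retry(path, read_fn, attempts=3, delay=1.0):
    """Call read_fn(path), retrying when it raises OSError.

    Hypothetical helper, not part of PyTorch or torchvision.
    The wait grows linearly (delay, 2 * delay, ...) between attempts.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return read_fn(path)
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            time.sleep(delay * attempt)


# Possible use with torchvision (assumption: the dataset is an ImageFolder,
# as the traceback suggests; "data/train" is a placeholder path):
#
#   from torchvision.datasets import ImageFolder
#   from torchvision.datasets.folder import default_loader
#
#   dataset = ImageFolder(
#       "data/train",
#       loader=lambda p: read_with_retry(p, default_loader),
#   )
```

If the read still fails after the last attempt, the original `OSError` is re-raised, so a genuinely broken file is not silently skipped.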

(Ben Ogie) #3

I haven’t tried upgrading yet, but I wish I knew what caused the error.

(Salman Nauman) #4

Hi @Ben_Ogie, were you able to fix this error? It seems pretty random: the same code worked fine on Google Colab earlier, and now it no longer works. Any idea why this is happening?

(Chingyu) #5

I sometimes get the same issue. I think it is caused by having too many files in one folder, though sometimes it works without any changes after a few tries. Maybe you can check this reference:
https://research.google.com/colaboratory/faq.html#drive-timeout
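
To check whether a single folder really holds an unusually large number of files, something like the sketch below could be run on the dataset root. The helper names `count_files_per_dir` and `largest_dirs` are made up for illustration, and what counts as "too many" is not stated in this thread, so the numbers are only a starting point for comparison.

```python
import os


def count_files_per_dir(root):
    """Map each directory under root (including root) to its direct file count."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        counts[dirpath] = len(filenames)
    return counts


def largest_dirs(root, top=5):
    """Return the `top` directories with the most files as (path, count) pairs."""
    counts = count_files_per_dir(root)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:top]


# Example call (the path is a placeholder, not taken from the thread):
#
#   for path, n in largest_dirs("/content/dataset"):
#       print(n, path)
```

If one directory stands out, splitting its files across several subfolders is one thing to try.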
