DataLoader error that I can't understand or solve

Hi all,

I get an error when I run the following:

 batch_size = 3
 train_loader = DataLoader(dataset     = train_dataset
                          ,batch_size  = batch_size
                          ,shuffle     = False
                          ,num_workers = 2)
 for original_img, noisy_img in train_loader:
     originalImage = original_img

Can someone please explain to me what this error means?
Here is the error:


TypeError Traceback (most recent call last)
in <module>
----> 1 for original_img, noisy_img in train_loader:
2 originalImage = original_img

RuntimeError: DataLoader worker (pid(s) 24484, 1632) exited unexpectedly

Try debugging with num_workers=0. The message just means that an exception was raised in one of the DataLoader worker processes, but it doesn’t tell you which exception it was.
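
A minimal sketch of that advice, assuming a deliberately failing dataset (`BrokenDataset` and `first_error` are hypothetical names made up for this illustration). With `num_workers=0` the samples are loaded in the main process, so the original exception is raised directly in your loop:

```python
from torch.utils.data import DataLoader, Dataset


class BrokenDataset(Dataset):
    """Hypothetical dataset whose __getitem__ always fails."""

    def __len__(self):
        return 4

    def __getitem__(self, index):
        raise ValueError(f"could not load sample {index}")


def first_error(loader):
    """Iterate over the loader and return the first exception, or None."""
    try:
        for _ in loader:
            pass
    except Exception as exc:
        return exc
    return None


# num_workers=0: no worker processes, the dataset code runs in this process.
error = first_error(DataLoader(BrokenDataset(), batch_size=2, num_workers=0))
print(type(error).__name__, error)  # ValueError could not load sample 0
```

Once the loop runs cleanly with `num_workers=0`, the worker count can be raised again.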

Hi @JuanFMontesinos

First, thanks a lot for your answer.

I have tried num_workers=0, but it gave me another error. I think I have an error in my code, but I don’t know where.

Here is the complete code:

 import os
 import os.path as osp
 import glob

 import torch
 import torch.cuda as cuda
 import torch.nn as nn
 from torch.nn import functional as F
 from torch.utils import data
 from torch.utils.data import DataLoader

 import matplotlib.pyplot as plt
 import matplotlib.image as matpltimg

 import torchvision
 from torchvision import datasets
 from torchvision import transforms

 import numpy as np
 from PIL import Image

 from torch.cuda import amp

 datadir  = '/DataSets/SimplifyData'

 traindir_OI = datadir + '/imagesForTrain/OriginalImages/'
 traindir_NI = datadir + '/imagesForTrain/NoiseImages/'

 list1 = os.listdir(traindir_OI)
 number_files1 = len(list1)
 print(number_files1)

 list2 = os.listdir(traindir_NI)
 number_files2 = len(list2)
 print(number_files2)

 class Dataset(data.Dataset):
     def __init__(self, path_input_1, path_input_2):
         self.filenames_OI = []
         self.filenames_NI = []
         self.path_input_1 = path_input_1
         self.path_input_2 = path_input_2

         filenames_OI = glob.glob(osp.join(path_input_1, '*.jpg'))
         filenames_NI = glob.glob(osp.join(path_input_2, '*.jpg'))
         print(len(filenames_OI))
         print(len(filenames_NI))

         for fn_OI, fn_NI in zip(filenames_OI, filenames_NI):
             self.filenames_OI.append(fn_OI)
             self.filenames_NI.append(fn_NI)

     def __len__(self):
         return len(self.path_input_1)

     def __getitem__(self, index):
         x = Image.open(self.filenames_OI[index])
         y = Image.open(self.filenames_NI[index])
         return x, y

 train_dataset = Dataset(path_input_1 = traindir_OI
                        ,path_input_2 = traindir_NI)
 batch_size = 3

 train_loader = DataLoader(dataset     = train_dataset
                          ,batch_size  = batch_size
                          ,shuffle     = False
                          ,num_workers = 0)

 for original_img, noisy_img in train_loader:
     originalImage = original_img
     noisyImage = noisy_img

When I set num_workers = 1, I get this error:


Empty Traceback (most recent call last)
~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in _try_get_data(self, timeout)
769 try:
--> 770 data = self._data_queue.get(timeout=timeout)
771 return (True, data)

~\.conda\envs\pytorch2\lib\multiprocessing\queues.py in get(self, block, timeout)
104 if not self._poll(timeout):
--> 105 raise Empty
106 elif not self._poll():

Empty:

During handling of the above exception, another exception occurred:

RuntimeError Traceback (most recent call last)
in <module>
----> 1 for original_img, noisy_img in train_loader:
2 originalImage = original_img

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
352
353 def __next__(self):
--> 354 data = self._next_data()
355 self._num_yielded += 1
356 if self._dataset_kind == _DatasetKind.Iterable and \

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
963
964 assert not self._shutdown and self._tasks_outstanding > 0
--> 965 idx, data = self._get_data()
966 self._tasks_outstanding -= 1
967

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in _get_data(self)
930 else:
931 while True:
--> 932 success, data = self._try_get_data()
933 if success:
934 return data

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in _try_get_data(self, timeout)
781 if len(failed_workers) > 0:
782 pids_str = ', '.join(str(w.pid) for w in failed_workers)
--> 783 raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
784 if isinstance(e, queue.Empty):
785 return (False, None)

RuntimeError: DataLoader worker (pid(s) 16460) exited unexpectedly

And when I set num_workers = 0, I get this error:


TypeError Traceback (most recent call last)
in <module>
----> 1 for original_img, noisy_img in train_loader:
2 originalImage = original_img

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
352
353 def __next__(self):
--> 354 data = self._next_data()
355 self._num_yielded += 1
356 if self._dataset_kind == _DatasetKind.Iterable and \

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
392 def _next_data(self):
393 index = self._next_index() # may raise StopIteration
--> 394 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
395 if self._pin_memory:
396 data = _utils.pin_memory.pin_memory(data)

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
45 else:
46 data = self.dataset[possibly_batched_index]
---> 47 return self.collate_fn(data)

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\collate.py in default_collate(batch)
82 raise RuntimeError('each element in list of batch should be of equal size')
83 transposed = zip(*batch)
---> 84 return [default_collate(samples) for samples in transposed]
85
86 raise TypeError(default_collate_err_msg_format.format(elem_type))

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\collate.py in <listcomp>(.0)
82 raise RuntimeError('each element in list of batch should be of equal size')
83 transposed = zip(*batch)
---> 84 return [default_collate(samples) for samples in transposed]
85
86 raise TypeError(default_collate_err_msg_format.format(elem_type))

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\collate.py in default_collate(batch)
84 return [default_collate(samples) for samples in transposed]
85
---> 86 raise TypeError(default_collate_err_msg_format.format(elem_type))

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'PIL.JpegImagePlugin.JpegImageFile'>

Do you have any idea how to solve this problem?

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'PIL.JpegImagePlugin.JpegImageFile'>

This roughly means that you are loading images but not converting them to PyTorch tensors.
I would suggest using the torchvision reader or imageio. It makes no sense to use PIL if you are not doing any preprocessing.


Hi @JuanFMontesinos

Thanks a lot for your answer.

Could you please show me an example, or make the change directly in the code? It would help me learn and improve my very modest level.

Just replace Image.open with any other reader, for example:

    def __getitem__(self, index):
        x = imageio.imread(self.filenames_OI[index])
        y = imageio.imread(self.filenames_NI[index])
        return x, y

There are many libraries to open images.


Hi @JuanFMontesinos

Thanks a lot for this example.

I have switched to imageio.imread(), but I get the same kind of error:


TypeError Traceback (most recent call last)
in <module>
----> 1 for original_img, noisy_img in train_loader:
2 originalImage = original_img

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
352
353 def __next__(self):
--> 354 data = self._next_data()
355 self._num_yielded += 1
356 if self._dataset_kind == _DatasetKind.Iterable and \

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
392 def _next_data(self):
393 index = self._next_index() # may raise StopIteration
--> 394 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
395 if self._pin_memory:
396 data = _utils.pin_memory.pin_memory(data)

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
45 else:
46 data = self.dataset[possibly_batched_index]
---> 47 return self.collate_fn(data)

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\collate.py in default_collate(batch)
82 raise RuntimeError('each element in list of batch should be of equal size')
83 transposed = zip(*batch)
---> 84 return [default_collate(samples) for samples in transposed]
85
86 raise TypeError(default_collate_err_msg_format.format(elem_type))

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\collate.py in <listcomp>(.0)
82 raise RuntimeError('each element in list of batch should be of equal size')
83 transposed = zip(*batch)
---> 84 return [default_collate(samples) for samples in transposed]
85
86 raise TypeError(default_collate_err_msg_format.format(elem_type))

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\collate.py in default_collate(batch)
84 return [default_collate(samples) for samples in transposed]
85
---> 86 raise TypeError(default_collate_err_msg_format.format(elem_type))

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'imageio.core.util.Array'>

Hmm, sorry, you still have to convert that into tensors:

    def __getitem__(self, index):
        x = imageio.imread(self.filenames_OI[index])
        y = imageio.imread(self.filenames_NI[index])
        return torch.from_numpy(x), torch.from_numpy(y)
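
Put into a complete loop, the conversion might look like the sketch below. It assumes all images share one size; random NumPy arrays stand in for what `imageio.imread` returns, and `ArrayPairs` is a hypothetical name. The `permute` call is optional: it moves the channel axis first, the layout most PyTorch image layers expect.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayPairs(Dataset):
    """Hypothetical dataset: random uint8 arrays replace decoded image files."""

    def __init__(self, n_pairs=6, height=8, width=8):
        rng = np.random.default_rng(0)
        shape = (n_pairs, height, width, 3)  # H x W x C per image
        self.originals = rng.integers(0, 256, size=shape, dtype=np.uint8)
        self.noisy = rng.integers(0, 256, size=shape, dtype=np.uint8)

    def __len__(self):
        # Number of samples, i.e. the number of image pairs.
        return len(self.originals)

    def __getitem__(self, index):
        x = torch.from_numpy(self.originals[index])
        y = torch.from_numpy(self.noisy[index])
        # H x W x C -> C x H x W
        return x.permute(2, 0, 1), y.permute(2, 0, 1)


loader = DataLoader(ArrayPairs(), batch_size=3, shuffle=False, num_workers=0)
shapes = []
for original_img, noisy_img in loader:
    shapes.append(tuple(original_img.shape))
print(shapes)  # [(3, 3, 8, 8), (3, 3, 8, 8)]
```

Note that `torch.from_numpy` keeps the `uint8` dtype; a cast such as `.float()` would be needed before feeding the batch to a network.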

Hi @JuanFMontesinos

Thanks for your help. I added this conversion; it removed that error, but now I get another one:


IndexError Traceback (most recent call last)
in <module>
----> 1 for original_img, noisy_img in train_loader:
2 originalImage = original_img
3

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in __next__(self)
352
353 def __next__(self):
--> 354 data = self._next_data()
355 self._num_yielded += 1
356 if self._dataset_kind == _DatasetKind.Iterable and \

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\dataloader.py in _next_data(self)
392 def _next_data(self):
393 index = self._next_index() # may raise StopIteration
--> 394 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
395 if self._pin_memory:
396 data = _utils.pin_memory.pin_memory(data)

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\fetch.py in fetch(self, possibly_batched_index)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]

~\.conda\envs\pytorch2\lib\site-packages\torch\utils\data\_utils\fetch.py in <listcomp>(.0)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]

in __getitem__(self, index)
24 #noisyIMG = Image.open(self.filenames_NI[index])
25
---> 26 originalIMG = imageio.imread(self.filenames_OI[index])
27 noisyIMG = imageio.imread(self.filenames_NI[index])
28

IndexError: list index out of range

That means the length your dataset reports is larger than the number of samples it really holds. Look at this line in your __len__:

    return len(self.path_input_1)

Hi @JuanFMontesinos

Thanks for your help. I added a print to check, as follows:

    class Dataset(data.Dataset):
        def __init__(self, path_input_1, path_input_2):

            self.filenames_OI = []
            self.filenames_NI = []

            self.path_input_1 = path_input_1
            self.path_input_2 = path_input_2
            print(len(self.path_input_1))

            filenames_OI = glob.glob(osp.join(self.path_input_1, '*.jpg'))
            filenames_NI = glob.glob(osp.join(self.path_input_2, '*.jpg'))
            print(len(filenames_OI))
            print(len(filenames_NI))

            for fn_OI, fn_NI in zip(filenames_OI, filenames_NI):
                self.filenames_OI.append(fn_OI)
                self.filenames_NI.append(fn_NI)

        def __len__(self):
            return len(self.path_input_1)

        def __getitem__(self, index):
            #originalIMG = Image.open(self.filenames_OI[index])
            #noisyIMG = Image.open(self.filenames_NI[index])

            originalIMG = imageio.imread(self.filenames_OI[index])
            noisyIMG = imageio.imread(self.filenames_NI[index])

            return torch.from_numpy(originalIMG), torch.from_numpy(noisyIMG)

The output is:
53
6
6

I've been trying to figure out where this 53 comes from, and I found that it is the length of the string '/DataSets/SimplifyData' + '/imagesForTrain/OriginalImages/' built by this code:

     datadir  = '/DataSets/SimplifyData'
     traindir_OI = datadir + '/imagesForTrain/OriginalImages/'

Do you know why, and how to solve it?

It's an input, so there is no way for me to know what you passed in there :sweat_smile:
If it's a string, that is probably the length of the path (the number of characters in the string), since the rest of the code is working.
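
The numbers can be checked without any PyTorch code. A small sketch, where the six file names are made up and `first_bad_index` is a hypothetical helper that mimics a loader asking for indices `0` up to `len(dataset) - 1`:

```python
datadir = '/DataSets/SimplifyData'
traindir_OI = datadir + '/imagesForTrain/OriginalImages/'

# Made-up names standing in for the six .jpg files that glob found.
filenames_OI = [traindir_OI + f'img_{i}.jpg' for i in range(6)]

print(len(traindir_OI))   # 53, the number of characters in the path
print(len(filenames_OI))  # 6, the number of files that can be indexed


def first_bad_index(reported_len, items):
    """Return the first index below reported_len that items cannot serve."""
    for index in range(reported_len):
        try:
            items[index]
        except IndexError:
            return index
    return None


# A dataset that claims 53 samples but holds 6 fails at index 6.
print(first_bad_index(len(traindir_OI), filenames_OI))  # 6
```

So `__len__` has to count samples, not characters of the folder path.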

Hi @JuanFMontesinos

This is the first time I've written a Dataset class with these methods:

    def __init__
    def __len__(self):
    def __getitem__(self, index):

I have solved this problem by writing

    def __len__(self):
        files = os.listdir(self.path_input_1)
        return len(files)

Now I get no error. Thanks a lot for your precious help. :slight_smile:
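
A possible refinement, offered as a sketch rather than a required change: `os.listdir` counts every entry in the folder, so a stray non-JPEG file would make `__len__` too large again. Returning the length of the list that `__getitem__` actually indexes avoids that. The class below (`PairedImagePaths`, a hypothetical path-only stand-in for the Dataset in this thread) is exercised on a temporary folder of empty placeholder files. It also sorts both lists so that pairs line up by name, which assumes matching file names in the two folders.

```python
import glob
import os
import os.path as osp
import tempfile


class PairedImagePaths:
    """Path-only stand-in for the thread's Dataset (no image decoding)."""

    def __init__(self, path_input_1, path_input_2):
        originals = sorted(glob.glob(osp.join(path_input_1, '*.jpg')))
        noisy = sorted(glob.glob(osp.join(path_input_2, '*.jpg')))
        # zip stops at the shorter list, so every index has a full pair.
        self.pairs = list(zip(originals, noisy))

    def __len__(self):
        # Length of the very list that __getitem__ indexes.
        return len(self.pairs)

    def __getitem__(self, index):
        return self.pairs[index]


with tempfile.TemporaryDirectory() as root:
    dir_oi = osp.join(root, 'OriginalImages')
    dir_ni = osp.join(root, 'NoiseImages')
    for folder in (dir_oi, dir_ni):
        os.makedirs(folder)
        for i in range(6):
            open(osp.join(folder, f'img_{i}.jpg'), 'w').close()
    # One stray file that is not an image.
    open(osp.join(dir_oi, 'Thumbs.db'), 'w').close()

    dataset = PairedImagePaths(dir_oi, dir_ni)
    listdir_count = len(os.listdir(dir_oi))
    dataset_len = len(dataset)

print(listdir_count, dataset_len)  # 7 6
```

With the `os.listdir` version, index 6 would be requested here and fail with the same `IndexError` as before.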

This helped me out of my “time-out trouble”. Thanks a lot!!!
I also used the PIL API to open the images and converted them to torch tensors, but the program still crashed with a time-out. After seeing your comment I tried imageio to read the images, and it worked!

Can you explain a little about what causes this bug? It seems to me that this is just a different way of opening the image and has nothing to do with the computation. Maybe it is some issue with the Pillow library?

Sorry this post is 2 years old, could you give me some context?

I got an error while loading the Wine dataset with DataLoader. The problem I was facing was the “Empty” exception listed above. When I removed the num_workers=2 parameter completely, it started working. Even with the value set to 1, it still threw the error.