Loading Semantic Segmentation Training Datasets for FCN

Hi guys,
I want to train an FCN for semantic segmentation. My training data (CamVid) consists of photos (.png) and semantic labels (.png), which are located in two different folders (train and train_labels). The code below expects the training dataset to be listed in a csv file, which is not my case. I tried to change it to load my data, but the new pieces of code did not fit with the rest of the code and I got errors many times. Could you please help me find out how to load my dataset? I have been struggling with this a lot.

  
```python
class CamVid(Dataset):

  def __init__(self, csv_file, phase):
    self.phase = phase
    self.data = pd.read_csv(csv_file)

    if phase == 'train':
      self.input_shape = (480, 640)
    elif phase == 'valid':
      self.input_shape = (704, 960)

  def __getitem__(self, index):
    image, label = self.data.iloc[index, 0], self.data.iloc[index, 1]
    image = scipy.misc.imread("data/"+image, mode='RGB')
    label = np.load("data/"+label)

    if self.phase == "train":
      # RandomCrop
      h, w, _ = image.shape
      new_h, new_w = self.input_shape

      top = random.randint(0, h - new_h)
      left = random.randint(0, w - new_w)

      image = image[top:top + new_h, left:left + new_w]
      label = label[top:top + new_h, left:left + new_w]

      # RandomHorizontalFlip
      if random.random() < 0.5:
        image = np.fliplr(image)
        label = np.fliplr(label)

    if self.phase == "valid":
      # "Resize" (actually a crop of the top-left corner)
      new_h, new_w = self.input_shape

      image = image[0:new_h, 0:new_w]
      label = label[0:new_h, 0:new_w]

    # Normalization
    mean = [0.485, 0.456, 0.406]  # ImageNet channel means
    std = [0.229, 0.224, 0.225]

    image = np.transpose(image, (2, 0, 1)) / 255.
    image[0] = (image[0] - mean[0]) / std[0]
    image[1] = (image[1] - mean[1]) / std[1]
    image[2] = (image[2] - mean[2]) / std[2]

    # ToTensor
    image = torch.from_numpy(image.copy()).float()
    label = torch.from_numpy(label.copy()).long()

    # One-hot encoding
    h, w = label.size()
    target = torch.zeros(32, h, w)
    for c in range(32):
      target[c][label == c] = 1

    return {'x': image, 'y': target, 'l': label}

  def __len__(self):
    return len(self.data)

train_data = CamVid('./data/CamVid/train.csv', phase="train")
valid_data = CamVid('./data/CamVid/valid.csv', phase="valid")

trainloader = torch.utils.data.DataLoader(train_data, batch_size=5, shuffle=True)
validloader = torch.utils.data.DataLoader(valid_data, batch_size=1)
```

What kind of errors are you getting?

PS: you can post code snippets by wrapping them into three backticks ```, which would make debugging easier. :wink:

Hi,
Thanks for replying.
I am trying to change this code (which was written for a dataset listed in a csv file) so that it takes png inputs, since both my training and validation datasets (CamVid image frames + semantic labels) are in png format. I tried different approaches, such as PIL's Image.open() and cv2.imread(), to adapt it to my dataset, but none of them worked, even though I spent around 6 hours on it.
I would appreciate your help.
This is the link for the dataset:

```python
class CamVid(Dataset):

  def __init__(self, csv_file, phase):
    self.phase = phase
    self.data = pd.read_csv(csv_file)

    if phase == 'train':
      self.input_shape = (480, 640)
    elif phase == 'valid':
      self.input_shape = (704, 960)

  def __getitem__(self, index):
    image, label = self.data.iloc[index, 0], self.data.iloc[index, 1]
    image = scipy.misc.imread("data/"+image, mode='RGB')
    label = np.load("data/"+label)

    if self.phase == "train":
      # RandomCrop
      h, w, _ = image.shape
      new_h, new_w = self.input_shape

      top = random.randint(0, h - new_h)
      left = random.randint(0, w - new_w)

      image = image[top:top + new_h, left:left + new_w]
      label = label[top:top + new_h, left:left + new_w]

      # RandomHorizontalFlip
      if random.random() < 0.5:
        image = np.fliplr(image)
        label = np.fliplr(label)

    if self.phase == "valid":
      # "Resize" (actually a crop of the top-left corner)
      new_h, new_w = self.input_shape

      image = image[0:new_h, 0:new_w]
      label = label[0:new_h, 0:new_w]

    # Normalization
    mean = [0.485, 0.456, 0.406]  # ImageNet channel means
    std = [0.229, 0.224, 0.225]

    image = np.transpose(image, (2, 0, 1)) / 255.
    image[0] = (image[0] - mean[0]) / std[0]
    image[1] = (image[1] - mean[1]) / std[1]
    image[2] = (image[2] - mean[2]) / std[2]

    # ToTensor
    image = torch.from_numpy(image.copy()).float()
    label = torch.from_numpy(label.copy()).long()

    # One-hot encoding
    h, w = label.size()
    target = torch.zeros(32, h, w)
    for c in range(32):
      target[c][label == c] = 1

    return {'x': image, 'y': target, 'l': label}

  def __len__(self):
    return len(self.data)

train_data = CamVid("D:\\Data Science\\Python Assignment\\Pytorch\\FCN-master\\CamVid\\train", phase="train")
valid_data = CamVid("D:\\Data Science\\Python Assignment\\Pytorch\\FCN-master\\CamVid\\val", phase="valid")

trainloader = torch.utils.data.DataLoader(train_data, batch_size=5, shuffle=True)
validloader = torch.utils.data.DataLoader(valid_data, batch_size=1)
```

You still haven’t posted the error message. :stuck_out_tongue:

However, based on the structure of the dataset, it seems that the input images and target masks are images stored in the train and train_labels folders, respectively.
In your current code snippet you are trying to pass a csv file and read the image and label paths from columns 0 and 1.
Could you explain the code a bit and where this csv can be found?

There is a class_dict.csv, which seems to contain a color mapping to map the target mask colors to class indices.
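
In case it helps, here is a minimal sketch of how such a color mapping could be applied to turn an RGB mask into a map of class indices. The column names (name, r, g, b), the three sample rows, and the helper names load_palette and rgb_mask_to_class_indices are assumptions for illustration, so please check them against your actual class_dict.csv.

```python
import csv
import io

import numpy as np

# Assumed excerpt of class_dict.csv; the real file has one row per class.
SAMPLE_CLASS_DICT = """name,r,g,b
Animal,64,128,64
Archway,192,0,128
Bicyclist,0,128,192
"""


def load_palette(csv_text):
    """Return a list of (r, g, b) tuples; the list position is the class index."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(int(row["r"]), int(row["g"]), int(row["b"])) for row in reader]


def rgb_mask_to_class_indices(mask, palette, ignore_index=255):
    """Convert an HxWx3 uint8 color mask into an HxW int64 class-index map.

    Pixels whose color is not in the palette are set to ignore_index.
    """
    indices = np.full(mask.shape[:2], ignore_index, dtype=np.int64)
    for class_id, color in enumerate(palette):
        # True where all three channels match this class color.
        matches = np.all(mask == np.array(color, dtype=mask.dtype), axis=-1)
        indices[matches] = class_id
    return indices


palette = load_palette(SAMPLE_CLASS_DICT)
```

The resulting index map plays the same role as the label array that the posted code loads with np.load, so it could be passed on to the cropping and one-hot steps.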

I want to load the CamVid dataset (which I shared above) in a Jupyter notebook, create a dataloader, and train FCN-8s, but the code is not compatible with this dataset, so I want to change this part of the code (which I shared above). The CamVid dataset consists of training image frames (.png) plus training semantic labels (.png), validation image frames (.png) plus validation semantic labels (.png), test image frames (.png) plus test semantic labels (.png), and class_dict.csv (the only csv file). Please show me how to create dataloaders for the CamVid dataset to train FCN-8s. When I use the CamVid dataset, I get this error:

PermissionError                           Traceback (most recent call last)
<ipython-input-2-29608691d1e6> in <module>
     62     return len(self.data)
     63 
---> 64 train_data = CamVid("D:\\Data Science\\Python Assignment\\Pytorch\\FCN-master\\CamVid\\train", phase="train")
     65 valid_data = CamVid("D:\\Data Science\\Python Assignment\\Pytorch\\FCN-master\\CamVid\\val", phase="valid")
     66 

<ipython-input-2-29608691d1e6> in __init__(self, csv_file, phase)
      3   def __init__(self, csv_file, phase):
      4     self.phase = phase
----> 5     self.data = pd.read_csv(csv_file)
      6 
      7     if phase == 'train':

c:\users\sepehr\appdata\local\programs\python\python38\lib\site-packages\pandas\io\parsers.py in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
    684     )
    685 
--> 686     return _read(filepath_or_buffer, kwds)
    687 
    688 

c:\users\sepehr\appdata\local\programs\python\python38\lib\site-packages\pandas\io\parsers.py in _read(filepath_or_buffer, kwds)
    450 
    451     # Create the parser.
--> 452     parser = TextFileReader(fp_or_buf, **kwds)
    453 
    454     if chunksize or iterator:

c:\users\sepehr\appdata\local\programs\python\python38\lib\site-packages\pandas\io\parsers.py in __init__(self, f, engine, **kwds)
    934             self.options["has_index_names"] = kwds["has_index_names"]
    935 
--> 936         self._make_engine(self.engine)
    937 
    938     def close(self):

c:\users\sepehr\appdata\local\programs\python\python38\lib\site-packages\pandas\io\parsers.py in _make_engine(self, engine)
   1166     def _make_engine(self, engine="c"):
   1167         if engine == "c":
-> 1168             self._engine = CParserWrapper(self.f, **self.options)
   1169         else:
   1170             if engine == "python":

c:\users\sepehr\appdata\local\programs\python\python38\lib\site-packages\pandas\io\parsers.py in __init__(self, src, **kwds)
   1996         kwds["usecols"] = self.usecols
   1997 
-> 1998         self._reader = parsers.TextReader(src, **kwds)
   1999         self.unnamed_cols = self._reader.unnamed_cols
   2000 

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

PermissionError: [Errno 13] Permission denied: 'D:\\Data Science\\Python Assignment\\Pytorch\\FCN-master\\CamVid\\train'

The PermissionError might be raised if you are trying to open a folder as a file, which seems to be the case in your code snippet: pd.read_csv expects a csv file, but you are passing the folder path instead.
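
As a rough sketch, a small check like the following could make this kind of mistake easier to spot before the path reaches pd.read_csv. The helper names describe_path and require_file are made up for this example and are not part of pandas or PyTorch.

```python
import os


def describe_path(path):
    """Return 'directory', 'file', or 'missing' for the given path."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    return "missing"


def require_file(path):
    """Return path unchanged if it is a regular file, else raise ValueError."""
    kind = describe_path(path)
    if kind != "file":
        raise ValueError(f"expected a file, but {path!r} is a {kind} path")
    return path
```

Calling require_file on the CamVid\\train folder would then fail with a message that names the folder as a directory, instead of the less obvious PermissionError.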

To load the images, you could get all image paths in the CamVid.__init__ method e.g. via glob.glob or using os.walk.
Once you have all image paths, you could load each sample in CamVid.__getitem__ using the stored paths.
This tutorial gives you a good overview on how to create a custom Dataset and pass it to a DataLoader.
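
A minimal sketch of the path-collection part could look like the code below. It only pairs the file paths and uses the standard library, so the class name PairedPngPaths and the "_L" label suffix are assumptions for illustration; in a real setup the class would subclass torch.utils.data.Dataset and load and transform both PNG files in __getitem__.

```python
import glob
import os


class PairedPngPaths:
    """Map-style dataset skeleton pairing each image with its label mask."""

    def __init__(self, image_dir, label_dir, label_suffix="_L"):
        # label_suffix is an assumption about how the mask files are named,
        # e.g. frame.png in image_dir and frame_L.png in label_dir.
        self.pairs = []
        for image_path in sorted(glob.glob(os.path.join(image_dir, "*.png"))):
            stem = os.path.splitext(os.path.basename(image_path))[0]
            label_path = os.path.join(label_dir, stem + label_suffix + ".png")
            # Keep only images that actually have a matching mask.
            if os.path.isfile(label_path):
                self.pairs.append((image_path, label_path))

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, index):
        # Returns the two paths; loading the PNGs is left to the real Dataset.
        return self.pairs[index]
```

With the folder layout described above, this might be created as PairedPngPaths("CamVid/train", "CamVid/train_labels"), and the cropping, normalization, and one-hot steps from the posted code could then follow the image loading in __getitem__.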

Thank you. I will have a look at the tutorial.