My model is predicting everything as background

Hi! I am currently working on a model that takes an RGB image containing garbage and outputs a segmented image (I am hoping to segment it according to the 60 categories I defined). I used Adam as the optimizer and cross-entropy as the loss function. During training (I tried 10 epochs at most), the lowest loss value I got was 0.14. This is my training script:

BATCH_SIZE = 4
EPOCHS = 10

def train(model):
  model.train()
  for epoch in range(EPOCHS):
      for i in tqdm(range(0, len(img_train), BATCH_SIZE)): 
          batch_img_train = img_train[i:i+BATCH_SIZE].view(-1, 3, 448, 448)
          batch_mask_train = mask_train[i:i+BATCH_SIZE].view(-1, 1, 448, 448)
        
          model.zero_grad()

          outputs = model(batch_img_train)
        
          loss = loss_function(outputs, batch_mask_train.squeeze(1).long())
          loss.backward()
          optimizer.step()    # Does the update

      print(f"Epoch: {epoch}. Loss: {loss}")
  
  return batch_img_train, batch_mask_train, outputs

train(model)

while this is my validation script (I also tried to plot predictions here):

def test(model):
  model.eval()
  correct = 0
  total = 0

  with torch.no_grad():
      for i in tqdm(range(len(img_test))):

          
          real_class = mask_test[i]
          net_out = model(img_test[i].view(-1, 3, 224, 224))[0]
          predicted_class = torch.argmax(net_out, 0)
          
          prediction = predicted_class.eq(mask_test[i])

          
          plt.figure(figsize=(22,8))

          
          plt.subplot(1, 3, 1)
          plt.imshow(img_test[i].view(-1, 224, 224, 3).detach().cpu().squeeze())
          plt.title('original image')

          plt.subplot(1, 3, 2)
          plt.imshow(real_class.cpu().squeeze())
          plt.title('ground truth')

          plt.subplot(1, 3, 3)
          plt.imshow(predicted_class.cpu().squeeze())
          plt.title('predicted mask')
          

      
test(model)

but my predictions are far off: my model tends to classify every pixel as background despite the small loss value. I read about class imbalance and even tried implementing IoU in training, but I couldn't get it right. I am very confused right now, can someone please help me?

Hi Cassie!

Class imbalance is likely an issue you will need to deal with. Have
you tried the weight argument passed to the CrossEntropyLoss
constructor? One typically weights each class in inverse proportion
to its frequency of appearance. (If you are using the functional version
of cross entropy, torch.nn.functional.cross_entropy(), it also
has such a weight argument.)
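
As a rough sketch (assuming, as in your post, a mask_train tensor
of class indices and 61 classes, i.e., your 60 categories plus
background), the weights could be computed like this:

import torch
import torch.nn as nn

NUM_CLASSES = 61   # assumption: 60 garbage categories + background

# Count how many pixels of each class appear in the training masks.
counts = torch.bincount(mask_train.long().flatten(), minlength=NUM_CLASSES).float()
freqs = counts / counts.sum()                        # per-class pixel frequency
class_weights = 1.0 / (freqs + 1e-6)                 # inverse frequency (epsilon avoids division by zero)
class_weights = class_weights / class_weights.sum()  # optional normalization

loss_function = nn.CrossEntropyLoss(weight=class_weights)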

One experiment you could try is to combine your 60 different categories
into one “garbage” class, and train a two-class problem: background
(i.e., not garbage) vs. garbage. This will reduce the class imbalance,
because your garbage class will occur much more frequently than
any individual one of your 60 classes. Of course, you may still have a
background / garbage imbalance, and may still want to use the weight
argument.

If you try this experiment, I would advise that you still treat this as a
multi-class problem (that happens to have only two classes), and
still use CrossEntropyLoss, rather than reformulating it as a binary
classification problem (that uses something like BCEWithLogitsLoss).
The idea is to get your current code working on a problem without as
much class imbalance, only changing your code as much as necessary.
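
Concretely, the label collapse might look like this (again just a
sketch, assuming mask_train holds class indices with 0 reserved
for background):

# Collapse all 60 garbage categories into a single foreground class:
# 0 = background, 1 = garbage of any category.
binary_mask_train = (mask_train > 0).long()

# The model's final layer would then output 2 channels instead of 61,
# and training would keep using CrossEntropyLoss exactly as before.
loss_function = nn.CrossEntropyLoss()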

Good luck.

K. Frank

Thank you so much for your response! I will read about your suggestion of using the weight argument in my loss function, since that idea is very new to me. I'll get back here after I try this.

Hello!

I have tried using weighted cross entropy and my training script looks like this:

weights = [0.2939, 0.0105, 0.0005, 0.0015, 0.0002, 0.0015, 0.0551, 0.0181, 0.0453, 0.0135, 0.0037, 0.0049, 0.0024, 0.0370, 0.0007, 0.0193, 0.0022, 0.0100, 0.0100, 0.0066, 0.0007, 0.0152, 0.0223, 0.0027, 0.0010, 0.0000, 0.0017, 0.0010, 0.0169, 0.0015, 0.0419, 0.0012, 0.0007, 0.0027, 0.0167, 0.0056, 0.0000, 0.0759, 0.0010, 0.0049, 0.0054, 0.0120, 0.0007, 0.0086, 0.0022, 0.0010, 0.0091, 0.0032, 0.0005, 0.0010, 0.0071, 0.0174, 0.0069, 0.0029, 0.0015, 0.0015, 0.0269, 0.0010, 0.0191, 0.0659, 0.0556]
class_weights = torch.FloatTensor(weights)

import torch.optim as optim
optimizer = optim.Adam(model.parameters(), lr=3e-5)
loss_CE = nn.CrossEntropyLoss(weight=class_weights)

BATCH_SIZE = 4
EPOCHS = 2


def train(model):
  model.train()
  for epoch in range(EPOCHS):
      for i in tqdm(range(0, len(img_train), BATCH_SIZE)): 
          batch_img_train = img_train[i:i+BATCH_SIZE].view(-1, 3, 112, 112)
          batch_mask_train = mask_train[i:i+BATCH_SIZE].view(-1, 1, 112, 112)
        
          model.zero_grad()

          outputs = model(batch_img_train)
          loss_ce = loss_CE(outputs, batch_mask_train.squeeze(1).long())
          loss = loss_ce
        
          loss.backward()
          optimizer.step()    # Does the update

      print(f"Epoch: {epoch}. Loss: {loss}")

train(model)

Am I doing it right?

Hello,

What does your prediction output look like? Do you only get the background class as the result of the prediction? If your answer is no, applying a color mapping to your classes will solve your problem.

Hi! This is my code for showing a sample prediction:

def test(model):
  model.eval()
  correct = 0
  total = 0

  with torch.no_grad():
      for i in tqdm(range(len(img_test))):

          
          real_class = mask_test[i]
          net_out = model(img_test[i].view(-1, 3, 224, 224))[0]
          _, predicted_class = torch.max(net_out, 0)
          prediction = predicted_class.eq(mask_test[i])
        

          
          plt.figure(figsize=(22,8))

          
          plt.subplot(1, 3, 1)
          plt.imshow(img_test[i].view(-1, 224, 224, 3).detach().cpu().squeeze())
          plt.title('original image')

          plt.subplot(1, 3, 2)
          plt.imshow(real_class.cpu().squeeze())
          plt.title('ground truth')

          plt.subplot(1, 3, 3)
          plt.imshow(predicted_class.cpu().squeeze())
          plt.title('predicted mask')
          
          break

      
test(model)

I have already tried using weighted cross entropy, but the result was the same: my model predicts everything as background.

How do I do color mapping?

I assume that your prediction consists of more than just the background class. Here is a snippet to colorize your output.

import numpy as np

def decode_segmap(image, nc=21):
  # Map each class index to an RGB color (VOC-style 21-class palette;
  # extend this table if you have more classes).
  label_colors = np.array([(0, 0, 0),  # 0=background
               # 1=aeroplane, 2=bicycle, 3=bird, 4=boat, 5=bottle
               (128, 0, 0), (0, 128, 0), (128, 128, 0), (0, 0, 128), (128, 0, 128),
               # 6=bus, 7=car, 8=cat, 9=chair, 10=cow
               (0, 128, 128), (128, 128, 128), (64, 0, 0), (192, 0, 0), (64, 128, 0),
               # 11=dining table, 12=dog, 13=horse, 14=motorbike, 15=person
               (192, 128, 0), (64, 0, 128), (192, 0, 128), (64, 128, 128), (192, 128, 128),
               # 16=potted plant, 17=sheep, 18=sofa, 19=train, 20=tv/monitor
               (0, 64, 0), (128, 64, 0), (0, 192, 0), (128, 192, 0), (0, 64, 128)])

  # Build one color channel at a time: wherever the prediction equals
  # class l, write that class's color into the channel.
  r = np.zeros_like(image).astype(np.uint8)
  g = np.zeros_like(image).astype(np.uint8)
  b = np.zeros_like(image).astype(np.uint8)

  for l in range(0, nc):
    idx = image == l
    r[idx] = label_colors[l, 0]
    g[idx] = label_colors[l, 1]
    b[idx] = label_colors[l, 2]

  # Stack the channels into an H x W x 3 RGB image.
  rgb = np.stack([r, g, b], axis=2)
  return rgb

This is a function to decode your prediction into an RGB image you can plot.
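
For example (hypothetical usage, assuming predicted_class is a 2-D numpy array of class indices, as produced by argmax over the channel dimension):

rgb = decode_segmap(predicted_class)   # default nc=21; pass nc=61 once label_colors has 61 entries
plt.imshow(rgb)
plt.title('predicted mask')
plt.show()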

For more information about that, you can check this link out.

Thank you so much for your help. I will try this out!

Waiting for your feedback, have a good day.

Hi! I tried the code from the link you gave. I was stuck at these lines:

real_class = mask_test[20]
net_out = model(img_test[20].view(-1, 3, 224, 224))
print(net_out.shape)
predicted_class = torch.argmax(net_out.squeeze(), dim=0).detach().cpu().numpy()
print(predicted_class.shape)
print(np.unique(predicted_class))

In the line where I print np.unique, I should already be getting the prediction. In the link, when they printed this, the result was [0 3], and when I checked their color map that corresponds to a bird. But mine always prints [0], which pushes me to suspect that my network is indeed just predicting everything as background. What should I do?

These are the results of the print statements:

 0%|          | 0/150 [00:00<?, ?it/s]torch.Size([1, 61, 224, 224])
(224, 224)
[0]

Hello :wave: For binary segmentation with high class imbalance, I believe the Dice loss or the Intersection over Union (IoU) loss works better than cross entropy.

Here is an implementation of both

Note that the mask should be 0 or 1 and the prediction a single number between 0 and 1 (not two like cross entropy).
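
For reference, a minimal sketch of what such losses can look like (this is an assumption, not a copy of the linked implementation; pred holds per-pixel probabilities in [0, 1], e.g. after a sigmoid, and target is a 0/1 mask of the same shape):

import torch

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice: 1 - 2 * intersection / (sum of the two areas).
    intersection = (pred * target).sum()
    total = pred.sum() + target.sum()
    return 1.0 - (2.0 * intersection + eps) / (total + eps)

def iou_loss(pred, target, eps=1e-6):
    # Soft IoU (Jaccard): 1 - intersection / union.
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return 1.0 - (intersection + eps) / (union + eps)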

Hi, thank you for taking the time to reply. Can I still use Dice loss if I have more than two classes? I have 60 classes plus background.

It is possible to compute IOU and Dice loss with more than two classes, but not with this implementation. Basically, you would compute the ratios for each class (except the background) individually before averaging them (weighted or not).
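
For example, a rough sketch along those lines (the function name and the background-at-index-0 convention are assumptions):

import torch
import torch.nn.functional as F

def multiclass_dice_loss(logits, target, eps=1e-6):
    # logits: (N, C, H, W) raw scores; target: (N, H, W) class indices.
    num_classes = logits.shape[1]
    probs = F.softmax(logits, dim=1)
    # One-hot encode the target to (N, C, H, W) so it lines up with probs.
    target_onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)  # sum over batch and spatial dimensions
    intersection = (probs * target_onehot).sum(dims)
    cardinality = probs.sum(dims) + target_onehot.sum(dims)
    dice_per_class = (2.0 * intersection + eps) / (cardinality + eps)
    # Average over classes, skipping the background (class 0).
    return 1.0 - dice_per_class[1:].mean()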