Understanding the error I got while training my model for semantic segmentation

I am having difficulty understanding the error I got while training. The error goes like this:

 UserWarning: Using a target size (torch.Size([10, 1, 224, 224])) that is different to the input size (torch.Size([10, 60, 224, 224])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)

My target is the mask for every image and its shape is [batch size, channels, height, width]; my input is the output of my network and its shape is [batch size, number of classes, height, width]. Can anyone help me understand this?

The number of channels of the input and of the target is inconsistent. You should change the loss to CrossEntropyLoss.
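
For example, assuming the loss was previously created as nn.MSELoss() (which needs input and target of identical shape, hence the broadcasting warning), the change itself is just:

import torch.nn as nn

# loss_function = nn.MSELoss()          # needs input and target of the same shape
loss_function = nn.CrossEntropyLoss()   # takes class-index targets instead (shapes discussed below)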


Thank you so much for answering. I tried changing my loss function to cross entropy but I got an error:

RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4

What does this mean?

For CrossEntropyLoss, the target must have the shape NxHxW, where each element is an integer in the range [0, C-1], and the input must have the shape NxCxHxW.
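
A minimal sketch of those shapes, using the sizes from this thread (batch size 10, 60 classes, 224x224 images):

import torch
import torch.nn as nn

loss_function = nn.CrossEntropyLoss()

N, C, H, W = 10, 60, 224, 224              # batch size, number of classes, height, width
input = torch.randn(N, C, H, W)            # network output: float, one channel per class
target = torch.randint(0, C, (N, H, W))    # one integer class index per pixel, in [0, C-1]

loss = loss_function(input, target)        # works: input is NxCxHxW, target is NxHxW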

Can I clarify something? Does the N here represent the number of channels or the batch size?

It’s the batch size. In most cases in PyTorch, except for RNNs as far as I know, the first dimension is always the batch size.

You kept mentioning the target. Can I clarify that the target is NxHxW and the input is NxCxHxW?

Yes. I think you should read the documentation again to understand the function better.

Thank you! Can I ask what values I should expect from my output? My training script goes like this:

BATCH_SIZE = 10
EPOCHS = 1

def train(model):
    model.train()
    for epoch in range(EPOCHS):
        for i in tqdm(range(0, len(img_all), BATCH_SIZE)):
            batch_img_all = img_all[i:i+BATCH_SIZE].view(-1, 3, 224, 224)
            batch_mask_all = mask_all[i:i+BATCH_SIZE].view(-1, 1, 224, 224)

            model.zero_grad()
            outputs = model(batch_img_all)
            batch_mask_all = torch.argmax(batch_mask_all, dim=1)
            loss = loss_function(outputs, batch_mask_all)
            loss.backward()
            optimizer.step()    # Does the update
        print(f"Epoch: {epoch}. Loss: {loss}")

    return batch_img_all, batch_mask_all, outputs

IMG, MSK, OUTPUT = train(model)

The outputs depend on the activation function of the output layer in your model. You must be careful when placing this activation function, because each type of loss function expects its input values in a particular range.

In your code snippet, if you use CrossEntropyLoss, you should not use any activation function in the output layer (it expects raw logits). However, you took the argmax over the channel dimension of your ground truth, which has only 1 channel, so this step does not make sense.
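
A small sketch of that, with a one-layer stand-in for the real network (the Conv2d here is only illustrative):

import torch
import torch.nn as nn

model = nn.Conv2d(3, 60, kernel_size=1)        # last layer outputs raw logits, no activation
loss_function = nn.CrossEntropyLoss()

images = torch.randn(10, 3, 224, 224)
masks = torch.randint(0, 60, (10, 224, 224))   # ground truth: [N, H, W], one class index per pixel

outputs = model(images)                        # [10, 60, 224, 224] raw logits
loss = loss_function(outputs, masks)           # CrossEntropyLoss applies log-softmax internally
preds = torch.argmax(outputs, dim=1)           # [10, 224, 224]; argmax belongs on the 60-channel output, not the 1-channel mask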

Thank you for correcting me. But is there a way I can solve this error?

RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4

This is the error I got before adding the argmax.

Just convert batch_mask_all into a 3D tensor by squeezing dimension 1.
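
That is, something like this (with a dummy tensor standing in for your mask batch):

import torch

batch_mask_all = torch.zeros(10, 1, 224, 224)   # [N, 1, H, W] mask as loaded
batch_mask_all = batch_mask_all.squeeze(1)      # [N, H, W], the 3D shape CrossEntropyLoss expects
print(batch_mask_all.shape)                     # torch.Size([10, 224, 224])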

Thank you again! I will try this.

Hey! I am really grateful for your patience in answering my questions. I am really new to this field and would really like to make my current project work. Anyway, I tried squeezing dimension 1 but I got this error. Do you know what this means? Some forums said I should convert my target's data type, but I am not confident about doing it.


RuntimeError: expected scalar type Long but found Float

Yes, the target for CrossEntropyLoss must be a Long tensor, while the input must be a Float tensor.
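
In code, that is just a cast on the target, for example (again with a dummy tensor in place of your mask batch):

import torch

batch_mask_all = torch.zeros(10, 224, 224)      # squeezed mask, still float32
batch_mask_all = batch_mask_all.long()          # CrossEntropyLoss needs int64 (Long) class indices
print(batch_mask_all.dtype)                     # torch.int64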

Thank you so much for your help! I will edit my code. Hopefully it will work this time.

Hi! I am back again. I tried converting my input and target data types to Float and Long, respectively, but I got an error again:

IndexError                                Traceback (most recent call last)
<ipython-input-15-0148a2dd2dca> in <module>()
     21   return batch_img_all, batch_mask_all, outputs
     22 
---> 23 IMG, MSK, OUTPUT = train(model)

4 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2264         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2265     elif dim == 4:
-> 2266         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2267     else:
   2268         # dim == 3 or dim > 4

IndexError: Target 60 is out of bounds.

I suspect the value 60 has something to do with my class indices, since I have 60 classes. But I cannot understand why I get this error when I have already declared my number of classes. Can you help me again? Thank you so much.

This is because, if you have N foreground classes (1, 2, 3, …, N), your model must produce a segmentation map with N+1 channels, in which the first channel is for the background class (0).
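
In other words, if the mask can contain labels 0 through 60, the tensor passed to CrossEntropyLoss must have 61 channels; roughly:

import torch
import torch.nn as nn

NUM_CLASSES = 60 + 1                                      # 60 foreground classes + background
loss_function = nn.CrossEntropyLoss()

outputs = torch.randn(10, NUM_CLASSES, 224, 224)          # 61 channels, so class index 60 is valid
targets = torch.randint(0, NUM_CLASSES, (10, 224, 224))   # labels in [0, 60]

loss = loss_function(outputs, targets)                    # no IndexError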

The solution I applied was to declare a total of 61 classes. Does my solution make sense? By the way, I tried training for just 1 epoch to see if my solution would make the code work, and it gave me this result:

100%|██████████| 150/150 [16:06<00:00,  6.45s/it]Epoch: 0. Loss: 4.002954483032227

Is it logical to have this kind of value for the loss? What should the loss value range be?
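
For a rough sense of scale: a model that assigns equal probability to all 61 classes gives a cross-entropy of ln(61) ≈ 4.11, so a value around 4.0 after a single epoch is at least plausible. A quick check:

import math
import torch
import torch.nn as nn

print(math.log(61))                                        # ~4.111, loss of a uniform prediction over 61 classes

uniform_logits = torch.zeros(1, 61, 224, 224)              # equal logits -> uniform probabilities
targets = torch.randint(0, 61, (1, 224, 224))
print(nn.CrossEntropyLoss()(uniform_logits, targets))      # ~4.111 as well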