Finetune semantic segmentation model on our dataset

Hi there,

Do you have a tutorial or guidance on how to fine-tune the pretrained semantic segmentation models in torchvision 0.3 (FCN or DeepLabV3 with a ResNet-50 or ResNet-101 backbone) on our own dataset (transfer learning for semantic segmentation)?

The general logic should be the same for the classification and segmentation use cases, so I would just stick to the Finetuning tutorial.
The main differences are the output shape (pixel-wise classification in the segmentation use case) and the transformations (make sure to apply the same random transformations, e.g. crops, to both the input image and the target mask).

What about this tutorial:

This one looks even better for your use case! :slight_smile:

I’ve written a tutorial on how to fine-tune DeepLabv3 for semantic segmentation in PyTorch. The tutorial link is:

The github repo link is:


Hey, your notebook is not loading; it's giving an error. Please share an updated link.

I am following your link. I have a few queries on this:

  1. I have 3 classes. Even though I tried to change the number of out_channels to 3, model.eval() still shows the final layer with 21 classes. Please clarify this.

  2. The tutorial uses a batch size of 4. I am trying to train on a single image, so when I change the batch size to 1, it throws an error.

  3. Can you help me decode the result into a mask? I don't know how to do that.

Please help in this regard. Thank you.
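For later readers: the third point, decoding the raw output into a mask, comes down to an argmax over the class dimension. A minimal sketch, using dummy logits in place of the model's `"out"` tensor:

```python
import torch

# Dummy logits standing in for one image from the model's "out"
# tensor: shape [num_classes, H, W] after removing the batch dim.
num_classes, H, W = 3, 4, 4
logits = torch.randn(num_classes, H, W)

# Predicted mask: the class with the highest score per pixel.
pred_mask = logits.argmax(dim=0)  # shape [H, W], values in 0..2

# Optional: map class indices to RGB colors for visualization.
palette = torch.tensor([[0, 0, 0],       # class 0 -> black
                        [255, 0, 0],     # class 1 -> red
                        [0, 255, 0]])    # class 2 -> green
color_mask = palette[pred_mask]          # shape [H, W, 3]
```

The batch-size-1 error in the second point is most likely the BatchNorm layers, which cannot compute batch statistics from a single sample in training mode.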

I am following your tutorials, and I get the following error when I run the code … Can you tell me where the problem is?