Need help with first steps in semantic segmentation

Hi,

I want to build my first neural network for semantic segmentation and I am looking for
a good base to start from, but all the repos I have found on GitHub so far were poorly explained
and/or had a lot of open issues. I would also like to use a pretrained model.

Do you have any recommendations for a good repo, or any advice on the
first steps I should take? Thank you very much in advance!

BR

This has been discussed previously here; that may be of some help to you.

Hi,

Thanks for your quick reply! Yes, this was helpful, but unfortunately it didn’t solve my problem. The PyTorch tutorial for finetuning deals mainly with classification, and I am having trouble transferring it to my segmentation task.

It would be great if you could help me again.

I want to use a pretrained DeepLabV3:

dlv3 = models.segmentation.deeplabv3_resnet101(pretrained=True)

which is trained on 21 categories. Two would be enough for me: in the first phase my network only needs to segment streets in an image, and everything else can be background.

I think I need to change the last layers of the model (the classifier part), but I am not sure how, and in particular which ones:


(classifier): DeepLabHead(
  (0): ASPP(
    (convs): ModuleList(
      (0): Sequential(
        (0): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (1): ASPPConv(
        (0): Conv2d(2048, 256, kernel_size=(3, 3), stride=(1, 1), padding=(12, 12), dilation=(12, 12), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (2): ASPPConv(
        (0): Conv2d(2048, 256, kernel_size=(3, 3), stride=(1, 1), padding=(24, 24), dilation=(24, 24), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (3): ASPPConv(
        (0): Conv2d(2048, 256, kernel_size=(3, 3), stride=(1, 1), padding=(36, 36), dilation=(36, 36), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (4): ASPPPooling(
        (0): AdaptiveAvgPool2d(output_size=1)
        (1): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3): ReLU()
      )
    )
    (project): Sequential(
      (0): Conv2d(1280, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
      (3): Dropout(p=0.5, inplace=False)
    )
  )
  (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (3): ReLU()
  (4): Conv2d(256, 21, kernel_size=(1, 1), stride=(1, 1))
)
(aux_classifier): FCNHead(
  (0): Conv2d(1024, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU()
  (3): Dropout(p=0.1, inplace=False)
  (4): Conv2d(256, 21, kernel_size=(1, 1), stride=(1, 1))
)


For each training image I have a dense semantic label ID map (see below).

And then: is it enough to set requires_grad = False on all parameters (except those of the new classifier layers) and train the complete model with 3-channel RGB images as input and the dense label ID maps as targets? What else do I have to think about?
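A rough sketch of the setup described in this question, for anyone reading along. It assumes the model returns a dict with an "out" entry (and optionally "aux"), as the torchvision segmentation models do, and that the label map is a LongTensor of shape (N, H, W) holding class ids. The helper names `freeze_all_but_heads` and `train_step` and the `aux_weight` value are made up for illustration; they do not come from any tutorial.

```python
import torch
import torch.nn as nn


def freeze_all_but_heads(model: nn.Module, head_names=("classifier", "aux_classifier")) -> list:
    """Set requires_grad=False everywhere except in the named top-level heads.

    Returns the parameters that stay trainable, ready to hand to an optimizer.
    """
    for name, param in model.named_parameters():
        # Parameter names look like "backbone.layer1.0.conv1.weight".
        param.requires_grad = name.split(".")[0] in head_names
    return [p for p in model.parameters() if p.requires_grad]


def train_step(model, optimizer, images, targets, aux_weight=0.4):
    """Run one optimisation step and return the loss as a float.

    images:  float tensor, shape (N, 3, H, W)
    targets: long tensor, shape (N, H, W), values in [0, num_classes)
    aux_weight is an arbitrary illustrative value (assumption).
    """
    model.train()
    criterion = nn.CrossEntropyLoss()  # accepts (N, C, H, W) logits with (N, H, W) targets
    outputs = model(images)            # dict with "out" and possibly "aux"
    loss = criterion(outputs["out"], targets)
    if "aux" in outputs:
        loss = loss + aux_weight * criterion(outputs["aux"], targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Two things to keep in mind with this sketch: `model.train()` also lets the frozen BatchNorm layers update their running statistics, and DeepLabV3 in training mode needs a batch size larger than 1, because the pooling branch of the ASPP module applies BatchNorm to a 1x1 feature map.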

Unfortunately I am a little lost right now. Every beginning is hard, as I am finding out… :-/

Thank you very much in advance!

BR

In the tutorial on the previously linked page, under the header "DeepLabv3 Model", they give a custom function that changes the ‘head’ of the network to predict however many classes you want. Their example has only 2 classes, background and something else, which is exactly what you’re doing. So I would stick close to their tutorial.

Hope that helps!

Hi

and again, thanks for your quick response. I really appreciate it. Yes, you are right, this is indeed helpful, but unfortunately it is not really the kind of tutorial I am looking for. You can copy it, change a few parameters and let it run, but it doesn’t really help you understand the whole thing better. That is only my opinion, of course.

Thanks again and I hope my next questions will be more concrete. :-)

Best regards

Hello,
Were you able to change the number of output channels in the last layer? I am also trying to train a DeepLabV3 model on my own dataset. I am following this link:


But the result still shows 21 output channels. Can anyone please help me out with this? My dataset has 3 class labels.
If possible, please also suggest another GitHub repo or a link for performing semantic segmentation.
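One way to narrow this down is to swap only the final 1x1 convolution of each head (layer (4) in the printout earlier in this thread) and then check what the model actually returns. The following is a sketch under the assumption that both heads are `nn.Sequential` subclasses, as in torchvision; `set_num_classes` and `output_channels` are hypothetical helper names.

```python
import torch
import torch.nn as nn


def set_num_classes(model: nn.Module, num_classes: int) -> nn.Module:
    """Swap the last conv of `classifier` and `aux_classifier` in place."""
    for head_name in ("classifier", "aux_classifier"):
        head = getattr(model, head_name, None)
        if head is None:  # e.g. a model built without an auxiliary head
            continue
        idx = len(head) - 1
        old = head[idx]   # Conv2d(256, 21, kernel_size=(1, 1)) in the printout
        head[idx] = nn.Conv2d(old.in_channels, num_classes, kernel_size=old.kernel_size)
    return model


def output_channels(model: nn.Module, size=(64, 64)) -> dict:
    """Return the channel count of every entry in the model's output dict."""
    model.eval()
    with torch.no_grad():
        out = model(torch.zeros(1, 3, *size))
    return {key: tensor.shape[1] for key, tensor in out.items()}
```

If `output_channels(model)` still reports 21 after calling `set_num_classes(model, 3)`, it is worth checking that the model being evaluated is the same object that was modified, and that a checkpoint with the old 21-class head is not being loaded afterwards.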