Hi,
Thanks for your quick reply! Yes, that was helpful, but unfortunately it didn't solve my problem yet. The PyTorch fine-tuning tutorial deals mainly with classification, and I have some trouble transferring this to my segmentation task.
It would be great if you could help me again.
I want to use a pretrained DeepLabV3:
dlv3 = models.segmentation.deeplabv3_resnet101(pretrained=True)
which is trained on 21 categories; 2 would be enough for me. In a first phase my network should only segment streets in an image, and everything else can be background.
I think I need to change the last layers of the model (the classifier part), but I am not sure how, and in particular which ones:
(classifier): DeepLabHead(
  (0): ASPP(
    (convs): ModuleList(
      (0): Sequential(
        (0): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (1): ASPPConv(
        (0): Conv2d(2048, 256, kernel_size=(3, 3), stride=(1, 1), padding=(12, 12), dilation=(12, 12), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (2): ASPPConv(
        (0): Conv2d(2048, 256, kernel_size=(3, 3), stride=(1, 1), padding=(24, 24), dilation=(24, 24), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (3): ASPPConv(
        (0): Conv2d(2048, 256, kernel_size=(3, 3), stride=(1, 1), padding=(36, 36), dilation=(36, 36), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (2): ReLU()
      )
      (4): ASPPPooling(
        (0): AdaptiveAvgPool2d(output_size=1)
        (1): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3): ReLU()
      )
    )
    (project): Sequential(
      (0): Conv2d(1280, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
      (3): Dropout(p=0.5, inplace=False)
    )
  )
  (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (3): ReLU()
  (4): Conv2d(256, 21, kernel_size=(1, 1), stride=(1, 1))
)
(aux_classifier): FCNHead(
  (0): Conv2d(1024, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (2): ReLU()
  (3): Dropout(p=0.1, inplace=False)
  (4): Conv2d(256, 21, kernel_size=(1, 1), stride=(1, 1))
)
For each training image I have a dense semantic label id map (see below).
And then: is it enough to set requires_grad = False on all parameters (except the new classifier layers) and train the complete model with the 3-channel RGB image as input and the dense label id map as target? What else do I have to consider?
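In code, the freezing I have in mind would look roughly like this — a sketch that assumes the torchvision layout above, where `model.classifier[4]` and `model.aux_classifier[4]` are the replaced 1x1 convolutions, and that only these two layers should stay trainable:

```python
import torch

def freeze_all_but_heads(model: torch.nn.Module) -> list:
    """Freeze every parameter except the two replaced 1x1 head convs.

    Assumes the torchvision DeepLabV3 layout where model.classifier[4]
    and model.aux_classifier[4] are the final per-class convolutions.
    """
    for param in model.parameters():
        param.requires_grad = False
    for head in (model.classifier, model.aux_classifier):
        for param in head[4].parameters():
            param.requires_grad = True
    # Return only the still-trainable parameters for the optimizer
    return [p for p in model.parameters() if p.requires_grad]

# Usage with the model from above would then be:
#   optimizer = torch.optim.SGD(freeze_all_but_heads(dlv3), lr=1e-3, momentum=0.9)

# For the dense label id map: CrossEntropyLoss takes the raw [N, 2, H, W]
# logits from output["out"] and a [N, H, W] long tensor with values {0, 1}
# as target -- no one-hot encoding needed.
criterion = torch.nn.CrossEntropyLoss()
```

Would that be the correct way to pass only the unfrozen parameters to the optimizer, and is CrossEntropyLoss the right loss for the label id map?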
Unfortunately I am a little bit lost right now. All beginnings are difficult, I have to admit… :-/
Thank you very much in advance!
BR