Add sigmoid layer

Hello,
I'm using a pretrained model and I need to add a classifier layer, but I don't understand how:
1 - Does a Linear layer apply softmax automatically?
2 - Can I use a Linear layer and add a Softmax layer after it?

model = models.video.mc3_18(pretrained=True, progress=True)
set_parameter_requires_grad(model, feature_extract)
# change the output FC layer
model.fc = nn.Sequential(nn.Linear(512, 256),
                         nn.ReLU(),
                         nn.Linear(256, num_classes),
                         nn.Softmax(dim=1))

3 - Is it OK to use cross-entropy loss?
Thank you :slight_smile:

  1. Does a Linear layer apply softmax automatically?

No, a Linear layer just applies a weight matrix and a bias to the inputs; it does not normalize its outputs.
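For example (a quick sanity check, with a made-up 10-class head), the raw outputs of nn.Linear can be negative and don't sum to 1 until you apply softmax yourself:

import torch
import torch.nn as nn

fc = nn.Linear(512, 10)                 # hypothetical 10-class head
x = torch.randn(4, 512)                 # batch of 4 feature vectors
logits = fc(x)                          # raw scores, no normalization applied
probs = torch.softmax(logits, dim=1)    # rows only sum to 1 after softmax
print(logits.sum(dim=1))                # arbitrary values
print(probs.sum(dim=1))                 # tensor([1., 1., 1., 1.])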

  2. Can I use a Linear layer and add a Softmax layer after it?

It is best that you don't, because it can cause a problem when calculating the loss. You could add a condition in your model's forward method so it only applies the softmax operation when it is not training:

if self.training:
    return output
else:
    return torch.nn.functional.softmax(output, dim=1)
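Putting that together, here is a minimal sketch (the module name and sizes are made up for illustration, not your exact model):

import torch.nn as nn
import torch.nn.functional as F

class ClassifierHead(nn.Module):        # hypothetical wrapper, for illustration
    def __init__(self, num_classes):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(512, 256),
                                nn.ReLU(),
                                nn.Linear(256, num_classes))

    def forward(self, x):
        output = self.fc(x)
        if self.training:
            return output                # raw logits for CrossEntropyLoss
        return F.softmax(output, dim=1)  # probabilities only at eval time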
  3. Is it OK to use cross-entropy loss?

In order to train, you will need to give the raw output of nn.Linear(256, num_classes) to the cross-entropy loss object, since the implementation applies LogSoftmax internally when calculating the loss: nn.CrossEntropyLoss
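For example, a minimal sketch with made-up shapes (note the targets are class indices, not one-hot vectors):

import torch
import torch.nn as nn

num_classes = 5                                           # example value
criterion = nn.CrossEntropyLoss()                         # applies LogSoftmax + NLLLoss internally
logits = torch.randn(4, num_classes, requires_grad=True)  # stands in for the raw fc output
targets = torch.randint(0, num_classes, (4,))             # class indices per sample
loss = criterion(logits, targets)                         # no softmax on logits beforehand
loss.backward()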

Thank you @Manuel_Alejandro_Dia,
is it possible to use this head?

model.fc = nn.Sequential(nn.Linear(512, 256),
                         nn.Softmax(dim=1))

Glad I could help! :grin:

You shouldn't use that head during training, since it ends in a Softmax layer.

I would recommend something like:

model.fc = nn.Sequential(nn.Linear(512, 256),
                         nn.ReLU(),
                         nn.Linear(256, num_classes),
                         nn.ReLU())

Remember that you don't have to add the softmax operation in your nn.Sequential, since it would interfere with the softmax applied inside CrossEntropyLoss.


Does ReLU() generate probabilities? Can I use the max probability as the predicted class?

What ReLU does is filter out the negative values; it does not produce probabilities. If you just want to get the predicted class, you can apply output.argmax(1) (or max, as you said).
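For example (a quick sketch with random logits), argmax gives the same class whether you take it on the raw outputs or on the softmax probabilities, since softmax is monotonic:

import torch

output = torch.randn(4, 10)             # e.g. logits for a batch of 4, 10 classes
pred = output.argmax(1)                 # predicted class per sample
probs = torch.softmax(output, dim=1)    # optional, only if you need probabilities
assert torch.equal(pred, probs.argmax(1))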