How to get the accuracy without a softmax layer?

I’m trying to fine-tune VGG16. Here is its classifier
(I have already changed the last output layer):

(classifier): Sequential(
  (0): Linear(in_features=25088, out_features=4096, bias=True)
  (1): ReLU(inplace)
  (2): Dropout(p=0.5)
  (3): Linear(in_features=4096, out_features=4096, bias=True)
  (4): ReLU(inplace)
  (5): Dropout(p=0.5)
  (6): Linear(in_features=4096, out_features=1, bias=True)
)
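
For reference, changing that last layer can be done with something like this (a minimal sketch using torchvision’s VGG16 — not necessarily exactly what I did, and the pretrained loading call may differ across torchvision versions):

import torch.nn as nn
from torchvision import models

# Load the pretrained VGG16 and swap the final 1000-class layer
# for a single-logit output (binary classification).
model = models.vgg16(pretrained=True)
model.classifier[6] = nn.Linear(4096, 1)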

My question is: when I use the model to predict, its outputs are unbounded. For example, I got

tensor([[ 0.9261],
        [ 0.6800],
        [ 0.5750],
        [ 0.5498],
        [ 0.6597],
        [ 0.7453],
        [ 0.5137],
        [ 0.6788],
        [ 1.0495],
        [ 0.7216],
        [-0.2671],
...

I’m using nn.BCEWithLogitsLoss()
(is nn.BCEWithLogitsLoss() better than nn.BCELoss()?),
so I can’t (shouldn’t) call output = torch.sigmoid(output) before the loss, and there is no softmax layer in the model. What is the correct way to get the accuracy? (The labels are 0 or 1.)

The way I can think of is:

output = torch.sigmoid(output)
if 0 <= output < 0.5:
    prediction = 0  # prediction is label 0
else:
    prediction = 1  # prediction is label 1

But this means

  • the loss is computed from the raw output,
  • the accuracy is computed from torch.sigmoid(output).

Can I do it like this? Since the loss and the accuracy would be computed from different data, isn’t that mathematically inconsistent?

Yes, your own answer makes sense :wink:
But you can do it even more simply: comparing the output of the sigmoid to 0.5 is equivalent to comparing the input of the sigmoid to 0 (see Wikipedia).
So you don’t need to call .sigmoid() at all; just check where output < 0.
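
For example, something like this (a small self-contained sketch; the tensors are made-up examples, not your real data):

import torch

logits = torch.tensor([[0.9261], [0.6800], [-0.2671]])  # raw model outputs (batch_size, 1)
labels = torch.tensor([1., 1., 0.])                      # ground-truth 0/1 labels

# sigmoid(x) >= 0.5 exactly when x >= 0, so threshold the logits directly
preds = (logits.squeeze(1) >= 0).float()
accuracy = (preds == labels).float().mean()
print(accuracy)  # tensor(1.)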


Oh…

comparing the output of the sigmoid to 0.5 is equivalent to comparing the input of the sigmoid to 0

Yes, thank you for your suggestion :blush:
But why is

  • computing the loss from the raw output, and
  • computing the accuracy from torch.sigmoid(output)

the right way to get the loss and the accuracy? :flushed:
Shouldn’t we compute both the loss and the accuracy from the same data?

Because calling nn.BCEWithLogitsLoss is equivalent to calling nn.Sigmoid first and then nn.BCELoss.
Those two are so often called one after the other that nn.BCEWithLogitsLoss was designed to do both in one step, and do it better.
To quote the docs:
This loss combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss as, by combining the operations into one layer, we take advantage of the log-sum-exp trick for numerical stability.
(no idea what that means)

So in both cases (for the loss and for the accuracy) it is the output of the sigmoid that is taken into account. It’s just hidden behind a trick for the loss, and for the accuracy you don’t really need to compute it at all, since you can just compare the raw logits with 0.
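
If you want to convince yourself, here is a tiny sketch (with made-up numbers) showing that feeding the raw logits to nn.BCEWithLogitsLoss gives the same value as applying torch.sigmoid and then nn.BCELoss:

import torch
import torch.nn as nn

logits = torch.tensor([[0.9261], [0.6800], [-0.2671]])
targets = torch.tensor([[1.], [1.], [0.]])

loss_fused = nn.BCEWithLogitsLoss()(logits, targets)       # sigmoid + BCE in one numerically stable op
loss_split = nn.BCELoss()(torch.sigmoid(logits), targets)  # same math, done in two steps
print(loss_fused.item(), loss_split.item())  # both print (approximately) the same value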


Oh… I got it… :flushed:
Truly appreciate your timely help :smile: