I’m trying to fine-tune VGG16. Here is its classifier:

(I have changed the last output layer.)

```
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace)
(2): Dropout(p=0.5)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace)
(5): Dropout(p=0.5)
(6): Linear(in_features=4096, out_features=1, bias=True)
)
```

My question is that when I use the model to predict, its output is **not bounded to any range**. Actually, I got

```
tensor([[ 0.9261],
[ 0.6800],
[ 0.5750],
[ 0.5498],
[ 0.6597],
[ 0.7453],
[ 0.5137],
[ 0.6788],
[ 1.0495],
[ 0.7216],
[-0.2671],
...
```

And I use **nn.BCEWithLogitsLoss()**

(Is *nn.BCEWithLogitsLoss()* better than *nn.BCELoss()*?)

so I can’t (shouldn’t) use `output = torch.sigmoid(output)`, and there is no softmax layer in the model. What is the correct way to get the *accuracy*? (The labels are 0 or 1.)

The way I can think of is:

```
output = torch.sigmoid(output)
if 0 <= output < 0.5:
    pred = 0  # prediction is label 0
else:
    pred = 1  # prediction is label 1
```
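In batched form, this idea would look something like the sketch below (the logits and labels are made-up example values, not real model outputs):

```python
import torch

# made-up batch of raw model outputs (logits) and binary labels
output = torch.tensor([[0.9261], [0.6800], [-0.2671]])
labels = torch.tensor([[1.0], [1.0], [1.0]])

probs = torch.sigmoid(output)           # squash logits into (0, 1)
preds = (probs >= 0.5).float()          # label 0 if prob < 0.5, else label 1
accuracy = (preds == labels).float().mean()
print(accuracy)                         # fraction of correct predictions
```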

But this means I

- get the value of **loss** from the raw **output** data
- get the value of **accuracy** from **torch.sigmoid(output)**

Can I do it like this? Doesn’t that mean the loss and the accuracy come from **different data**, which is not in line with mathematical logic?

Yes, your own answer makes sense.

But you can do it more simply: comparing the output of the sigmoid to `0.5` is equivalent to comparing the input of the sigmoid to `0` (see Wikipedia)! So you don’t need to call `.sigmoid()`; just check where `output < 0`.
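As a quick check of that equivalence (with made-up example logits):

```python
import torch

logits = torch.tensor([[0.9261], [0.6800], [-0.2671]])  # example raw outputs

preds_from_probs = (torch.sigmoid(logits) >= 0.5).long()  # threshold probabilities at 0.5
preds_from_logits = (logits >= 0).long()                  # threshold raw logits at 0

# both give the same predictions, since sigmoid(0) == 0.5
print(torch.equal(preds_from_probs, preds_from_logits))  # True
```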


Oh…

> comparing the output of the sigmoid to `0.5` is equivalent to comparing the input of the sigmoid to `0`

Yes, thank you for your suggestion.

But why is

- getting the value of **loss** from the raw **output** data
- getting the value of **accuracy** from **torch.sigmoid(output)**

the right way to compute the loss and the accuracy? Shouldn’t we get both the **loss** and the **accuracy** from **the same data**?

Because calling `nn.BCEWithLogitsLoss` is equivalent to calling `nn.Sigmoid` first and then `nn.BCELoss`. It’s just that those two are so often called one after the other that they designed `nn.BCEWithLogitsLoss` to do both, and better.

To cite the docs:

> *This loss combines a Sigmoid layer and the BCELoss in one single class. This version is more numerically stable than using a plain Sigmoid followed by a BCELoss as, by combining the operations into one layer, we take advantage of the log-sum-exp trick for numerical stability.*

(no idea what it means)

So in both cases (for the loss and for the accuracy) it’s the data resulting from the sigmoid that is taken into account. It’s just hidden beneath a trick for the loss, and for the accuracy you don’t really need to compute it, since you can just compare the raw logits with `0`.
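A quick sketch to verify that the fused loss matches the two-step version on some random data (the shapes and seed here are arbitrary):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 1)                     # raw model outputs
targets = torch.randint(0, 2, (8, 1)).float()  # binary labels

# fused: sigmoid + BCE in one numerically stable op
loss_fused = nn.BCEWithLogitsLoss()(logits, targets)
# split: explicit sigmoid, then plain BCE
loss_split = nn.BCELoss()(torch.sigmoid(logits), targets)

print(torch.allclose(loss_fused, loss_split, atol=1e-6))  # True
```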


Oh…I got it…

Truly appreciate your timely help