Hello everyone! I wanted to clarify a doubt I have regarding the vgg16 network. I am currently using the pre-trained vgg16 network for a classification problem with 2 labels. I already have the best weights for this problem, trained with nn.CrossEntropyLoss as the criterion, and I can get a prediction by doing:
outputs = vgg16(net_img)
_, preds = torch.max(outputs, 1)
However, my goal is not to have a binary prediction (0 or 1), but the probability and also the cross entropy metric for each class. I wanted to check if what I am doing makes sense.
To get the probability of each class I am doing:
probabilities = torch.sigmoid(outputs)
And to get the cross entropy of each class I am simply using the outputs I already calculated.
Does this make sense? If it helps, I based my work in this tutorial: https://www.kaggle.com/carloalbertobarbano/vgg16-transfer-learning-pytorch
Thank you in advance!
To get the probabilities, you should probably use probs = F.softmax(outputs, dim=1), since you are using nn.CrossEntropyLoss as the criterion, which means your output should have the shape [batch_size, 2] (as used in multi-class classification).
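To make this concrete, here is a small sketch with hypothetical logits (the tensor values are made up for illustration) showing that F.softmax turns the [batch_size, 2] outputs into per-class probabilities that sum to 1, while the predicted class stays the same as with torch.max on the raw outputs:

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for a batch of 3 images and 2 classes,
# i.e. the raw model output with shape [batch_size, 2].
outputs = torch.tensor([[2.0, -1.0],
                        [0.5, 0.5],
                        [-3.0, 1.0]])

# Softmax over the class dimension (dim=1) turns logits into probabilities.
probs = F.softmax(outputs, dim=1)

# Each row now sums to 1, so probs[:, 1] is the probability of class 1.
print(probs.sum(dim=1))

# The argmax over probabilities equals the argmax over the raw logits,
# since softmax is monotonic per row.
_, preds = torch.max(probs, dim=1)
```

Note that softmax does not change the prediction itself, only the scale of the scores, so you can keep your existing torch.max-based prediction code unchanged.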
Thank you for your help! But without the F.softmax layer, the direct output is the cross entropy value, right? Since I am using CrossEntropyLoss as the criterion.
Without the softmax the output would contain logits. I'm not sure if that's what "cross entropy value" refers to.
Could you explain a bit what you mean by this?
It will not contain the cross entropy (loss) between the output and target, as this will be returned by the criterion.
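To show the distinction, here is a sketch with hypothetical logits and targets: the model output is just raw logits, while nn.CrossEntropyLoss computes the cross entropy from them (internally log_softmax followed by negative log likelihood), averaged over the batch:

```python
import torch
import torch.nn.functional as F

# Hypothetical logits (shape [batch_size, 2]) and targets for illustration.
outputs = torch.tensor([[2.0, -1.0],
                        [0.5, 1.5]])
targets = torch.tensor([0, 1])

criterion = torch.nn.CrossEntropyLoss()
loss = criterion(outputs, targets)  # scalar: mean cross entropy over the batch

# Per-sample cross entropy: the negative log of the softmax
# probability assigned to the target class.
log_probs = F.log_softmax(outputs, dim=1)
per_sample = -log_probs[torch.arange(len(targets)), targets]

# The criterion's value is the mean of the per-sample losses.
print(torch.allclose(loss, per_sample.mean()))  # True
```

If you want the per-sample values directly, F.cross_entropy(outputs, targets, reduction='none') returns the same per_sample tensor.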
Sorry, I didn’t explain well. Basically I wanted to evaluate my model with something similar to this: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html
If I understand the docs correctly, probabilities are expected as the model output, so it seems that applying a softmax should work.
However, this part of the docs:
the probabilities provided are assumed to be that of the positive class
sounds as if the loss function is dealing with a multi-label output (each output would give a probability in
[0, 1] for a separate class)? Can you verify it?
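For the binary case this isn't a multi-label setup: per the sklearn docs, a 1-d y_pred is interpreted as the probability of the positive class, and a full [n_samples, 2] matrix gives the same result. A small numpy check (hypothetical values, reproducing log_loss's formula rather than calling sklearn) shows both forms agree for a binary softmax output:

```python
import numpy as np

# Hypothetical binary labels and class probabilities (rows sum to 1),
# e.g. obtained via F.softmax(outputs, dim=1).numpy().
y_true = np.array([0, 1, 1, 0])
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.4, 0.6],
                  [0.7, 0.3]])

# Full-matrix form: average of -log(probability of the true class).
full = -np.log(probs[np.arange(len(y_true)), y_true]).mean()

# 1-d form, as the docs describe: p is P(class 1), so 1 - p is P(class 0).
p = probs[:, 1]
one_d = -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)).mean()

print(np.isclose(full, one_d))  # True
```

The two agree because each row of the softmax output sums to 1, so the class-0 column is exactly 1 - p; passing probs[:, 1] to sklearn.metrics.log_loss should therefore match passing the full matrix.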