Hi,
I have set up a pretrained resnet50 with nn.DataParallel for a multi-class problem and use nn.CrossEntropyLoss():
model = models.resnet50(pretrained=True)
model = torch.nn.DataParallel(model)
for p in model.parameters():
    p.requires_grad = False
num_ftrs = model.module.fc.in_features
model.module.fc = nn.Linear(num_ftrs, num_classes)
model = model.to(device)
However, I’m unsure how to use BCE loss. I have read that it is better to use nn.BCELoss() for two classes, but I don’t know whether I need to define a sigmoid layer, and if so, where to put it.
Cheers,
T
The docs will give you some information about these loss functions as well as small code snippets.
For a binary classification, you could either use nn.BCE(WithLogits)Loss and a single output unit, or nn.CrossEntropyLoss and two outputs. Usually nn.CrossEntropyLoss is used for multi-class classification, but you can treat the binary case as a 2-class classification; it’s up to you which approach you would like to use.
If you use the former approach, we generally recommend nn.BCEWithLogitsLoss and passing raw logits to this criterion, as it yields better numerical stability than sigmoid + nn.BCELoss. The latter approach also expects raw logits, which can be passed to nn.CrossEntropyLoss.
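A minimal sketch of both setups (batch size and tensor shapes are illustrative, not from the original post):

```python
import torch
import torch.nn as nn

batch_size = 4

# Approach 1: a single output unit + nn.BCEWithLogitsLoss.
# The model emits raw logits of shape [N, 1]; no sigmoid layer is needed,
# and the targets are floats in {0., 1.} with the same shape.
logits_1 = torch.randn(batch_size, 1)
targets_1 = torch.randint(0, 2, (batch_size, 1)).float()
loss_1 = nn.BCEWithLogitsLoss()(logits_1, targets_1)

# Approach 2: two output units + nn.CrossEntropyLoss.
# The model emits raw logits of shape [N, 2]; no softmax layer is needed,
# and the targets are class indices (LongTensor) in {0, 1} of shape [N].
logits_2 = torch.randn(batch_size, 2)
targets_2 = torch.randint(0, 2, (batch_size,))
loss_2 = nn.CrossEntropyLoss()(logits_2, targets_2)

print(loss_1.item(), loss_2.item())
```

Note the different target dtypes and shapes: float targets matching the logits for the BCE variant, long class indices for nn.CrossEntropyLoss.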
Ah ok, cool. So I’m under the assumption that softmax is automatically computed inside nn.CrossEntropyLoss, and nn.BCE(WithLogits)Loss will also compute the sigmoid internally, whereas for nn.BCELoss you need to apply the sigmoid first? And if so, can I send the output of sigmoid + nn.BCELoss to nn.CrossEntropyLoss?
No, this wouldn’t work, since nn.BCELoss already calculates the loss. nn.CrossEntropyLoss expects a model output and targets, not another loss.
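To illustrate that point (shapes are illustrative): nn.BCELoss already reduces to a single scalar, which is not something nn.CrossEntropyLoss can consume, and sigmoid + nn.BCELoss gives the same value as nn.BCEWithLogitsLoss applied to the raw logits:

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 1)                    # raw model outputs
targets = torch.randint(0, 2, (4, 1)).float()

# nn.BCELoss expects probabilities, so the sigmoid is applied first;
# the result is a 0-dim scalar loss, not logits you could pass onward.
loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)

# nn.BCEWithLogitsLoss fuses the sigmoid into the loss for numerical stability.
loss_b = nn.BCEWithLogitsLoss()(logits, targets)

print(loss_a.dim(), torch.allclose(loss_a, loss_b))  # 0 True
```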
Apologies, I just understood what you meant! Many thanks for your help.