I have defined a pretrained ResNet50 wrapped in DataParallel, with multiple output classes, and use nn.CrossEntropyLoss().
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)
model = torch.nn.DataParallel(model)

# freeze the pretrained backbone
for p in model.parameters():
    p.requires_grad = False

# replace the classifier head (access .fc via .module after wrapping in DataParallel)
num_ftrs = model.module.fc.in_features
model.module.fc = nn.Linear(num_ftrs, num_classes)
model = model.to(device)
However, I’m unsure of how to use BCE Loss. I have read that it is better to use nn.BCELoss() for two classes, however, I don’t know if I need to define a sigmoid layer. And if so, where to put this layer…
The docs will give you some information about these loss functions as well as small code snippets.
For a binary classification, you could either use
nn.BCE(WithLogits)Loss and a single output unit or
nn.CrossEntropyLoss and two outputs.
nn.CrossEntropyLoss is used for multi-class classification, but you can treat the binary use case as a 2-class classification; it's up to you which approach you prefer.
If you are using the former approach, we generally recommend using
nn.BCEWithLogitsLoss and passing raw logits to this criterion, as it will yield better numerical stability than applying a sigmoid followed by nn.BCELoss.
The latter use case also expects raw logits, which can be passed to nn.CrossEntropyLoss directly.
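A minimal sketch contrasting the two approaches (the batch size and dummy logits are assumptions; in practice the logits would come from the model's fc layer):

```python
import torch
import torch.nn as nn

batch = 4  # assumed batch size for illustration

# Approach 1: single output unit + nn.BCEWithLogitsLoss (sigmoid fused inside)
logits_bin = torch.randn(batch, 1)                    # raw logits, no sigmoid applied
target_bin = torch.randint(0, 2, (batch, 1)).float()  # BCE expects float targets
loss_bce = nn.BCEWithLogitsLoss()(logits_bin, target_bin)

# Approach 2: two output units + nn.CrossEntropyLoss (log_softmax fused inside)
logits_2c = torch.randn(batch, 2)          # raw logits, no softmax applied
target_2c = torch.randint(0, 2, (batch,))  # class indices as a LongTensor
loss_ce = nn.CrossEntropyLoss()(logits_2c, target_2c)
```

Note the target shapes differ: the BCE variant wants float targets matching the output shape, while nn.CrossEntropyLoss wants integer class indices.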
Ah ok cool.
So I'm under the assumption that softmax is automatically computed inside the
nn.CrossEntropyLoss module, and
nn.BCEWithLogitsLoss will likewise compute the sigmoid internally, whereas for
nn.BCELoss you need to apply the sigmoid first? And if so, can I send the output of nn.BCELoss into nn.CrossEntropyLoss?
No, this wouldn’t work, since
nn.BCELoss already calculates the loss.
nn.CrossEntropyLoss expects a model output and targets, not another loss.
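To illustrate the distinction with dummy values (the tensor shapes are assumptions): nn.BCELoss expects probabilities, so the sigmoid must be applied manually, while nn.BCEWithLogitsLoss consumes the raw logits and yields the same loss value. Neither loss output is ever fed into another criterion:

```python
import torch
import torch.nn as nn

logits = torch.randn(4, 1)                   # raw model outputs
target = torch.randint(0, 2, (4, 1)).float()

loss_a = nn.BCELoss()(torch.sigmoid(logits), target)  # manual sigmoid first
loss_b = nn.BCEWithLogitsLoss()(logits, target)       # sigmoid fused internally

print(torch.allclose(loss_a, loss_b))  # the two formulations agree numerically
```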
Apologies, I just understood what you meant! Many thanks for your help.