What is the appropriate way to use BCE loss with ResNet outputs?

Hello. The title summarizes the overall problem, but here is some elaboration:

I’m using torchvision.models.resnet18() to run an anomaly detection scheme. I initialize the model by doing:

net = torchvision.models.resnet18(num_classes=2)

since in my particular setting 0 denotes normal samples and 1 denotes anomalous samples.

The output from my model has shape (16, 2) (the batch size is 16) and the labels have shape (16, 1). Passing these to the loss gives me an error saying the two input tensors have mismatched shapes.

In order to solve this, I tried something like:

>>> new_output = torch.argmax(output, dim=1)

This gives me the appropriate shape, but running loss = nn.BCELoss(new_output, labels) gives me the error:

RuntimeError: bool value of Tensor with more than one value is ambiguous
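
Roughly, my setup looks like this (the image tensor is just a stand-in for my real batch):

import torch
import torch.nn as nn
import torchvision

net = torchvision.models.resnet18(num_classes=2)

images = torch.randn(16, 3, 224, 224)          # stand-in for a real batch of images
labels = torch.randint(0, 2, (16, 1)).float()  # shape (16, 1)

output = net(images)                           # shape (16, 2)
new_output = torch.argmax(output, dim=1)       # shape (16,)

loss = nn.BCELoss(new_output, labels)          # this raises the RuntimeError above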

What is the appropriate way for me to approach this issue? Thanks.

Hi, you can handle this problem in two ways.
One way is to have the model output shape (16, 1) and use nn.BCELoss() (on probabilities, i.e. after a sigmoid) or nn.BCEWithLogitsLoss() (on raw logits).
The other is to keep the model output shaped (16, 2) and use nn.CrossEntropyLoss().

In your case, you have decided to have shape (16, 2), so you need to use nn.CrossEntropyLoss() instead of nn.BCELoss().
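
For illustration, both options look roughly like this (the images and labels tensors below are dummy stand-ins for your batch):

import torch
import torch.nn as nn
import torchvision

images = torch.randn(16, 3, 224, 224)              # dummy batch standing in for your data
labels = torch.randint(0, 2, (16, 1))              # shape (16, 1), values 0 or 1

# Option 1: single output unit + binary cross entropy
net1 = torchvision.models.resnet18(num_classes=1)
criterion1 = nn.BCEWithLogitsLoss()                # works on raw logits; plain nn.BCELoss() needs a sigmoid first
out1 = net1(images)                                # shape (16, 1)
loss1 = criterion1(out1, labels.float())           # target must be float and the same shape as out1

# Option 2: two output units + cross entropy
net2 = torchvision.models.resnet18(num_classes=2)
criterion2 = nn.CrossEntropyLoss()
out2 = net2(images)                                # shape (16, 2)
loss2 = criterion2(out2, labels.flatten().long())  # target shape (16,), dtype long, values 0 or 1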

Hi. I’ve tried nn.CrossEntropyLoss as well, with nn.CrossEntropyLoss(output, label) and nn.CrossEntropyLoss(output, label.flatten()), and both still return:

RuntimeError: bool value of Tensor with more than one value is ambiguous

Since there are two classes, do the labels take the values 0 and 1?

Here is a minimal example that works; make sure your code follows the same pattern.

import numpy as np
import torch
import torch.nn as nn

logits = torch.from_numpy(np.random.randn(16, 2)).float()           # model output, shape (16, 2)
target = torch.from_numpy(np.random.randint(0, 2, size=16)).long()  # class indices, shape (16,)

loss = nn.CrossEntropyLoss()   # instantiate the loss first
z = loss(logits, target)       # then call it on (logits, target)

Hey, the problem is that you are passing the predictions and targets directly into the nn.CrossEntropyLoss() constructor. The constructor only accepts configuration options (weight, reduction, and so on), so your tensors get interpreted as those options, and checking a tensor's truth value inside the constructor is what raises the "bool value of Tensor" error.

You need to instantiate the loss function object first, and only then call it on the predictions and targets:

criterion = nn.CrossEntropyLoss()   # create the loss object
loss = criterion(input, target)     # then call it with predictions and targets
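
Applied to your case (assuming output is your (16, 2) ResNet output and labels is your (16, 1) label tensor), that would look roughly like:

criterion = nn.CrossEntropyLoss()
target = labels.flatten().long()    # (16, 1) -> (16,), class indices 0 or 1
loss = criterion(output, target)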