What is the appropriate way to use BCE loss with ResNet outputs?

seankala · June 13, 2020, 6:06am

Hello. The title explains the overall problem, but for some elaboration:

I’m using torchvision.models.resnet18() to run an anomaly detection scheme. I initialize the model by doing:

net = torchvision.models.resnet18(num_classes=2)

since in my particular setting 0 equals normal samples and 1 equals anomalous samples.

The output from my model is of shape (16, 2) (batch size is 16) and labels are of size (16, 1). This gives me the error that the two input tensors are of inappropriate shape.

In order to solve this, I tried something like:

>>> new_output = torch.argmax(output, dim=1)

Which gives me the appropriate shape, but running loss = nn.BCELoss(new_output, labels) gives me the error:

RuntimeError: bool value of Tensor with more than one value is ambiguous

What is the appropriate way for me to approach this issue? Thanks.

chetan_patil · June 13, 2020, 6:15am

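Hi, you can handle this problem in two ways.
One way is to make the model’s output shape (16, 1) and use nn.BCELoss(). Since nn.BCELoss() expects probabilities in [0, 1], apply a sigmoid to the output first, or use nn.BCEWithLogitsLoss() on the raw output.
The other is to keep the model’s output shape (16, 2) and use nn.CrossEntropyLoss(), with integer class labels of shape (16,).

In your case, you have chosen shape (16, 2), so you need to use nn.CrossEntropyLoss() instead of nn.BCELoss().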
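The two options can be sketched roughly as follows. This is only an illustration: random tensors stand in for the ResNet outputs, and the variable names (logits_one, logits_two, labels) are made up for the example rather than taken from the original code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size = 16

# Hypothetical stand-ins for model outputs (a real model would produce these).
logits_one = torch.randn(batch_size, 1)  # model built with a single output unit
logits_two = torch.randn(batch_size, 2)  # model built with num_classes=2
labels = torch.randint(0, 2, (batch_size, 1))  # 0 = normal, 1 = anomalous

# Option 1: one output per sample + binary cross entropy.
# nn.BCELoss needs probabilities and float targets of the same shape.
bce = nn.BCELoss()
loss_bce = bce(torch.sigmoid(logits_one), labels.float())

# Same quantity computed directly from the raw logits.
bce_logits = nn.BCEWithLogitsLoss()
loss_bce_logits = bce_logits(logits_one, labels.float())

# Option 2: two outputs per sample + cross entropy.
# nn.CrossEntropyLoss takes raw logits and int64 class indices of shape (16,).
ce = nn.CrossEntropyLoss()
loss_ce = ce(logits_two, labels.flatten())
```

Both BCE variants should give the same value up to floating-point error; the logits version is generally the more numerically stable choice.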

seankala · June 13, 2020, 6:20am

Hi. I’ve tried nn.CrossEntropyLoss as well with nn.CrossEntropyLoss(output, label) and nn.CrossEntropyLoss(output, label.flatten()) and both cases still return:

RuntimeError: bool value of Tensor with more than one value is ambiguous

chetan_patil · June 13, 2020, 6:31am

With two classes, do the labels take only the values 0 and 1?

I have a minimal example that works. Make sure your code is similar to it.

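import numpy as np
import torch
import torch.nn as nn

# Random logits of shape (16, 2) and integer class labels of shape (16,)
logits = torch.from_numpy(np.random.randn(16, 2)).float()
target = torch.from_numpy(np.random.randint(0, 2, size=16)).long()

loss = nn.CrossEntropyLoss()  # instantiate the loss first
z = loss(logits, target)      # then call it on (logits, target)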

chetan_patil · June 13, 2020, 6:33am

Hey, the arguments to nn.CrossEntropyLoss() itself are only for configuring the loss (options such as weight or reduction).
You must pass the actual predictions and targets when you call the loss object, after you have instantiated it.
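A small sketch of the difference, assuming random tensors in place of the real model output and labels. As far as I can tell, the constructor call fails because the tensors are taken as the weight and size_average options, and a multi-element tensor then gets evaluated as a boolean; the exact wording of the message may vary between PyTorch versions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
output = torch.randn(16, 2)         # stand-in for the model output
label = torch.randint(0, 2, (16,))  # integer class labels

# Wrong: the tensors are handed to the constructor, which expects
# configuration options rather than predictions and targets.
try:
    nn.CrossEntropyLoss(output, label)
    constructor_failed = False
except RuntimeError:
    constructor_failed = True

# Right: build the loss object once, then call it on the tensors.
criterion = nn.CrossEntropyLoss()
loss = criterion(output, label)
```

The same pattern applies to nn.BCELoss(new_output, labels) from the first post.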

xian_kgx · September 14, 2020, 11:52am

You need to instantiate the loss function object first. Only then can you call it.

criterion = nn.CrossEntropyLoss()
loss = criterion(input, target)
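Putting the thread together, one training step might look like the sketch below. The tiny linear model, the 8x8 random images, and the SGD settings are placeholders chosen for the example; torchvision.models.resnet18(num_classes=2) from the original post would take the model's place. Note that torch.argmax is kept out of the loss: it returns integer class indices with no gradient, so it is only useful for reading off predictions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for torchvision.models.resnet18(num_classes=2).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

images = torch.randn(16, 3, 8, 8)      # placeholder batch of 16 images
labels = torch.randint(0, 2, (16, 1))  # labels of shape (16, 1), as in the question

output = model(images)                       # raw logits, shape (16, 2)
loss = criterion(output, labels.squeeze(1))  # targets reshaped to (16,)

optimizer.zero_grad()
loss.backward()
optimizer.step()

# Predicted classes for evaluation only; not differentiable.
preds = torch.argmax(output, dim=1)
```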