BCELoss raises an error, while CrossEntropyLoss works fine

laro · November 19, 2021, 9:53am

I created a classification model (2 classes), and I can train it with CrossEntropyLoss as the loss function without errors.

When I change the loss function to BCELoss (which seems more appropriate because I have only 2 classes), I get this error:

ValueError: Target size (torch.Size([128])) must be the same as input size (torch.Size([128, 2]))

My model looks like this:

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Sequential                               --                        --
├─Identity: 1-1                          [128, 3, 224, 224]        --
├─Conv2d: 1-2                            [128, 32, 112, 112]       864
├─BatchNorm2d: 1-3                       [128, 32, 112, 112]       64
├─ReLU: 1-4                              [128, 32, 112, 112]       --
├─Dropout2d: 1-5                         [128, 32, 112, 112]       --
├─Conv2d: 1-6                            [128, 64, 56, 56]         18,432
├─BatchNorm2d: 1-7                       [128, 64, 56, 56]         128
├─ReLU: 1-8                              [128, 64, 56, 56]         --
├─Dropout2d: 1-9                         [128, 64, 56, 56]         --
├─Conv2d: 1-10                           [128, 128, 28, 28]        73,728
├─BatchNorm2d: 1-11                      [128, 128, 28, 28]        256
├─ReLU: 1-12                             [128, 128, 28, 28]        --
├─Dropout2d: 1-13                        [128, 128, 28, 28]        --
├─ResBlock: 1-14                         [128, 128, 28, 28]        --
│    └─Sequential: 2-1                   [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-1                  [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-2             [128, 128, 28, 28]        256
│    │    └─ReLU: 3-3                    [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-4                  [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-5             [128, 128, 28, 28]        256
│    │    └─ReLU: 3-6                    [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-7                  [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-8             [128, 128, 28, 28]        256
├─Dropout2d: 1-15                        [128, 128, 28, 28]        --
├─ResBlock: 1-16                         [128, 128, 28, 28]        --
│    └─Sequential: 2-2                   [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-9                  [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-10            [128, 128, 28, 28]        256
│    │    └─ReLU: 3-11                   [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-12                 [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-13            [128, 128, 28, 28]        256
│    │    └─ReLU: 3-14                   [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-15                 [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-16            [128, 128, 28, 28]        256
├─Dropout2d: 1-17                        [128, 128, 28, 28]        --
├─ResBlock: 1-18                         [128, 128, 28, 28]        --
│    └─Sequential: 2-3                   [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-17                 [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-18            [128, 128, 28, 28]        256
│    │    └─ReLU: 3-19                   [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-20                 [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-21            [128, 128, 28, 28]        256
│    │    └─ReLU: 3-22                   [128, 128, 28, 28]        --
│    │    └─Conv2d: 3-23                 [128, 128, 28, 28]        147,456
│    │    └─BatchNorm2d: 3-24            [128, 128, 28, 28]        256
├─Dropout2d: 1-19                        [128, 128, 28, 28]        --
├─AdaptiveAvgPool2d: 1-20                [128, 128, 1, 1]          --
├─Flatten: 1-21                          [128, 128]                --
├─Linear: 1-22                           [128, 2]                  258

If I change the loss function back to CrossEntropyLoss, it works.

Why am I getting an error when I change the loss to BCELoss?
What do I need to change in order to work with BCELoss?

my3bikaht · November 19, 2021, 10:32am

Binary cross entropy loss compares each prediction against a target of the same shape, but your model outputs predictions for two classes ([128, 2]) while your targets hold a single class label per sample ([128]).

The BCELoss error is telling you to do one of the following:
- provide targets for each class (target shape [128, 2]), or
- change your predictions to a single class only (model output [128, 1], target [128, 1])
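
Both options can be sketched roughly as below. This is only an illustration under assumptions: the data is random, and the names `features`, `labels`, `head_two` and `head_one` are hypothetical stand-ins for the real dataset and the final Linear layer of the model above. The sketch uses BCEWithLogitsLoss because the model ends in a Linear layer with no sigmoid; plain BCELoss expects probabilities in [0, 1], so it needs torch.sigmoid applied to the output first. In both cases the targets must be float tensors, unlike CrossEntropyLoss, which takes integer class indices of shape [128].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
batch_size = 128

# Stand-ins (assumptions): pooled features after Flatten, and integer labels 0/1.
features = torch.randn(batch_size, 128)
labels = torch.randint(0, 2, (batch_size,))  # shape [128], dtype int64

# Option 1: keep the 2-unit output and provide one target column per class.
head_two = nn.Linear(128, 2)                             # hypothetical final layer
logits_two = head_two(features)                          # [128, 2]
targets_two = F.one_hot(labels, num_classes=2).float()   # [128, 2]
loss_two = nn.BCEWithLogitsLoss()(logits_two, targets_two)

# Option 2: switch to a single output unit and reshape the targets to match.
head_one = nn.Linear(128, 1)                             # hypothetical final layer
logits_one = head_one(features)                          # [128, 1]
targets_one = labels.float().unsqueeze(1)                # [128, 1]
loss_one = nn.BCEWithLogitsLoss()(logits_one, targets_one)

# Same loss with plain BCELoss: apply sigmoid to the logits first.
loss_one_bce = nn.BCELoss()(torch.sigmoid(logits_one), targets_one)

print(loss_two.item(), loss_one.item(), loss_one_bce.item())
```

With option 2, the predicted class at inference time would be `(torch.sigmoid(logits_one) > 0.5)` rather than an argmax over two columns.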