BCEWithLogitsLoss can't compare Segmentation Mask with predicted mask

I am working on a binary segmentation task and using BCEWithLogitsLoss as the criterion. I use Albumentations to augment and normalize the images:

import albumentations
import albumentations.pytorch

transforms_normalize = albumentations.Compose(
        [
            albumentations.Normalize(mean=normalize['mean'], std=normalize['std'], always_apply=True, p=1),
            albumentations.pytorch.transforms.ToTensorV2()
        ],
        additional_targets={'ela': 'image'}  # apply the image transforms to the 'ela' input as well
    )

The dataset loads two images and a segmentation mask. The images are returned as FloatTensors and the mask is returned as a ByteTensor.
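
For context, this is roughly how the transform is applied inside the dataset (a minimal sketch; image, ela and mask are placeholders for the arrays my dataset loads):

sample = transforms_normalize(image=image, ela=ela, mask=mask)
img_tensor  = sample['image']   # normalized FloatTensor, shape [C, 256, 256]
ela_tensor  = sample['ela']     # also normalized, since 'ela' is registered as an image target
mask_tensor = sample['mask']    # ByteTensor (uint8), shape [256, 256] -- ToTensorV2 keeps the mask dtype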

When I try to calculate the loss, the following error is thrown:

Traceback (most recent call last):
  File "/media/sowmitra/SSD Disk/image_manipulation/train_segment.py", line 559, in <module>
    resume=False
  File "/media/sowmitra/SSD Disk/image_manipulation/train_segment.py", line 243, in train
    train_metrics = train_epoch(model, train_loader, optimizer, criterion, epoch, SRM_FLAG)
  File "/media/sowmitra/SSD Disk/image_manipulation/train_segment.py", line 404, in train_epoch
    loss_segmentation = criterion(out_mask, gt)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
    return self.first(*input) + self.second(*input)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
    return self.loss(*input) * self.weight
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 632, in forward
    reduction=self.reduction)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/functional.py", line 2580, in binary_cross_entropy_with_logits
    raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
ValueError: Target size (torch.Size([8, 256, 256])) must be the same as input size (torch.Size([8, 1, 256, 256]))

To fix this I called squeeze(1) on the output tensor to remove the channel dimension. Is squeezing the output the right approach, or should I instead unsqueeze the ground-truth mask to add a channel dimension?
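
For reference, these are the two options I am considering (just a sketch, assuming out_mask is the [8, 1, 256, 256] logits and gt is the [8, 256, 256] mask from above):

loss_segmentation = criterion(out_mask.squeeze(1), gt)    # drop the channel dim from the logits
# or
loss_segmentation = criterion(out_mask, gt.unsqueeze(1))  # add a channel dim to the target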

After squeezing the output, the following error is thrown:

Traceback (most recent call last):
  File "/media/sowmitra/SSD Disk/image_manipulation/train_segment.py", line 559, in <module>
    resume=False
  File "/media/sowmitra/SSD Disk/image_manipulation/train_segment.py", line 243, in train
    train_metrics = train_epoch(model, train_loader, optimizer, criterion, epoch, SRM_FLAG)
  File "/media/sowmitra/SSD Disk/image_manipulation/train_segment.py", line 404, in train_epoch
    loss_segmentation = criterion(out_mask, gt)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 32, in forward
    return self.first(*input) + self.second(*input)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/pytorch_toolbelt/losses/joint_loss.py", line 18, in forward
    return self.loss(*input) * self.weight
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 632, in forward
    reduction=self.reduction)
  File "/home/sowmitra/anaconda3/envs/dfdcpy37env/lib/python3.7/site-packages/torch/nn/functional.py", line 2582, in binary_cross_entropy_with_logits
    return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)
RuntimeError: result type Float can't be cast to the desired output type Byte

I’m guessing this is because the model returns its logits as FloatTensors of size (nbatch, 1, 256, 256), while the masks are ByteTensors of size (nbatch, 256, 256). Should I just convert the masks to Float? Or should I threshold the outputs and convert them to Byte?
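
A quick dtype check of the tensors going into the loss (out_mask and gt are the tensors passed to criterion above) would show something like:

print(out_mask.dtype, gt.dtype)  # torch.float32 torch.uint8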

For a multi-class segmentation use case, you should use nn.CrossEntropyLoss, where the model output should be a FloatTensor of shape [batch_size, nb_classes, height, width] and the target a LongTensor of shape [batch_size, height, width] containing the class indices in the range [0, nb_classes-1].
If you are dealing with binary or multi-label segmentation, you could use nn.BCEWithLogitsLoss, which expects the target to be a FloatTensor of the same shape as the model output, containing values in [0, 1].
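
So for your binary case, the mask needs to become a FloatTensor with the same shape as the logits. A minimal sketch (assuming out_mask is the raw [N, 1, 256, 256] model output and gt is the [N, 256, 256] ByteTensor mask):

loss_segmentation = criterion(out_mask, gt.unsqueeze(1).float())
# or, equivalently, squeeze the logits instead of unsqueezing the mask:
loss_segmentation = criterion(out_mask.squeeze(1), gt.float())

Don't threshold the outputs: nn.BCEWithLogitsLoss applies the sigmoid internally and needs the raw float logits to stay differentiable.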