Data type mismatch in loss function

I used BCEWithLogitsLoss() as the loss function for a binary classification task. However, the error message tells me that

output with type torch.cuda.LongTensor doesn’t match the desired type torch.cuda.FloatTensor

I thought my output layer already produces float values, so I don't understand why it says the output is long type. Here is my output (batch size is 30):

tensor([-0.0114, -0.0117, -0.0117, -0.0118, -0.0114, -0.0118, -0.0118, -0.0117,
        -0.0117, -0.0117, -0.0117, -0.0115, -0.0116, -0.0117, -0.0117, -0.0117,
        -0.0117, -0.0115, -0.0117, -0.0115, -0.0115, -0.0117, -0.0117, -0.0117,
        -0.0117, -0.0117, -0.0115, -0.0117, -0.0116, -0.0114], device='cuda:0',
       grad_fn=<SqueezeBackward0>)

The labels look like:

tensor([0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1,
        1, 1, 0, 1, 0, 0], device='cuda:0')

My NN is:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 8, 13, stride=[2, 3])
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(8, 16, [4, 11], stride=[2, 3])
        self.pool2 = nn.MaxPool2d(2, 2)
        self.conv3 = nn.Conv2d(16, 32, [3, 5], stride=[1, 3])
        self.pool3 = nn.MaxPool2d([1, 2], [1, 2])
        self.conv4 = nn.Conv2d(32, 64, [1, 3], stride=[1, 2])
        self.pool4 = nn.MaxPool2d([1, 3], [1, 2])
        self.fc1 = nn.Linear(12 * 13 * 64, 1)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x)))
        x = self.pool3(F.relu(self.conv3(x)))
        x = self.pool4(F.relu(self.conv4(x)))
        x = x.view(-1, 12 * 13 * 64)  # flatten to (batch, 12*13*64)
        x = self.fc1(x)               # one logit per sample: (batch, 1)
        x = x.squeeze()               # (batch, 1) -> (batch,) to match the labels
        return x

The final x.squeeze() is there because BCEWithLogitsLoss requires the output and the label to have the same shape; otherwise the output would be a (30, 1) tensor.
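
As a quick shape check (a sketch, using the batch size of 30 from above):

import torch

out = torch.randn(30, 1)    # shape of fc1's output before the squeeze
print(out.squeeze().shape)  # torch.Size([30]) -- same shape as the labels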

Here is the full error message:

RuntimeError                              Traceback (most recent call last)
<ipython-input-210-2933dbf1b675> in <module>
     16         # forward + backward + optimize
     17         outputs = net(inputs)
---> 18         loss = criterion(outputs, labels)
     19         loss.backward()
     20         optimizer.step()

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\torch\nn\modules\loss.py in forward(self, input, target)
    593                                                   self.weight,
    594                                                   pos_weight=self.pos_weight,
--> 595                                                   reduction=self.reduction)
    596 
    597 

~\Anaconda3\lib\site-packages\torch\nn\functional.py in binary_cross_entropy_with_logits(input, target, weight, size_average, reduce, reduction, pos_weight)
   2075         raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
   2076 
-> 2077     return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)
   2078 
   2079 

RuntimeError: output with type torch.cuda.LongTensor doesn't match the desired type torch.cuda.FloatTensor

Any comments are highly appreciated! Thank you!

The error points to the target.
Convert it to a torch.cuda.FloatTensor using:

labels = labels.float()

before passing it to the criterion.
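
For context, a minimal sketch of how the cast fits into the training step from the traceback (net, inputs, labels, criterion, and optimizer are the objects from the original post):

optimizer.zero_grad()
outputs = net(inputs)
loss = criterion(outputs, labels.float())  # cast the LongTensor targets to float
loss.backward()
optimizer.step()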

Ah, you are right. But just out of curiosity, why does the label have to be float in this loss function? I mean, it is a binary classification, so one would expect the outcome of such a classification to be only yes (1) or no (0). Why does it assume the target is float?

It might be because of probabilities? Maybe it's 0.9 for yes and 0.1 for no, meaning it's 90% sure it's a yes?

Not sure though 🙂
Regards

Yes, a probability can be a fraction. But my question is that the target is the label of a data point, which is usually labelled by a human or some other means. How could a label in supervised learning be given as a probability?

Sometimes soft targets can help when training your model; the technique is called label smoothing. Have a look at this paper. However, I'm not sure how often this technique is used "in the wild".
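
As a rough sketch of what label smoothing looks like for binary targets (eps = 0.1 is an arbitrary example value, not taken from the paper):

eps = 0.1  # smoothing factor (example value)

# pull the hard 0/1 labels toward 0.5: 1 -> 0.95, 0 -> 0.05
smoothed = labels.float() * (1 - eps) + 0.5 * eps
loss = criterion(outputs, smoothed)  # BCEWithLogitsLoss accepts fractional targets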

Got the same error. Resolved it using this solution.
Thanks @ptrblck

Because the prediction is float, and the loss function expects the target to match its dtype.