Data type mismatch in loss function

I used BCEWithLogitsLoss() as the loss function for a binary classification task. However, the error message tells me that

output with type torch.cuda.LongTensor doesn’t match the desired type torch.cuda.FloatTensor

I thought my output layer already produces float values, so I don't understand why it says the output is long type. Here is my output (batch size is 30):

tensor([-0.0114, -0.0117, -0.0117, -0.0118, -0.0114, -0.0118, -0.0118, -0.0117,
        -0.0117, -0.0117, -0.0117, -0.0115, -0.0116, -0.0117, -0.0117, -0.0117,
        -0.0117, -0.0115, -0.0117, -0.0115, -0.0115, -0.0117, -0.0117, -0.0117,
        -0.0117, -0.0117, -0.0115, -0.0117, -0.0116, -0.0114], device='cuda:0',
       grad_fn=<SqueezeBackward0>)

The labels look like:

tensor([0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1,
        1, 1, 0, 1, 0, 0], device='cuda:0')

My NN is:

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 8, 13, stride=[2, 3])
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(8, 16, [4, 11], stride=[2, 3])
        self.pool2 = nn.MaxPool2d(2, 2)
        self.conv3 = nn.Conv2d(16, 32, [3, 5], stride=[1, 3])
        self.pool3 = nn.MaxPool2d([1, 2], [1, 2])
        self.conv4 = nn.Conv2d(32, 64, [1, 3], stride=[1, 2])
        self.pool4 = nn.MaxPool2d([1, 3], [1, 2])
        self.fc1 = nn.Linear(12 * 13 * 64, 1)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x)))
        x = self.pool3(F.relu(self.conv3(x)))
        x = self.pool4(F.relu(self.conv4(x)))
        x = x.view(-1, 12 * 13 * 64)  # flatten to (batch, 12*13*64)
        x = self.fc1(x)               # one logit per sample: (batch, 1)
        x = x.squeeze()               # (batch, 1) -> (batch,) to match the labels
        return x

The final x.squeeze() is there because BCEWithLogitsLoss requires the output and the label to have the same shape; otherwise the output would be a (30, 1) tensor.
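
As a quick shape check (a sketch, using the batch size of 30 from above):

import torch

out = torch.randn(30, 1)    # shape of fc1's output before the squeeze
print(out.squeeze().shape)  # torch.Size([30]) -- same shape as the labels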

Here is the full error message:

RuntimeError                              Traceback (most recent call last)
<ipython-input-210-2933dbf1b675> in <module>
     16         # forward + backward + optimize
     17         outputs = net(inputs)
---> 18         loss = criterion(outputs, labels)
     19         loss.backward()
     20         optimizer.step()

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\torch\nn\modules\loss.py in forward(self, input, target)
    593                                                   self.weight,
    594                                                   pos_weight=self.pos_weight,
--> 595                                                   reduction=self.reduction)
    596 
    597 

~\Anaconda3\lib\site-packages\torch\nn\functional.py in binary_cross_entropy_with_logits(input, target, weight, size_average, reduce, reduction, pos_weight)
   2075         raise ValueError("Target size ({}) must be the same as input size ({})".format(target.size(), input.size()))
   2076 
-> 2077     return torch.binary_cross_entropy_with_logits(input, target, weight, pos_weight, reduction_enum)
   2078 
   2079 

RuntimeError: output with type torch.cuda.LongTensor doesn't match the desired type torch.cuda.FloatTensor

Any comments are highly appreciated! Thank you!

The error points to the target.
Convert it to a torch.cuda.FloatTensor using:

labels = labels.float()

before passing it to the criterion.
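
For context, a minimal sketch of how the cast fits into the training step from the traceback (net, inputs, labels, criterion, and optimizer are the objects from the original post):

optimizer.zero_grad()
outputs = net(inputs)
loss = criterion(outputs, labels.float())  # cast the LongTensor targets to float
loss.backward()
optimizer.step()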

Ah, you are right. But just out of curiosity, why does the label have to be float in this loss function? I mean, it is a binary classification, so one would expect the outcome of such a classification to be only yes (1) or no (0). Why does it assume the target is float?

It might be because of probabilities? Maybe it's 0.9 for yes and 0.1 for no, meaning it's 90% sure it's a yes?

Not sure though 🙂
Regards

Yes, a probability can be a fraction. But my question is that the target is the label of a data point, which is usually labelled by a human or some other means. How could a label in supervised learning be given as a probability?

Sometimes soft targets can help when training your model; the technique is called label smoothing. Have a look at this paper. However, I'm not sure how often this technique is used "in the wild".
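
As a rough sketch of what label smoothing looks like for binary targets (eps = 0.1 is an arbitrary example value, not taken from the paper):

eps = 0.1  # smoothing factor (example value)

# pull the hard 0/1 labels toward 0.5: 1 -> 0.95, 0 -> 0.05
smoothed = labels.float() * (1 - eps) + 0.5 * eps
loss = criterion(outputs, smoothed)  # BCEWithLogitsLoss accepts fractional targets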

Got the same error. Resolved it using this solution.
Thanks @ptrblck

Because the prediction is float, and the loss function expects the target to match its dtype.