Use Float values for Y in Cross entropy loss

X = torch.from_numpy(np.asarray(X, dtype=np.float32)).float()
y = torch.from_numpy(y).float()

y has values like this:

0.0, 0.06666666666666665, 0.13333333333333336, 0.2, 0.2666666666666667, 0.33333333333333337, 0.4, 0.6, 0.6173913043478261, 0.6347826086956522, 0.6521739130434783, 0.6695652173913043, 0.6869565217391305, 0.7043478260869565, 0.7217391304347827, 0.7391304347826086, 0.7565217391304347, 0.7739130434782608, 0.7913043478260869, 0.808695652173913, 0.8260869565217391, 0.8434782608695652, 0.8608695652173913, 0.8782608695652174, 0.8956521739130434, 0.9130434782608696, 0.9304347826086956, 0.9478260869565217, 0.9652173913043478, 1.0

Using nn.CrossEntropyLoss gives me the error:

RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Float'

Could anyone please help?

You might need to update your PyTorch version, as this works for me:

import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()
input = torch.randn(3, 5, requires_grad=True)      # logits for 3 samples and 5 classes
target = torch.randn(3, 5).softmax(dim=1)          # soft targets: per-sample class probabilities
output = loss(input, target)
output.backward()

My PyTorch version is 2.0.1, which I believe is a recent one.


And yes, your sample works for me as well.

But my main code gives this error:

File ~/anaconda3/envs/xgb3_ray2_pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File ~/anaconda3/envs/xgb3_ray2_pytorch/lib/python3.8/site-packages/torch/nn/modules/loss.py:1174, in CrossEntropyLoss.forward(self, input, target)
   1173 def forward(self, input: Tensor, target: Tensor) -> Tensor:
-> 1174     return F.cross_entropy(input, target, weight=self.weight,
   1175                            ignore_index=self.ignore_index, reduction=self.reduction,
   1176                            label_smoothing=self.label_smoothing)

File ~/anaconda3/envs/xgb3_ray2_pytorch/lib/python3.8/site-packages/torch/nn/functional.py:3029, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   3027 if size_average is not None or reduce is not None:
   3028     reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 3029 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

RuntimeError: "nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Float'

Do you have any suggestions?

Could you post a minimal and executable code snippet to reproduce the issue, please?

For binary classification:

1st Code: the target is 1D and long, and it works fine.

loss = nn.CrossEntropyLoss()
input = torch.randn(3, 2, requires_grad=True)    # logits for 3 samples and 2 classes
target = torch.Tensor([1, 0, 1]).long()          # 1D target of class indices, dtype long
output = loss(input, target)

2nd Code: the target is 1D but float. Now it gives an error like RuntimeError: expected scalar type Long but found Float

loss = nn.CrossEntropyLoss()
input = torch.randn(3, 2, requires_grad=True)    # logits for 3 samples and 2 classes
target = torch.Tensor([0.5, 0.01, 0.9]).float()  # 1D float target -> raises the dtype error above
output = loss(input, target)

In my code, I am using the 2nd form, which I think is why it is giving errors.

Do you have any idea why, for two possible classes, we need to provide both class probabilities (i.e. a 2D tensor) to the loss function, but this is not needed when the target dtype is long?

The error is expected, since floating-point targets must have the same shape as the model outputs, as described in the docs and shown in my example.
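
As a rough sketch of that point, your 2nd snippet runs once the float target is given the same (3, 2) shape as the logits, i.e. one probability per class and per sample (the values here are arbitrary):

import torch
import torch.nn as nn

loss = nn.CrossEntropyLoss()
input = torch.randn(3, 2, requires_grad=True)    # logits for 3 samples and 2 classes
# Soft targets: each row holds [p(class 0), p(class 1)] and sums to 1.
target = torch.tensor([[0.5, 0.5],
                       [0.99, 0.01],
                       [0.1, 0.9]])
output = loss(input, target)
output.backward()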

This is because floating-point targets represent "soft targets" for the multi-class classification loss.
If you are interested purely in binary classification, use nn.BCEWithLogitsLoss.
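
As a minimal sketch of that alternative, nn.BCEWithLogitsLoss expects one logit per sample, so a 1D float target like the one above can be used directly:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()
input = torch.randn(3, requires_grad=True)       # one logit per sample
target = torch.tensor([0.5, 0.01, 0.9])          # 1D float targets in [0, 1]
output = criterion(input, target)
output.backward()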
