In the Python API, NLLLoss accepts a target of shape (N, d1, …, dk). In the C++ API, however, torch::nll_loss crashes with the exception multi-target not supported at C:\w\1\s\windows\pytorch\aten\src\THNN/generic/ClassNLLCriterion.c:22
NLL Example
Log softmax succeeded
Ground truth created
Exception occurred
multi-target not supported at C:\w\1\s\windows\pytorch\aten\src\THNN/generic/ClassNLLCriterion.c:22
This error is usually thrown if your target still contains the class dimension:
criterion = nn.NLLLoss()
N, nb_classes = 2, 3
output = torch.randn(N, nb_classes, requires_grad=True)
target = torch.randint(0, nb_classes, (N,))
loss = criterion(output, target) # works
target = torch.randint(0, nb_classes, (N, nb_classes))
loss = criterion(output, target) # fails with same error
For a multi-label classification (a single sample can contain more than a single valid class), you should use nn.BCEWithLogitsLoss (or the C++ equivalent).
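For illustration, here is a minimal multi-label sketch (an assumption about your setup, not code from this thread): with nn.BCEWithLogitsLoss the target is a multi-hot float tensor of the same shape as the logits, so each sample can carry several active classes at once.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()
N, nb_classes = 2, 3

# raw logits, one score per class
logits = torch.randn(N, nb_classes, requires_grad=True)

# multi-hot target: each sample may belong to several classes at once
target = torch.randint(0, 2, (N, nb_classes)).float()

loss = criterion(logits, target)  # scalar loss
```

Note that the target must be float here, unlike the long class-index target used by nn.NLLLoss.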
Thanks @ptrblck. It looks like you are using the form of the target with dimension (N,). However, I need the form where the target has dimension (N, d1, …, dk), more explicitly (N, 300, 300). I am trying to train on a semantic segmentation task where each pixel in the image has its own label.
I assume the semantic segmentation is a multi-class segmentation, i.e. each pixel belongs to exactly one class.
If so, this should be possible using these shapes:
N, nb_classes, H, W = 2, 3, 4, 4
output = torch.randn(N, nb_classes, H, W, requires_grad=True)
target = torch.randint(0, nb_classes, (N, H, W))
loss = criterion(output, target)
Note that the nb_classes dimension is missing in the target as described in the docs, so make sure to pass these shapes to your criterion.
nn.BCEWithLogitsLoss is most likely not what you are looking for.
I understand what you are saying, but I can’t see where my code sample is broken. My input is a tensor of dimension (1, 2, 300, 300) (1 batch, 2 classes, 300x300 image). My target (ground_truth) is a tensor (1, 300, 300) filled with 1.
This does indeed seem to be the right shape, and I remember I’ve seen this issue before!
Could you check whether nll_loss2d is defined and, if so, use it instead of nll_loss? (I’m currently not at my machine to check it.)
Unfortunately there is no documentation besides the function declaration.
Hopefully it will work as a drop-in replacement in my code.
Edit: It worked! Well, at least it succeeded without crashing. I’m now manually checking the output to see if it does what I think it should do (pixel-wise NLL loss, then taking the mean).
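That check can be done on the Python side, where nll_loss already supports spatial targets. A small sketch (not from this thread) comparing F.nll_loss against a manual pixel-wise NLL followed by a mean:

```python
import torch
import torch.nn.functional as F

N, C, H, W = 1, 2, 4, 4
log_probs = F.log_softmax(torch.randn(N, C, H, W), dim=1)
target = torch.randint(0, C, (N, H, W))

# built-in loss, default reduction='mean'
loss = F.nll_loss(log_probs, target)

# manual pixel-wise NLL: negative log-probability of the true class
# at every pixel, then averaged over all pixels
manual = -log_probs.gather(1, target.unsqueeze(1)).mean()

assert torch.allclose(loss, manual)
```

If the C++ nll_loss2d matches this, it is doing exactly the pixel-wise NLL plus mean described above.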
Awesome!
Yeah, we are working on the parity of the C++ API, so these things might be missing currently.
Fortunately, we have the discussion board, so feel free to ask if you get stuck somewhere.
I have the same semantic segmentation problem that @markl has, but my images are three-dimensional. Therefore, I would need a loss function that takes target tensors of shape (N, d1, d2, d3). There doesn’t seem to be a torch::nll_loss3d() in C++ Pytorch, as far as I can tell. What is your recommendation in this case? Do I just have to wait until libTorch catches up with python? If so, any idea when that will happen?
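Not an official answer, but one possible workaround while a 3d variant is missing: flatten two of the three spatial dimensions so the 2d variant can consume the tensors; under the default mean reduction the result is identical. A Python sketch of the idea (the same reshape should carry over to torch::nll_loss2d in C++, though I have not verified that):

```python
import torch
import torch.nn.functional as F

N, C, D, H, W = 2, 3, 4, 4, 4
log_probs = F.log_softmax(torch.randn(N, C, D, H, W), dim=1)
target = torch.randint(0, C, (N, D, H, W))

# direct K-dimensional loss (supported in the Python API)
loss_3d = F.nll_loss(log_probs, target)

# flatten the last two spatial dims so a 2d NLL can handle the tensors
loss_2d = F.nll_loss(log_probs.reshape(N, C, D, H * W),
                     target.reshape(N, D, H * W))

assert torch.allclose(loss_3d, loss_2d)
```

The reshape keeps every (pixel, label) pair intact, so the per-element losses and their mean are unchanged.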
Thanks for your answer! Is that supported after some given version? Because I’m using 1.2.0 and NLLLoss only accepts rank-2 tensors, as described by @markl up above in this thread. If you try a higher rank tensor, you get the multi-target not supported error.
#include <torch/torch.h>
#include <iostream>

int main() {
  // NLLLoss expects log-probabilities, so apply log_softmax over the class dim
  auto input = torch::log_softmax(torch::randn({3, 10, 24, 24}), /*dim=*/1);
  auto target = torch::randint(0, 10, {3, 24, 24}, torch::kLong);
  torch::nn::NLLLoss criterion;
  auto loss = criterion->forward(input, target);
  std::cout << loss << std::endl;
}
Also, note that there are different implementations, such as LossNLL2d.cpp, which might accept multiple dimensions (I haven’t checked which function is called in my example, though).
Hi, thanks for the response. Yes, your code works for me too. What doesn’t work is to use the free-function form of the NLL loss:
auto input = torch::randn({3, 10, 24, 24});
auto target = torch::randint(0, 10, {3, 24, 24}, torch::kLong);
auto loss = torch::nll_loss(input, target);
Is there any design reason for this to behave differently from NLLLoss? If not, I suggest either bringing it up to par with the class or removing it altogether to avoid confusion. Thanks again!