RuntimeError: multi-target not supported (newbie)

Hello. I have a problem when training a CNN.

I use one-hot encoding for the classes (I want to have probabilities for each class). My dataset consists of 256x256 RGB images and the batch size is 4, so the inputs have size (4, 3, 256, 256). I have 120 classes, so the labels have size (4, 120) and the outputs have size (4, 120) too. The shapes are what I expect.

Here is how I train my network:

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

running_loss = 0.0

for i, data in enumerate(loader, 0):
    # get the inputs
    labels, inputs, filename = data

    # wrap them in Variable
    inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())

    # zero the parameter gradients
    optimizer.zero_grad()

    # forward + backward + optimize
    outputs = model(inputs)
    print(inputs.size(), labels.size(), outputs.size())

    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    # print statistics
    running_loss += loss.data[0]

The shapes are correct, but I get the error RuntimeError: multi-target not supported at /pytorch/torch/lib/THCUNN/generic/ClassNLLCriterion.cu:19 at the line loss = criterion(outputs, labels).

What am I doing wrong?

Thank you very much in advance!


CrossEntropyLoss does not expect a one-hot encoded vector as the target, but class indices:

The input is expected to contain scores for each class.
input has to be a 2D Tensor of size (minibatch, C).
This criterion expects a class index (0 to C-1) as the target for each value of a 1D tensor of size minibatch


Okay, got it.

I tried loss = criterion(torch.max(outputs, 1)[1], torch.max(labels, 1)[1]) to get the indices, but it produces KeyError: <class 'torch.cuda.LongTensor'>.
How else could I get the indices?


Your outputs should keep the size (minibatch, C).

Try this code snippet:

criterion = nn.CrossEntropyLoss()

output = Variable(torch.randn(10, 120).float())
target = Variable(torch.FloatTensor(10).uniform_(0, 120).long())

loss = criterion(output, target)

Great! It works.

I solved it using loss = criterion(outputs, torch.max(labels, 1)[1])

Thank you very much!
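
For readers who land here with the same error, here is a minimal sketch of that fix on the CPU, without Variable. The random scores and the class ids in true_classes are made up for illustration; only the shapes follow the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes = 120
batch_size = 4

# Raw, unnormalized scores from a model: shape (batch, num_classes).
outputs = torch.randn(batch_size, num_classes)

# Hypothetical one-hot labels of shape (batch, num_classes), as in the question.
true_classes = torch.tensor([3, 17, 0, 119])
one_hot_labels = torch.zeros(batch_size, num_classes)
one_hot_labels[torch.arange(batch_size), true_classes] = 1.0

# argmax over the class dimension recovers one class index per sample.
# It gives the same result as torch.max(one_hot_labels, 1)[1].
target = one_hot_labels.argmax(dim=1)

criterion = nn.CrossEntropyLoss()
loss = criterion(outputs, target)  # outputs stay (batch, C), target is (batch,)

print(target.shape, target.dtype, loss.item())
```

Only the labels are reduced to indices; the outputs keep their (minibatch, C) shape. As far as I know, PyTorch 1.10 and later also accept floating-point class probabilities with the same shape as the outputs as a target, so on recent versions the one-hot labels may work directly if they are float.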


Thanks for sharing. This worked for…

Please, what are class indices?!
Are they objects that can be accessed by indices?!

No, they are just values representing a class.
For example:

0 - class0, 
1 - class1,
...

It was confusing at first. So it is the index of the correct class.
Everything is so clear now.

Thank you.

Sorry for the ignorance, but what is the point of one-hot encoding if you’ll transform it into an integer in the end? I was making a CNN with the following encoding:

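import pandas as pd

def encode(self):
    """One of the ways to make the encoding."""
    df = pd.read_csv('../input/train.csv')
    # distinct class labels found in the 'Id' column
    unique_classes = pd.unique(df['Id'])
    # map each class label to an integer index: {label: index}
    encoding = {label: index for index, label in enumerate(unique_classes)}
    # replace every label in the frame by its index
    df = df.replace(encoding)
    return df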

With this I had no problems. In the end, is it the same encoding, just a list of enumerated numbers for the classes?

target = Variable(torch.FloatTensor(10).uniform_(0, 120).long())

When using nn.CrossEntropyLoss or nn.NLLLoss, there is no point in creating one-hot encoded targets.
However, if the targets are already stored as such, you could use the code snippets from this thread to get the class indices.
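
To see why an index carries the same information as a one-hot vector: for a single sample, the cross-entropy loss is the negative log-softmax score at the target class. The plain-Python sketch below (helper names and logits are made up for illustration, no PyTorch involved) computes the loss both ways.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax for one sample (a list of floats)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def cross_entropy_from_index(logits, class_index):
    """Loss for one sample when the target is a class index."""
    return -log_softmax(logits)[class_index]

def cross_entropy_from_one_hot(logits, one_hot):
    """Loss for one sample when the target is a one-hot vector."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))

logits = [2.0, -1.0, 0.5]
loss_index = cross_entropy_from_index(logits, 0)
loss_one_hot = cross_entropy_from_one_hot(logits, [1.0, 0.0, 0.0])
print(loss_index, loss_one_hot)
```

The one-hot version multiplies every log-probability but one by zero, so the index version simply skips that work.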

I’m not completely sure what your code snippet does exactly, but if you get the class index for each corresponding sample, it should work fine.
However, I’m not sure if pd.unique sorts the data or not. Sorting might destroy the data to target mapping, so you should check it.
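
A minimal sketch of such a mapping in plain Python, assuming made-up labels in place of the 'Id' column. Each sample looks up its own label in the dictionary, so the order in which the distinct labels were enumerated does not change which sample gets which class; what matters is reusing the same dictionary everywhere. To my knowledge, pd.unique returns values in order of first appearance rather than sorted, which is the behaviour build_class_index (a hypothetical helper) imitates.

```python
def build_class_index(labels):
    """Map each distinct label to an integer index in order of first appearance."""
    mapping = {}
    for label in labels:
        if label not in mapping:
            mapping[label] = len(mapping)
    return mapping

# Hypothetical labels standing in for the 'Id' column of train.csv.
ids = ["whale_b", "whale_a", "whale_b", "whale_c", "whale_a"]

class_to_index = build_class_index(ids)
targets = [class_to_index[label] for label in ids]

# Inverse mapping, useful for turning predicted indices back into labels.
index_to_class = {index: label for label, index in class_to_index.items()}

print(class_to_index)
print(targets)
```

Saving class_to_index alongside the model keeps training and inference on the same encoding.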


Thanks, I’ll be checking it out

Making it a 2D tensor still did not work for me. CrossEntropyLoss takes a 1D target tensor. I had to squeeze the last dimension with target.squeeze(1) so that it becomes a 1D tensor of size (batch,).


Hi,
I solved it using the squeeze function. Hope it helps.

loss_func = torch.nn.CrossEntropyLoss() 
loss = loss_func(outputs, labels.squeeze())
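
One caveat, sketched below with made-up label tensors: squeeze() without an argument removes every dimension of size 1, so a batch that happens to contain a single sample ends up as a 0-dimensional tensor rather than one of size (batch,). squeeze(1) only touches the trailing dimension.

```python
import torch

# Hypothetical labels with a trailing singleton dimension: shape (batch, 1).
labels = torch.tensor([[1], [0], [1]])
print(labels.squeeze(1).shape)  # torch.Size([3])
print(labels.squeeze().shape)   # torch.Size([3])

# With a single sample in the batch the two calls differ.
single = torch.tensor([[1]])
print(single.squeeze(1).shape)  # torch.Size([1])
print(single.squeeze().shape)   # torch.Size([])
```

This matters mostly for the last batch of an epoch when the dataset size is not a multiple of the batch size.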

You can also try this:

labels = np.argmax(labels, axis=1)

Thank you so much. This has saved me

Hi @ptrblck,
Could you please help me out? I have two classes, 0 and 1, and I am using nn.CrossEntropyLoss() as the criterion. (I am using a Siamese network, in case that makes a difference, but it should not matter for this question: the output has the required size (batch size, number of classes) and the target label has size (batch size).)
I followed this thread and others and tried all the proposed solutions, but I still get this error. Could you have a look, please? I have been stuck on this for ages.
The shapes of everything I have are printed in the output cell.
Code:

model.train()  # prep model for training
for i, batch in enumerate(train_loader, 0):
    x1, x2, label, _, _ = batch
    x1, x2, label = x1.to(device), x2.to(device), label.to(device)
    # clear the gradients of all optimized variables
    optimizer.zero_grad()
    # forward pass: compute predicted outputs by passing inputs to the model
    output = model.forward(x1, x2)
    # calculate the loss
    print('output', output)
    print('output.data', output.data)
    print('torch.max(label, 1)[1]', torch.max(label, 1)[1])
    print('target label', label)
    print('torch.max(output, 1)[1]', torch.max(output, 1)[1])
    _, predicted = torch.max(output, 1)
    print('predicted', predicted)
    loss = criterion(output.data, label.long())
    # backward pass: compute gradient of the loss with respect to model parameters
    loss.backward()
    # perform a single optimization step (parameter update)
    optimizer.step()
    # record training loss
    train_losses.append(loss.item())

Output:
training has started

output tensor([[-0.0406, -0.0517],
[-0.0390, -0.0647],
[-0.0264, -0.0449],
[-0.0102, -0.0353],
[-0.0430, -0.0276],
[-0.0200, -0.0663],
[-0.0251, -0.0698],
[-0.0400, -0.0760],
[-0.0667, -0.0711],
[-0.0206, -0.0685],
[-0.0214, -0.0880],
[-0.0150, -0.0579],
[ 0.0153, -0.0565],
[-0.0247, -0.0516],
[-0.0076, -0.0543],
[-0.0179, -0.0716]], device='cuda:0', grad_fn=)
output.data tensor([[-0.0406, -0.0517],
[-0.0390, -0.0647],
[-0.0264, -0.0449],
[-0.0102, -0.0353],
[-0.0430, -0.0276],
[-0.0200, -0.0663],
[-0.0251, -0.0698],
[-0.0400, -0.0760],
[-0.0667, -0.0711],
[-0.0206, -0.0685],
[-0.0214, -0.0880],
[-0.0150, -0.0579],
[ 0.0153, -0.0565],
[-0.0247, -0.0516],
[-0.0076, -0.0543],
[-0.0179, -0.0716]], device='cuda:0')
torch.max(label, 1)[1] tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
target label tensor([[1.],
[0.],
[1.],
[1.],
[0.],
[1.],
[0.],
[1.],
[0.],
[0.],
[0.],
[1.],
[1.],
[1.],
[0.],
[1.]], device='cuda:0')
torch.max(output, 1)[1] tensor([0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
predicted tensor([0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')

RuntimeError Traceback (most recent call last)
in ()
2 # early stopping patience; how long to wait after last time validation loss improved.
3 patience = 20
----> 4 model, train_loss, valid_loss = train_model(model, batch_size, patience, n_epochs)

4 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
1869 .format(input.size(0), target.size(0)))
1870 if dim == 2:
-> 1871 ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
1872 elif dim == 4:
1873 ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

RuntimeError: multi-target not supported at /pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:15

I also tried it without .data, but that did not work either.

I look forward to your favourable reply!

It looks like your label tensor has the shape [16, 1] instead of [16].
Could you squeeze the unnecessary dimension using label = label.squeeze(1) and pass the result to the criterion?
Also, pass output directly to the criterion, not output.data, and call the model directly via model(x1, x2) instead of model.forward.


Thank you very much, you are absolutely the best! @ptrblck
The only thing I had to change was to make the label long after squeezing. Otherwise everything worked like magic <3
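
Putting the three suggestions and the long conversion together, one training step might look like the sketch below. TinyPairModel, the feature size of 8, the learning rate and the random data are hypothetical stand-ins for the Siamese network and loader from the post; only the label shape [16, 1] and the two-class output follow the printout.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the Siamese network: any model returning (batch, 2) scores.
class TinyPairModel(nn.Module):
    def __init__(self, in_features=8, num_classes=2):
        super().__init__()
        self.fc = nn.Linear(2 * in_features, num_classes)

    def forward(self, x1, x2):
        return self.fc(torch.cat([x1, x2], dim=1))

model = TinyPairModel()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x1 = torch.randn(16, 8)
x2 = torch.randn(16, 8)
label = torch.randint(0, 2, (16, 1)).float()  # shape [16, 1], float, as in the printout

model.train()
optimizer.zero_grad()
output = model(x1, x2)            # call the module itself, not model.forward
target = label.squeeze(1).long()  # shape [16], dtype int64
loss = criterion(output, target)  # pass output, not output.data, so gradients can flow
loss.backward()
optimizer.step()

print(output.shape, target.shape, loss.item())
```

Passing output.data would detach the scores from the graph, so loss.backward() would have nothing to differentiate through even once the target shape is fixed.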


Thanks for your great work!