RuntimeError: multi-target not supported (newbie)

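import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as opt  # assumed alias, matching opt.SGD below
from torch.utils.data import TensorDataset, DataLoader, random_split

y = df['3']
X = df.drop('3', axis=1)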

from sklearn.preprocessing import LabelEncoder 
my_label = LabelEncoder()
y = my_label.fit_transform(y)

X = torch.tensor(X.to_numpy() , dtype= torch.float32)
y= torch.tensor(y)

train = TensorDataset(X,y)
train_, val_, test_ = random_split(train,[ 7000  , 700 , 2956])
batch_size = 512
train_dl = DataLoader(train_ ,batch_size , shuffle=True)
val_dl = DataLoader(val_ , batch_size)
test_dl = DataLoader(test_ , batch_size)

output_size = 9
input_size  = X.shape[1]

class RFCLF(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(input_size , 9 )
        
    def forward(self, xb):
        xb = self.linear1(xb)
        return xb
log_reg = RFCLF()
val_loss , train_loss = [] ,[] 
loss_func = F.cross_entropy
lr = 0.084
def trainer(model , epochs , lr , val_dl , train_dl  ,loss_func , min_loss, Opt_func=opt.SGD):
    min_l = min_loss
    Op = Opt_func(model.parameters() , lr)

    for epoch in range(epochs):
        #training
        for xb , yb in train_dl:
            Op.zero_grad()
            yhat = model(xb)
            tloss = loss_func(yhat.view(512, 9) , yb.long())
            tloss.backward()
            Op.step()
            
            train_loss.append(tloss) 
            if tloss < min_l:
                print("{} number of epochs and loss of {}".format(epoch , tloss))
                torch.save(model.state_dict() , 'm14.pth')
            
        #validation
        for xb,yb in val_dl:
            yval = model(xb)
            vloss = loss_func(yval , yb.long())
            val_loss.append(vloss)

IndexError: Target 9 is out of bounds

Earlier you mentioned that there are 10 classes in total. Why are you reshaping the output to size 9? What is the shape of the target yb? Can you please tell me the original shapes of yhat coming out of the model and of yb (without any reshaping)?
In your code, what is the shape of yhat for a given xb in this line:
yhat = model(xb)

And what is the shape of yb for that xb?
Please tell me just these two things!

After reading the many suggestions, I re-encoded the labels from 1-10 to 0-9.

torch.Size([512])

torch.Size([512, 9])

torch.Size([512])

I see the issue: the model outputs a tensor of shape [batch, classes], while the target is a 1D tensor.
Just solved this.
I used nn.CrossEntropyLoss() instead of F.cross_entropy and passed ignore_index=9, reduction='mean' as arguments before computing the loss.

I hope this helps someone

class RFCLF(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(input_features  , 20)
        self.linear2 = nn.Linear(20 , output_size )
        
    def forward(self, xb):
        
        xb = self.linear1(xb)
        xb = self.linear2(xb) 
        return xb

log_reg = RFCLF()

for inputs, targets in train_dl:
    output = log_reg(inputs)
   # print(targets.view(-1 ,1).shape)
    print('output.shape:',output.shape)
    print('target.shape:' , targets.shape)
    #print('output class prob.shape:', output[1].data)
    
    l = nn.CrossEntropyLoss(ignore_index=9, reduction='mean')
    ll = l(output , targets)
    print(ll)
    break

output.shape: torch.Size([20, 10])
target.shape: torch.Size([20])
tensor(2.1998, grad_fn=<NllLossBackward>)
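
A side note on what ignore_index does, since it comes up above. As far as I understand the documented behaviour, targets equal to ignore_index are left out of the loss (and so out of the gradient), which means samples labelled 9 are skipped rather than learned. With a 10-unit output layer and labels 0-9, every label is already a valid index, so ignore_index should not be needed. A minimal sketch with made-up logits and targets (the tensor values are assumptions, not data from this thread):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

num_classes = 10                       # labels 0..9, as in the thread
logits = torch.randn(4, num_classes)   # hypothetical model output: [batch, classes]
targets = torch.tensor([0, 3, 9, 9])   # two of the four samples carry label 9

plain = nn.CrossEntropyLoss(reduction='mean')
masked = nn.CrossEntropyLoss(ignore_index=9, reduction='mean')

loss_all = plain(logits, targets)      # mean over all 4 samples
loss_masked = masked(logits, targets)  # mean over the samples whose label is not 9

# The masked loss matches the plain loss computed only on the kept samples.
keep = targets != 9
loss_kept_only = plain(logits[keep], targets[keep])
print(loss_all.item(), loss_masked.item(), loss_kept_only.item())
```

So if class 9 is a real class you want the model to predict, dropping ignore_index and keeping 10 outputs is probably the safer choice.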

I have the same issue. Can anyone help? I have posted screenshots of my notebook.

I assume you are using nn.NLLLoss as the criterion, since you are applying log_softmax as the last activation function.
Have you had a look at the solutions provided in this thread, i.e. check the target shape and make sure it’s [batch_size] without additional dimensions for a multi-class classification use case?
If so, could you print the model output shape as well as the target shape?

PS: you can post code snippets by wrapping them into three backticks ``` :wink:
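
To make the shape check above concrete, here is a small sketch of the log_softmax + nn.NLLLoss setup with a target that carries an extra dimension. The sizes and random tensors are assumptions for illustration only; the exact error text may differ between PyTorch versions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, nb_classes = 8, 10                           # hypothetical sizes
logits = torch.randn(batch_size, nb_classes)
target = torch.randint(0, nb_classes, (batch_size, 1))   # extra dim: [8, 1]

log_probs = F.log_softmax(logits, dim=1)                 # what the model would return
criterion = nn.NLLLoss()

print('output shape:', log_probs.shape)                  # [8, 10]
print('target shape:', target.shape)                     # [8, 1]

try:
    criterion(log_probs, target)                         # 2D target is rejected
except (RuntimeError, ValueError) as err:
    print('error:', err)

target_1d = target.squeeze(1)                            # drop the extra dim -> [8]
loss = criterion(log_probs, target_1d)
print('loss:', loss.item())
```

Note that log_softmax followed by nn.NLLLoss gives the same value as F.cross_entropy on the raw logits.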

It really works! But I'm confused about why it works. :slight_smile:


I used criterion(outputs, torch.max(labels, 1)[1]) and it works. However, criterion(outputs, labels.squeeze()) causes another problem: "RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)". Anyway, thanks a lot!

Using nn.CrossEntropyLoss or nn.NLLLoss there is no point in creating one-hot encoded targets.
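
If your labels are already one-hot encoded, one way to get back to class indices is argmax over the class dimension. A minimal sketch, assuming made-up labels and random logits:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

nb_classes = 4                                           # hypothetical
class_idx = torch.tensor([2, 0, 3])                      # hypothetical labels, shape [3]
one_hot = F.one_hot(class_idx, num_classes=nb_classes)   # shape [3, 4]

recovered = one_hot.argmax(dim=1)                        # back to class indices, shape [3]

logits = torch.randn(3, nb_classes)                      # stand-in for model output
loss = nn.CrossEntropyLoss()(logits, recovered)          # targets passed as indices
print(recovered, loss.item())
```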

^ I ended up switching to an ordinally encoded label based on this.

However, I still have num_classes==num_neurons in my output layer because:

  • You get softmax predictions, i.e. a probability for each class.
  • More neurons allow for more parameters/edges, i.e. more information coming into the output layer. You’ve probably got many neurons in your last hidden layer, and they would struggle to jam info about all of the labels into 1 neuron. It lets the activations spread out?
  • Output activations are closer to 1. Less chance of exploding gradient?

Also, don’t forget that you can write your own loss function. This would allow you to swap in the Keras class you really want.

The model output tensor is expected to have the shape [batch_size, nb_classes], so using an output neuron for each class (in a multi-class or multi-label classification use case) is the right approach. The targets should only be passed as class indices (not one-hot encoded) to nn.CrossEntropyLoss or nn.NLLLoss.


thanks mate worked for me

This works for me too. Thank you

Hi @cerkauskas, can you please tell me what torch.max(labels, 1)[1] is actually doing? When I print my target's dimensions, they are exactly [batch_size, 1], but it still keeps throwing this error. Thank you!

Please see here for torch.max:

https://pytorch.org/docs/stable/generated/torch.max.html

Here is some code to give you an intuition for the function and its outputs.

import torch


batch_size = 100
num_choices = 5
dummy_outputs = torch.rand(batch_size, num_choices)

print(torch.max(dummy_outputs, dim=1)[0]) #print the max values along dim=1

print(torch.max(dummy_outputs, dim=1)[1]) #print the max indices along dim=1

You will also need to consider what your loss function needs as an input. What kind of loss function are you using?
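
One thing that may be worth checking for the [batch_size, 1] case mentioned above: torch.max(labels, 1)[1] returns the position of the maximum within each row. That recovers the class index for one-hot labels, but for a [batch_size, 1] tensor of class indices each row has a single entry, so the result is always 0. A small sketch with hand-written tensors (hypothetical data):

```python
import torch

# Case 1: one-hot labels, shape [2, 3]
one_hot = torch.tensor([[0., 0., 1.],
                        [1., 0., 0.]])
values, indices = torch.max(one_hot, 1)   # max along dim 1
print(indices)                            # tensor([2, 0]): the class index of each row

# Case 2: class indices stored with an extra dimension, shape [2, 1]
col = torch.tensor([[2], [0]])
_, col_indices = torch.max(col, 1)
print(col_indices)                        # tensor([0, 0]): position in a 1-element row, not the label

# For case 2, dropping the extra dimension keeps the actual labels.
fixed = col.squeeze(1)
print(fixed)                              # tensor([2, 0]), shape [2]
```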

Correct Target Tensor Shape for a Single Sample

If you have a single input sample, your target tensor should be of shape (1,) with the single element being the class index.

For example: target = torch.tensor([class_index]) where class_index is the integer representing the class.

Correct Target Tensor Shape for a Batch

For a batch of samples, the target tensor should be a 1D tensor, with each element corresponding to the class index of each sample in the batch.

For example: target = torch.tensor([class_index1, class_index2, …, class_indexN]) for a batch of N samples.
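
Both cases can be sketched as follows, assuming nn.CrossEntropyLoss as the criterion and random logits standing in for a model output (nb_classes and the label values are made up):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

criterion = nn.CrossEntropyLoss()
nb_classes = 5                                   # hypothetical

# Single sample: logits [1, nb_classes], target [1]
logits_one = torch.randn(1, nb_classes)
target_one = torch.tensor([3])
loss_one = criterion(logits_one, target_one)

# Batch of 4 samples: logits [4, nb_classes], target [4]
logits_batch = torch.randn(4, nb_classes)
target_batch = torch.tensor([0, 4, 2, 3])
loss_batch = criterion(logits_batch, target_batch)

print(target_one.shape, target_batch.shape)
print(loss_one.item(), loss_batch.item())
```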