RuntimeError: multi-target not supported

Hi, I’m training an RNN for intent prediction. There are 11 actions and 29 slots for each action.
For example:
I’d like 2 tickets to see Zoolander 2 tomorrow at Regal Meridian 16 theater in Seattle at 9:25 PM
request(ticket;moviename=Zoolander 2;date=tomorrow;theater=Regal Meridian 16;city=Seattle;starttime=9:25 PM;numberofpeople=2)

I used TfidfVectorizer to transform the text into vectors. Below is the code for training.


if __name__ == '__main__':
	parser = argparse.ArgumentParser()
	parser.add_argument('--slots', dest='run_mode', type=int, default=0, help='only predict action')

	args = parser.parse_args()
	params = vars(args)

	run_mode = params['run_mode']

	raw = load_file()
	if(run_mode == 1):
		features_numpy, targets_numpy = Data_Preproccess(raw)
	elif(run_mode == 0):
		features_numpy, targets_numpy = Data_Preproccess2(raw)
	# train test split. Size of train data is 80% and size of test data is 20%. 
	features_train, features_test, targets_train, targets_test = train_test_split(features_numpy, targets_numpy, test_size = 0.2, random_state = 42)
	# create feature and targets tensor for train set. As you remember we need variable to accumulate gradients. Therefore first we create tensor, then we will create variable
	featuresTrain = torch.from_numpy(features_train)
	targetsTrain = torch.from_numpy(targets_train).type(torch.LongTensor) # data type is long
	# create feature and targets tensor for test set.
	featuresTest = torch.from_numpy(features_test)
	targetsTest = torch.from_numpy(targets_test).type(torch.LongTensor) # data type is long

	# batch_size, epoch and iteration
	batch_size = 100
	n_iters = 2500
	num_epochs = n_iters / (len(features_train) / batch_size)
	num_epochs = int(num_epochs)

	# Pytorch train and test sets
	train = torch.utils.data.TensorDataset(featuresTrain, targetsTrain)
	test = torch.utils.data.TensorDataset(featuresTest, targetsTest)

	# data loader
	train_loader = torch.utils.data.DataLoader(train, batch_size = batch_size, shuffle = False)
	test_loader = torch.utils.data.DataLoader(test, batch_size = batch_size, shuffle = False)
	# Create RNN
	input_dim = 4773   # input dimension
	hidden_dim = 100  # hidden layer dimension
	layer_dim = 2     # number of hidden layers
	output_dim = 11*29   # output dimension

	model = RNNModel(input_dim, hidden_dim, layer_dim, output_dim)

	# Cross Entropy Loss 
	error = nn.CrossEntropyLoss()

	# SGD Optimizer
	learning_rate = 0.05
	optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
	seq_dim = 1 
	loss_list = []
	iteration_list = []
	accuracy_list = []
	count = 0
	for epoch in range(num_epochs):
		for i, (images, labels) in enumerate(train_loader):

			train  = Variable(images.view(-1, seq_dim, input_dim))
			labels = Variable(labels)
			# Clear gradients
			optimizer.zero_grad()
			# Forward propagation
			outputs = model(train.float())
			# Calculate softmax and cross entropy loss
			loss = error(outputs, labels)
			# Calculating gradients
			loss.backward()
			# Update parameters
			optimizer.step()
			count += 1
			if count % 250 == 0:
				# Calculate Accuracy         
				correct = 0
				total = 0
				# Iterate through test dataset
				for images, labels in test_loader:
					images = Variable(images.view(-1, seq_dim, input_dim))
					# Forward propagation
					outputs = model(images)
					# Get predictions from the maximum value
					predicted = torch.max(outputs.data, 1)[1]
					# Total number of labels
					total += labels.size(0)
					correct += (predicted == labels).sum()
				accuracy = 100 * correct / float(total)
				# store loss and iteration
				if count % 500 == 0:
					# Print Loss
					print('Iteration: {}  Loss: {}  Accuracy: {} %'.format(count, loss.item(), accuracy))


Traceback (most recent call last):
  File "", line 85, in <module>
    loss = error(outputs, labels)
  File "/anaconda3/lib/python3.6/site-packages/torch/nn/modules/", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/anaconda3/lib/python3.6/site-packages/torch/nn/modules/", line 862, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
  File "/anaconda3/lib/python3.6/site-packages/torch/nn/", line 1550, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/anaconda3/lib/python3.6/site-packages/torch/nn/", line 1407, in nll_loss
    return torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: multi-target not supported at /Users/soumith/miniconda2/conda-bld/pytorch_1532623076075/work/aten/src/THNN/generic/ClassNLLCriterion.c:21

I can see that the problem is at loss = error(outputs, labels), since outputs should be 1D. However, I printed its shape and the result is torch.Size([100, 319]). How should I fix this?

This error is usually thrown when you try to pass a target tensor having an invalid shape, e.g. [batch_size, 1]. Could you check it and squeeze() it if necessary?
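
As a minimal sketch of that check (the tensors below are made-up stand-ins, not the data from the question), this shows a target with a stray trailing dimension being squeezed to the 1D shape that nn.CrossEntropyLoss accepts for class indices:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, num_classes = 4, 5  # illustrative sizes, not from the question

# Stand-in for the model output: [batch_size, num_classes]
logits = torch.randn(batch_size, num_classes)
# Class indices with an extra dimension: [batch_size, 1]
target = torch.randint(0, num_classes, (batch_size, 1))

criterion = nn.CrossEntropyLoss()

# Passing the 2D integer target directly is what triggers the
# "multi-target not supported" error, so drop the extra dimension first.
target_1d = target.squeeze(1)  # shape: [batch_size]
loss = criterion(logits, target_1d)

print(target.shape, target_1d.shape, loss.item())
```

Printing labels.shape inside the training loop is a quick way to see whether this applies to your data.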


May I ask what the input size should be when I throw my data into the model?

The data you pass into the model depends on the layers you are using.
In case you are using a linear layer, your data shape should be [batch_size, num_features].

nn.CrossEntropyLoss expects logits in the shape of [batch_size, number_of_classes] and a target tensor of shape [batch_size] with class indices as its values.
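
If the targets happen to be stored as one-hot rows of shape [batch_size, number_of_classes], one possible fix is to convert them to class indices first. The sketch below assumes exactly one active class per row, which may not hold for the action/slot labels in the question; the one-hot data is generated here purely for illustration, and only the sizes 100 and 319 are taken from the post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
batch_size, num_classes = 100, 319  # sizes reported in the question

# Stand-in for the model output (raw, unnormalized scores)
logits = torch.randn(batch_size, num_classes)

# Hypothetical one-hot targets: [batch_size, num_classes]
class_idx = torch.randint(0, num_classes, (batch_size,))
one_hot = F.one_hot(class_idx, num_classes)

# Recover the class indices: shape [batch_size], dtype torch.int64
target = one_hot.argmax(dim=1)

loss = nn.CrossEntropyLoss()(logits, target)
print(logits.shape, target.shape, loss.item())
```

If a sample can have several active classes at once, argmax would silently drop all but one of them, and a multi-label loss is the better fit.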


I faced the same problem. The label for each of my inputs is a vector of size 8, e.g. [1, 0.2, 0.5, 0.33, 0.666, 1, 0.25, 0.133], with different values for each input. I want to learn this vector for each of my input images. I tried to use CrossEntropyLoss, but I got this error: RuntimeError: multi-target not supported.
batch_size = 3
target: torch.Size([3, 8])
predict: torch.Size([3, 8])
Do you have any suggestions on what type of loss function I should use to train my model?

nn.CrossEntropyLoss is used for a multi-class classification/segmentation use case and thus uses class indices in the target tensor.
Based on your description your target contains values in the range [0, 1], so you could try to use e.g. nn.BCEWithLogitsLoss or nn.MSELoss. I’m unsure what your exact use case is based on “I want to learn this vector for each of my input images”. If it’s a regression task, use nn.MSELoss, if it’s a “classification” task, you could probably try to use nn.BCEWithLogitsLoss.
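
A rough sketch of both options with the shapes from your post ([3, 8]); the tensors are random placeholders, and feeding the raw outputs to nn.MSELoss is just one possible choice for the regression case:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder for the model output: [batch_size, 8], raw scores without sigmoid
output = torch.randn(3, 8)
# Placeholder targets: float values in [0, 1], same shape as the output
target = torch.rand(3, 8)

# "Classification"-style: each of the 8 outputs is treated as an independent
# binary problem; soft (non-binary) float targets are accepted.
bce_loss = nn.BCEWithLogitsLoss()(output, target)

# Regression-style: mean squared error between outputs and target scores.
mse_loss = nn.MSELoss()(output, target)

print(bce_loss.item(), mse_loss.item())
```

Both criteria need a floating point target with the same shape as the model output.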

Thank you so much. The task is multilabel multiclass classification. I will try nn.BCEWithLogitsLoss.

In my task, the output has eight classes, and each image in my dataset has a score for every class. So instead of only zeros and ones in the target vector to show which classes the image belongs to, I have a score per class, e.g.
img1_label = [1, 0.2, 0.5, 0.33, 0.666, 1, 0.25, 0.133] and img2_label = [0.666, 1, 0.5, 0.33, 1, 0.5, 0.25, 0.133]. So for img1, classes 1 and 6 have a score of one, class 3 has a score of 0.5, and so on.
In this case, is nn.BCEWithLogitsLoss still a good option for the loss function? And if I use this loss function, do I need a sigmoid as the activation function on the last layer?

Yes, you could still use nn.BCEWithLogitsLoss as the criterion.

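No, you don't need a sigmoid on the last layer: nn.BCEWithLogitsLoss applies the sigmoid internally, so you would have to pass the raw logits to this loss function.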
You could use sigmoid + nn.BCELoss, but this would yield worse numerical stability.
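
To illustrate the difference, here is a small sketch with hand-picked values (chosen for illustration only). For moderate logits both approaches agree, while for a large logit the separate sigmoid saturates in float32 and the exact loss value is lost:

```python
import torch
import torch.nn as nn

# Moderate logits with soft targets: both formulations give the same loss.
logits = torch.tensor([[0.5, -1.0, 2.0]])
target = torch.tensor([[1.0, 0.0, 0.5]])

fused = nn.BCEWithLogitsLoss()(logits, target)
two_step = nn.BCELoss()(torch.sigmoid(logits), target)
print(torch.allclose(fused, two_step))

# A large logit paired with target 0: the exact loss is about 50.
big_logit = torch.tensor([[50.0]])
zero_target = torch.tensor([[0.0]])

stable = nn.BCEWithLogitsLoss()(big_logit, zero_target)
# torch.sigmoid(50.) rounds to exactly 1.0 in float32, so log(1 - p)
# can no longer be computed accurately in the two-step version.
naive = nn.BCELoss()(torch.sigmoid(big_logit), zero_target)

print(stable.item(), naive.item())
```

The fused version stays close to 50, while the two-step version reports a different value.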

Thanks for your help.

I am doing multi-label classification. When using CrossEntropyLoss I get this error:
RuntimeError: 1D target tensor expected, multi-target not supported
and when I change it to BCEWithLogitsLoss I get this error:
ValueError: Target size (torch.Size([1, 1])) must be the same as input size (torch.Size([1, 18]))

The label indices look as follows for the first 10 examples:
[[0, 1], [0], [0], [0, 1], [0, 2], [0, 1], [0], [0, 1], [0], [0]]

I have 18 classes. Would you please let me know what I am doing wrong?

Thank you

A multi-label classification is used if each sample can belong to zero, one, or multiple classes.
For this to work, the output tensor of the model as well as the target should have the shape [batch_size, nb_classes]. In your current example of the target tensor it seems that it’s using a variable shape, which won’t work, as it should contain the target values for each class.
Assuming you are working with 18 classes, both tensors are expected to have the shape [batch_size, 18], where the target contains values in [0, 1] for each class.
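
One way to build such a target from the variable-length index lists is to multi-hot encode them. This is a minimal sketch using the ten label lists from your post; the logits are random placeholders for the model output:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes = 18

# Label indices for the first 10 examples, copied from the post
label_indices = [[0, 1], [0], [0], [0, 1], [0, 2], [0, 1], [0], [0, 1], [0], [0]]

# Multi-hot encode: one row per sample, 1.0 at every active class index
target = torch.zeros(len(label_indices), num_classes)
for row, idxs in enumerate(label_indices):
    target[row, idxs] = 1.0

# Placeholder for the model output: [batch_size, num_classes], raw logits
logits = torch.randn(len(label_indices), num_classes)

loss = nn.BCEWithLogitsLoss()(logits, target)
print(target.shape, loss.item())
```

With this encoding the target is a float tensor of shape [10, 18], matching the model output.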
