Expected input batch_size (1) to match target batch_size (11)

I know this seems to be a common problem but I wasn’t able to find a solution. I’m running a multi-label classification model and having issues with tensor sizing.

My full code looks like this:

from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch

# Instantiating tokenizer and model
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-cased')
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-cased')

# Instantiating quantized model
quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

# Forming data tensors
input_ids = torch.tensor(tokenizer.encode(x_train[0], add_special_tokens=True)).unsqueeze(0)
labels = torch.tensor(Y[0]).unsqueeze(0)

# Forward pass (returns loss and logits when labels are passed)
outputs = quantized_model(input_ids, labels=labels)
loss, logits = outputs[:2]

Which yields the error:

ValueError: Expected input batch_size (1) to match target batch_size (11)

Input_ids looks like:

tensor([[  101,   789,   160,  1766,  1616,  1110,   170,  1205,  7727,  1113,
           170,  2463,  1128,  1336,  1309,  1138,   112,   119, 11882, 11545,
           119,   108, 15710,   108,  3645,   108,  3994,   102]])

with shape:

torch.Size([1, 28])

and labels looks like:

tensor([[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]])

with shape:

torch.Size([1, 11])

The size of input_ids will vary as the strings to be encoded vary in size.

I also noticed that when feeding in 5 values of Y to produce 5 labels, it yields the error:

ValueError: Expected input batch_size (1) to match target batch_size (55).

with labels shape:

torch.Size([1, 5, 11])

(Note that I didn’t feed 5 input_ids, which is presumably why input size remains constant)

I’ve tried a few different approaches to getting these to work, but I’m currently at a loss. I’d really appreciate some guidance. Thanks!

Which criterion are you using? Could you also post the code that yields the error, along with the output and target shapes?
The error message doesn’t match the target shape you provided, so we would need to check whether some additional processing is being done.
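One possible source of the mismatch is a flattening step inside the model's loss computation. The sketch below is not taken from this thread: it assumes the classification head has 2 outputs (the default when num_labels is not set) and that the labels are flattened with view(-1) before being passed to a cross-entropy loss. Under those assumptions, a [1, 11] target becomes 11 entries and a [1, 5, 11] target becomes 55, while the logits still have a batch size of 1.

```python
import torch
import torch.nn.functional as F

num_labels = 2                        # assumed size of the default classification head
logits = torch.randn(1, num_labels)   # stand-in for the model output for one sentence

flattened_sizes = []
errors = []
for label_shape in [(1, 11), (1, 5, 11)]:
    labels = torch.zeros(label_shape, dtype=torch.long)
    flat = labels.view(-1)            # assumed flattening step inside the model
    flattened_sizes.append(flat.numel())
    try:
        # Cross entropy expects one class index per row of logits
        F.cross_entropy(logits.view(-1, num_labels), flat)
    except (ValueError, RuntimeError) as err:
        errors.append(str(err))
        print(label_shape, "->", err)
```

If this assumption holds, the batch sizes in the error messages (11 and 55) are simply the total number of elements in the labels tensor.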

Hi there! Thanks for the reply! The first code block in the original message yields the error message directly under it, and this code:

from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch

# Instantiating tokenizer and model
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-cased')
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-cased')

# Instantiating quantized model
quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

# Forming data tensors
input_ids = torch.tensor(tokenizer.encode(x_train[0], add_special_tokens=True)).unsqueeze(0)
labels = torch.tensor(Y[:5]).unsqueeze(0)

# Forward pass (returns loss and logits when labels are passed)
outputs = quantized_model(input_ids, labels=labels)
loss, logits = outputs[:2]

yields the last error message I sent.

Y is:

array([[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
       [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
       [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)

and x_train is:

array(["“Worry is a down payment on a problem you may never have'. \xa0Joyce Meyer.  #motivation #leadership #worry ",
       'Whatever you decide to do make sure it makes you happy. ',
       "@Max_Kellerman  it also helps that the majority of NFL coaching is inept. Some of Bill O'Brien's play calling was wow, ! #GOPATS ",
       "Accept the challenges so that you can literally even feel the exhilaration of victory.' -- George S. Patton 🐶 ",
       "My roommate: it's okay that we can't spell because we have autocorrect. #terrible #firstworldprobs "],
      dtype=object)

input_ids and labels are of types:
torch.int64 and torch.uint8, respectively

For the first block of code in the original message, the shapes of input_ids and labels are:
torch.Size([1, 28]) and torch.Size([1, 11]), respectively

For the first block of code in the original message, the output for input_ids is:

tensor([[  101,   789,   160,  1766,  1616,  1110,   170,  1205,  7727,  1113,
           170,  2463,  1128,  1336,  1309,  1138,   112,   119, 11882, 11545,
           119,   108, 15710,   108,  3645,   108,  3994,   102]])

and the output for labels is:

tensor([[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]], dtype=torch.uint8)

I’m not entirely sure what you mean by criterion that I’m using, so please let me know if there’s anything else I can send you that’d help!

It seems that DistilBertForSequenceClassification does not support multi-label classification, which may be why the data isn’t being recognized in the form I’m trying to feed it.

That might be the case, as the docs state that labels are expected to have the shape [batch_size,].
I’m not familiar with the implementation and don’t know whether there is an easy switch to multi-label classification (I couldn’t find that information).
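For reference, a common way to set up multi-label classification in plain PyTorch is to produce one logit per label and use nn.BCEWithLogitsLoss with float multi-hot targets of the same shape as the logits. The sketch below is only an illustration under assumptions: the encoder output is replaced by a random tensor, and hidden_size, classifier, and pooled are hypothetical names, not part of the DistilBERT API. It also uses one row of labels per input, so 5 label rows go with 5 inputs rather than with a single one.

```python
import torch
from torch import nn

torch.manual_seed(0)

batch_size, hidden_size, num_labels = 5, 768, 11

# Stand-in for a pooled encoder output, one vector per input sentence (random here)
pooled = torch.randn(batch_size, hidden_size)

# Hypothetical classification head producing one logit per label
classifier = nn.Linear(hidden_size, num_labels)
logits = classifier(pooled)                      # shape [5, 11]

# Multi-hot targets as in Y; the loss needs float targets shaped like the logits
labels = torch.tensor(
    [[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1],
     [0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
     [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
     [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]],
    dtype=torch.uint8,
).float()

# Sigmoid + binary cross entropy, applied independently to every label
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, labels)

# At inference time, threshold each label's probability separately
predictions = (torch.sigmoid(logits) > 0.5).int()
print(logits.shape, labels.shape, loss.item())
```

The 0.5 threshold is an arbitrary starting point and would normally be tuned per label on validation data.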