I know this seems to be a common problem, but I wasn’t able to find a solution. I’m running a multi-label classification model and getting a batch-size mismatch between my input and label tensors.
My full code looks like this:
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch
# Instantiating tokenizer and model
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-cased')
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-cased')
# Instantiating quantized model
quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)
# Forming data tensors
input_ids = torch.tensor(tokenizer.encode(x_train[0], add_special_tokens=True)).unsqueeze(0)
labels = torch.tensor(Y[0]).unsqueeze(0)
# Forward pass (returns loss and logits when labels are given)
outputs = quantized_model(input_ids, labels=labels)
loss, logits = outputs[:2]
Which yields the error:
ValueError: Expected input batch_size (1) to match target batch_size (11)
input_ids looks like:
tensor([[  101,   789,   160,  1766,  1616,  1110,   170,  1205,  7727,  1113,
           170,  2463,  1128,  1336,  1309,  1138,   112,   119, 11882, 11545,
           119,   108, 15710,   108,  3645,   108,  3994,   102]])
with shape:
torch.Size([1, 28])
and labels looks like:
tensor([[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]])
with shape:
torch.Size([1, 11])
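For reference, the mismatch can be reproduced without loading the model. This is only a sketch: it assumes the classification head has the default two outputs (no num_labels was passed to from_pretrained) and that the loss is a cross-entropy over flattened labels, as in single-label classification. The random logits tensor and the helper name flattened_ce_loss are stand-ins, not part of the code above.

```python
import torch
from torch import nn

num_labels = 2  # assumption: default head size when num_labels is not set
logits = torch.randn(1, num_labels)  # stand-in for the model output, shape [1, 2]
labels = torch.tensor([[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]])  # shape [1, 11]


def flattened_ce_loss(logits, labels, num_labels):
    # Hypothetical helper: flattening makes every label element its own target,
    # so 1 row of logits is compared against 11 targets.
    return nn.CrossEntropyLoss()(logits.view(-1, num_labels), labels.view(-1))


try:
    flattened_ce_loss(logits, labels, num_labels)
    error_message = None
except (ValueError, RuntimeError) as exc:
    error_message = str(exc)

print(error_message)
```

If that assumption holds, the 11 in the error is simply the number of elements in labels, not the number of examples.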
The size of input_ids will vary as the strings to be encoded vary in size.
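Since the sequence lengths vary, batching more than one example would need padding. Below is a minimal sketch using torch.nn.utils.rnn.pad_sequence. The token ids are made-up illustrative values, and PAD_ID = 0 is an assumption about the tokenizer's vocabulary (tokenizer.pad_token_id would be the value to check).

```python
import torch
from torch.nn.utils.rnn import pad_sequence

PAD_ID = 0  # assumption: the [PAD] token has id 0 in this vocabulary

# Two illustrative id sequences of different lengths (not real encodings)
sequences = [
    torch.tensor([101, 789, 160, 102]),
    torch.tensor([101, 1766, 1616, 1110, 170, 102]),
]

# Pad to the longest sequence so the batch is a single [2, 6] tensor
input_ids = pad_sequence(sequences, batch_first=True, padding_value=PAD_ID)

# Mask is 1 for real tokens and 0 for padding positions
lengths = torch.tensor([len(s) for s in sequences])
positions = torch.arange(input_ids.size(1))
attention_mask = (positions[None, :] < lengths[:, None]).long()

print(input_ids.shape, attention_mask)
```

The mask would then be passed to the model alongside input_ids so that padded positions are ignored.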
I also noticed that when feeding in 5 values of Y to produce 5 labels, it yields the error:
ValueError: Expected input batch_size (1) to match target batch_size (55).
with labels shape:
torch.Size([1, 5, 11])
(Note that I didn’t feed 5 input_ids, which is presumably why the input batch size stays at 1.)
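The 11 and 55 in the two errors match the total number of label elements (1 × 11 and 1 × 5 × 11), which suggests the labels are being flattened into one target per element. The sketch below checks that arithmetic and shows one common multi-label alternative, torch.nn.BCEWithLogitsLoss, where logits and labels share the shape [batch, 11]. The random logits are a stand-in for a head with 11 outputs, which is an assumption and not what the code above builds.

```python
import torch
from torch import nn

num_labels = 11  # assumption: one output unit per label column in Y

labels = torch.tensor([[0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]], dtype=torch.float)  # [1, 11]
five_labels = labels.repeat(5, 1).unsqueeze(0)  # [1, 5, 11], like the 5-row case

# Flattening turns every element into a separate target
flat_single = labels.view(-1).shape[0]
flat_five = five_labels.view(-1).shape[0]

# Stand-in for the output of an 11-unit head for one input
logits = torch.randn(1, num_labels)

# Multi-label loss: an independent sigmoid + binary cross-entropy per label,
# which expects float targets with the same shape as the logits
loss = nn.BCEWithLogitsLoss()(logits, labels)

print(flat_single, flat_five, loss.item())
```

With transformers, a comparable setup would presumably pass num_labels=11 and problem_type="multi_label_classification" to from_pretrained and use float labels; whether that option is available depends on the installed version.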
I’ve tried a few different approaches to getting these to work, but I’m currently at a loss. I’d really appreciate some guidance. Thanks!