RuntimeError: shape '[-1, 2]' is invalid for input of size 9, please help me figure out this error

Hi Ravindra,
The code that you have posted most probably does not contain the line that’s producing the error RuntimeError: shape ‘[-1, 2]’ is invalid for input of size 9.
You must be trying to reshape/view a tensor somewhere, and that is what’s causing this error. The number of elements in your initial tensor (9) isn’t compatible with the shape that you’re trying to achieve (some number of rows * 2 columns), because 9 is not divisible by 2.
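To make the arithmetic concrete, here is a minimal sketch in plain Python of how a -1 in a target shape gets inferred from the element count. The helper resolve_view_shape is hypothetical and written only for illustration; it is not part of PyTorch, though it roughly mirrors the check that view performs.

```python
def resolve_view_shape(numel, shape):
    """Hypothetical helper: infer the single -1 dimension from the element count."""
    shape = tuple(shape)
    if shape.count(-1) > 1:
        raise ValueError("only one dimension can be inferred")

    # Product of all explicitly given dimensions.
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim

    if -1 in shape:
        # The inferred dimension must be a whole number.
        if known == 0 or numel % known != 0:
            raise ValueError(
                f"shape {list(shape)} is invalid for input of size {numel}"
            )
        return tuple(numel // known if dim == -1 else dim for dim in shape)

    if known != numel:
        raise ValueError(
            f"shape {list(shape)} is invalid for input of size {numel}"
        )
    return shape


print(resolve_view_shape(9, (-1, 3)))  # (3, 3)

try:
    resolve_view_shape(9, (-1, 2))
except ValueError as err:
    print(err)  # 9 elements cannot be split into rows of 2
```

With 9 elements, a target of (-1, 3) resolves to 3 rows, while (-1, 2) has no whole-number solution.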

Could you please post the relevant part of the code? I can help debug further.

This is my model

from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')

This is my basic text data

texts = ['I loved the movie. It was great!',
         'The food was terrible.',
         'The weather is okay.']
sentiments = ['positive', 'negative', 'neutral']

Tokenize the text samples

encoded_texts = tokenizer(texts, padding = True, truncation = True, return_tensors = 'pt')
input_ids = encoded_texts['input_ids']
decoding = tokenizer.decode(input_ids[2])

Attention Mask
attention_mask = encoded_texts['attention_mask']

sentiment_labels = [sentiments.index(sentiment) for sentiment in sentiments]

import torch.nn as nn
num_classes = len(set(sentiment_labels))
classification_head = nn.Linear(model.config.hidden_size, num_classes)
model.classifier = classification_head

import torch
import torch.optim as optim

optimizer = optim.AdamW(model.parameters(), lr = 2e-5)  # lr = 0.00002
criterion = torch.nn.CrossEntropyLoss()

num_epochs = 3
for epoch in range(num_epochs):
  output = model(input_ids, attention_mask = attention_mask, labels = torch.tensor(sentiment_labels))
  loss = output.loss

The code above is simple, but I wasn’t able to figure out where I am making a mistake.
Thank you @srishti-git1110

Thanks for the code!
I looked into Hugging Face’s source code and found that it is indeed a view operation that’s causing this error. Specifically, it is this line:

loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))

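This line is in the forward method, so self (the calling object) refers to the model. If you run print(model.num_labels), it gives 2 rather than 3, which explains the 2 in your error: RuntimeError: shape ‘[-1, 2]’ is invalid for input of size 9. The 9 comes from the logits: you replaced model.classifier with a 3-output layer, so your batch of 3 texts produces logits of shape (3, 3), i.e. 9 elements, and those cannot be viewed as rows of 2.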
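The mismatch can be reproduced without loading the model at all. This is a minimal sketch, assuming stand-in logits of zeros with shape (3, 3) (3 texts, 3 classes) in place of the real model output; the variable names are only illustrative.

```python
import torch

# Stand-in for the model output: 3 samples, 3 class scores each (assumption).
logits = torch.zeros(3, 3)
labels = torch.tensor([0, 1, 2])

# What the loss line does while num_labels is still 2.
try:
    logits.view(-1, 2)
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# The same line once num_labels matches the classifier width.
loss_fct = torch.nn.CrossEntropyLoss()
loss = loss_fct(logits.view(-1, 3), labels.view(-1))
print(loss.item())
```

The first view fails with the same message as in the question, while the second one goes through and yields a scalar loss.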

To fix this, you only need to change one line of code:

model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=3)
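As a rough check of what num_labels=3 changes, the sketch below builds a tiny, randomly initialised DistilBERT from a config instead of downloading 'distilbert-base-uncased'. The small config values and the random token ids are assumptions made only to keep the example light. It also adds the backward pass and optimizer step that a training loop would normally need.

```python
import torch
from transformers import DistilBertConfig, DistilBertForSequenceClassification

# Assumption: a tiny random-weight config so that nothing is downloaded.
config = DistilBertConfig(n_layers=1, n_heads=2, dim=32, hidden_dim=64, num_labels=3)
model = DistilBertForSequenceClassification(config)

print(model.num_labels)               # 3
print(model.classifier.out_features)  # 3

# Placeholder batch: 3 sequences of 8 random token ids (not real tokenizer output).
input_ids = torch.randint(0, config.vocab_size, (3, 8))
attention_mask = torch.ones_like(input_ids)
labels = torch.tensor([0, 1, 2])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):
    optimizer.zero_grad()
    output = model(input_ids, attention_mask=attention_mask, labels=labels)
    output.loss.backward()  # compute gradients
    optimizer.step()        # update the weights
```

Since the model already creates a 3-output classifier when num_labels=3 is passed, replacing model.classifier by hand should no longer be necessary.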