import torch
import torch.optim as optim
optimizer = optim.AdamW(model.parameters(), lr = 2e-5) # lr = 0.00002
criterion = torch.nn.CrossEntropyLoss()
num_epochs = 4
for epoch in range(num_epochs):
    optimizer.zero_grad()
    output = model(input_ids, attention_mask = attention_mask, labels = torch.tensor(sentiment_labels))
    loss = output.loss
    loss.backward()
    optimizer.step()
My sentiment labels are 0, 1, and 2 for positive, negative, and neutral, respectively.
sentiment_labels = [sentiments.index(sentiment) for sentiment in sentiments]
print(sentiment_labels)
print(sentiments)
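The list comprehension above yields 0, 1, 2 only because each sentiment appears once and in that order. A minimal sketch of an explicit mapping, assuming the same three classes; `label2id` and `example_sentiments` are names made up for illustration and are not part of the original code:

```python
# Hypothetical explicit mapping; the ids follow the description above.
label2id = {'positive': 0, 'negative': 1, 'neutral': 2}

# Made-up example list in a different order, with a repeated label.
example_sentiments = ['negative', 'positive', 'neutral', 'negative']

# Look each label up in the mapping instead of relying on list position.
example_labels = [label2id[s] for s in example_sentiments]
print(example_labels)  # [1, 0, 2, 1]
```

With a fixed dictionary, the id of each class does not depend on where it first shows up in the data.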
Hi Ravindra,
The code you posted most likely does not contain the line that produces the error: RuntimeError: shape ‘[-1, 2]’ is invalid for input of size 9.
You are probably reshaping or viewing a tensor somewhere, and that is what raises this error. The number of elements in the original tensor (9) isn't compatible with the shape you are trying to get (some number of rows * 2 columns), because 9 is not divisible by 2.
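As a rough illustration of that rule, here is a small sketch with a made-up 9-element tensor (not the tensor from your code): asking for 3 columns works, asking for 2 columns does not.

```python
import torch

t = torch.arange(9)      # 9 elements, the same count as in the error message

ok = t.view(-1, 3)       # -1 is inferred as 3 rows, since 9 / 3 = 3
print(ok.shape)

try:
    t.view(-1, 2)        # 9 is not divisible by 2, so no row count fits
    message = None
except RuntimeError as err:
    message = str(err)
print(message)
```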
Could you please post the relevant part of the code? I can help debug further.
This is my model:
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
This is my basic text data
texts = ['I loved the movie. It was great!',
'The food was terrible.',
'The weather is okay.']
sentiments = ['positive', 'negative', 'neutral']
Tokenize the text samples
encoded_texts = tokenizer(texts, padding = True, truncation = True, return_tensors = 'pt')
input_ids = encoded_texts['input_ids']
decoding = tokenizer.decode(input_ids[2])
Attention Mask
attention_mask = encoded_texts['attention_mask']
sentiment_labels = [sentiments.index(sentiment) for sentiment in sentiments]
import torch.nn as nn
num_classes = len(set(sentiment_labels))
classification_head = nn.Linear(model.config.hidden_size, num_classes)
model.classifier = classification_head
import torch
import torch.optim as optim
optimizer = optim.AdamW(model.parameters(), lr = 2e-5) # lr = 0.00002
criterion = torch.nn.CrossEntropyLoss()
num_epochs = 3
for epoch in range(num_epochs):
    optimizer.zero_grad()
    output = model(input_ids, attention_mask = attention_mask, labels = torch.tensor(sentiment_labels))
    loss = output.loss
    loss.backward()
    optimizer.step()
The code above is simple, but I wasn't able to figure out where I am making a mistake.
Thank you @srishti-git1110
Thanks for the code!
I looked into Hugging Face's source code and found that it is indeed a view operation that causes this error. Specifically, it is:
loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
This line is in the forward method, so self (the calling object) refers to the model. If you run print(model.num_labels), it gives 2 rather than 3, which explains the 2 in your error: RuntimeError: shape ‘[-1, 2]’ is invalid for input of size 9. The 9 comes from your replaced classifier head, which outputs 3 logits for each of your 3 samples.
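To fix this, you only need to change one line of code, passing num_labels=3 so that the model builds a 3-class head itself (replacing model.classifier by hand is then no longer needed):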
model = DistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased', num_labels=3)
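To see why swapping the classifier by hand was not enough, here is a minimal sketch that needs no model download. `ToyClassifier` is a hypothetical stand-in written for this illustration, not the Hugging Face class; it only assumes the same pattern of a stored `num_labels` used in the reshape inside `forward`.

```python
import torch
import torch.nn as nn

class ToyClassifier(nn.Module):
    """Hypothetical stand-in that mimics the num_labels / view pattern."""

    def __init__(self, hidden_size=8, num_labels=2):
        super().__init__()
        self.num_labels = num_labels
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, features, labels):
        logits = self.classifier(features)
        loss_fct = nn.CrossEntropyLoss()
        # Same reshaping pattern as the line quoted from the source code.
        return loss_fct(logits.view(-1, self.num_labels), labels.view(-1))

torch.manual_seed(0)
features = torch.randn(3, 8)       # 3 samples, 8 made-up features each
labels = torch.tensor([0, 1, 2])   # 3 classes

# Replacing only the head leaves num_labels at its default of 2.
patched = ToyClassifier()
patched.classifier = nn.Linear(8, 3)
try:
    patched(features, labels)
    patched_error = None
except RuntimeError as err:
    patched_error = str(err)
print(patched_error)

# Setting num_labels at construction keeps the head and the reshape consistent.
fixed = ToyClassifier(num_labels=3)
loss = fixed(features, labels)
print(loss.item())
```

The patched toy model produces 3 x 3 = 9 logits but still reshapes them to 2 columns, which reproduces the same kind of error; the one built with num_labels=3 computes a loss normally.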