I have an AutoModelForSequenceClassification model: a pre-trained BERT model that I fine-tuned on 44 classes. I now want to add 20 more classes and train it again, with all layers frozen except the classification layer. However, when I replace the classification layer to accommodate the new number of classes, I always get the error shown after the code.
I am new to data science, so I might be using some terms incorrectly. Here is the code I am currently using:
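import torch
from torch import nn
from transformers import AutoModelForSequenceClassification

# Load BERT with a 44-class classification head
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=44)
# Replace the head with a larger one for the extended label set
model.classifier = nn.Linear(model.bert.config.hidden_size, 66)
with torch.no_grad():
    outputs = model(input_ids=b["input_ids"].to(device),
                    attention_mask=b["attention_mask"].to(device),
                    labels=b["labels"].to(device))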
RuntimeError: shape '[-1, 43]' is invalid for input of size 660
How can I fix this issue?
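One possible cause, offered as a sketch rather than a confirmed diagnosis: in the Hugging Face implementation, `BertForSequenceClassification` records the label count as `model.num_labels` when it is constructed, and uses that stored value to reshape the logits whenever `labels` are passed. Replacing only `model.classifier` leaves the old count in place, so the reshape no longer matches the new head's output. The example below assumes that is what is happening. It uses a tiny, randomly initialised `BertConfig` (hypothetical sizes, no download) as a stand-in for the fine-tuned checkpoint, takes the value 66 from the code above, and copies the old head's weights into the first 44 rows of the new head, which is one option and not a requirement.

```python
import torch
from torch import nn
from transformers import BertConfig, BertForSequenceClassification

OLD_NUM_LABELS = 44
NEW_NUM_LABELS = 66  # value taken from the code in the post

# Tiny random model standing in for the fine-tuned checkpoint (sizes are illustrative).
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64,
                    num_labels=OLD_NUM_LABELS)
model = BertForSequenceClassification(config)

# Freeze the encoder so that only the classification head is trained.
for param in model.bert.parameters():
    param.requires_grad = False

# Build the larger head and reuse the weights already learned for the old classes.
old_head = model.classifier
new_head = nn.Linear(config.hidden_size, NEW_NUM_LABELS)
with torch.no_grad():
    new_head.weight[:OLD_NUM_LABELS] = old_head.weight
    new_head.bias[:OLD_NUM_LABELS] = old_head.bias
model.classifier = new_head

# Keep the stored label count in sync with the new head; the loss
# computation reshapes the logits with model.num_labels.
model.num_labels = NEW_NUM_LABELS
model.config.num_labels = NEW_NUM_LABELS

# Dummy batch: 10 sequences of 8 token ids, with labels in the new range.
input_ids = torch.randint(0, config.vocab_size, (10, 8))
labels = torch.randint(0, NEW_NUM_LABELS, (10,))
with torch.no_grad():
    outputs = model(input_ids=input_ids, labels=labels)
```

With the label count updated, the forward pass returns logits with one column per new class and a scalar loss. If `model.num_labels` is left at the old value, the same call fails with a shape error like the one in the post.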