Help Needed with Fine-Tuning BERT for Text Classification Using PyTorch

Hi everyone, :smiling_face_with_three_hearts:

I am new to PyTorch and transformers, and I’m trying to fine-tune BERT for a text classification project. I’ve loaded BERT and the tokenizer from Hugging Face, prepared my dataset, and written a training loop (a simplified sketch is below the list), but I’m running into a few issues:

  • I’m getting CUDA out of memory errors even though my dataset isn’t very large. How can I manage memory better during training?

  • My model’s accuracy isn’t improving much. Could someone check my training loop and data preprocessing to see if I’m doing something wrong?

  • I’m not sure about the hyperparameters I’m using (learning rate, batch size, etc.). What settings are recommended for fine-tuning BERT on text classification?
    I also checked this thread: https://discuss.pytorch.org/t/bert-finetuning-for-binary-classification-with-special-tokens-evaluates-badlooker but I haven’t found a solution there.

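Here is a simplified sketch of what my setup looks like. The dataset, model checkpoint, and hyperparameters below are just placeholders for what I’m currently using, so please treat them as assumptions rather than my exact code:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load a pretrained BERT classifier and its tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.to(device)

# Placeholder data -- my real dataset is a few thousand labeled texts
texts = ["example sentence one", "example sentence two"]
labels = [0, 1]

# Tokenize everything up front, padding/truncating to 512 tokens
enc = tokenizer(texts, padding="max_length", truncation=True, max_length=512, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))
loader = DataLoader(dataset, batch_size=32, shuffle=True)  # batch_size=32 is what I'm using now

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # lr I picked, not sure it's right

model.train()
for epoch in range(3):
    for input_ids, attention_mask, batch_labels in loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        batch_labels = batch_labels.to(device)

        optimizer.zero_grad()
        # The model computes the classification loss when labels are passed in
        outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=batch_labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch} loss {loss.item():.4f}")
```
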
Any advice or tips would be greatly appreciated.

Thanks! :innocent:

You only posted a simplified sketch rather than your actual preprocessing and training code, so it’s hard to say where things are going wrong.
