Hi everyone,
I'm new to PyTorch and transformers, and I'm trying to fine-tune BERT for a text classification project. I've loaded BERT and the tokenizer from Hugging Face, prepared my dataset, and written a training loop, but I'm running into a few issues:
- I'm getting CUDA out of memory errors even though my dataset isn't very large. How can I manage memory better during training? (See the second sketch below for the kind of change I have in mind.)
- My model's accuracy isn't improving much. Could someone check my training loop and data preprocessing to see if I'm doing something wrong? (A simplified version is sketched below.)
- I'm not sure about the hyperparameters I'm using (learning rate, batch size, etc.). What settings are recommended for fine-tuning BERT on text classification?
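To make the second and third questions concrete, here is a simplified sketch of my preprocessing and training loop. It's trimmed down, and the variable names and hyperparameter values (learning rate, batch size, epochs, max length) are placeholders rather than my exact settings:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import BertTokenizer, BertForSequenceClassification

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
).to(device)

# Stand-ins for my real dataset
texts = ["an example sentence", "another example"]
labels = [0, 1]

# Tokenize everything up front with padding/truncation to a fixed length
enc = tokenizer(texts, padding="max_length", truncation=True,
                max_length=128, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"],
                        torch.tensor(labels))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):
    for input_ids, attention_mask, batch_labels in loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        batch_labels = batch_labels.to(device)

        optimizer.zero_grad()
        # Passing labels makes the model return the cross-entropy loss
        outputs = model(input_ids=input_ids,
                        attention_mask=attention_mask,
                        labels=batch_labels)
        outputs.loss.backward()
        optimizer.step()
```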
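For the first question, would something along these lines (mixed precision plus gradient accumulation, continuing from the loop above) be the usual way to cut GPU memory, or am I on the wrong track?

```python
from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()
accumulation_steps = 4  # effective batch = batch_size * accumulation_steps

model.train()
for epoch in range(3):
    optimizer.zero_grad()
    for step, (input_ids, attention_mask, batch_labels) in enumerate(loader):
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        batch_labels = batch_labels.to(device)

        # Forward pass in mixed precision to reduce activation memory
        with autocast():
            outputs = model(input_ids=input_ids,
                            attention_mask=attention_mask,
                            labels=batch_labels)
            loss = outputs.loss / accumulation_steps

        scaler.scale(loss).backward()

        # Only step the optimizer every few mini-batches
        if (step + 1) % accumulation_steps == 0:
            scaler.step(optimizer)
            scaler.update()
            optimizer.zero_grad()
```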
I also checked this thread: https://discuss.pytorch.org/t/bert-finetuning-for-binary-classification-with-special-tokens-evaluates-badlooker but I haven't found a solution there.
Any advice or tips would be greatly appreciated. Thanks!