I’m running into this error when training “bert-base-uncased”.
The metric I’m using is
metric = evaluate.combine(["accuracy", "f1", "precision", "recall"])
and the training code is
fold_trainer = Trainer(
My dataset has 700 texts. The code works fine on Google Colab (though it is time-consuming), so I switched to a server, running from the terminal in a virtual environment with all the needed packages installed. However, I keep running into this problem. How can I solve this? Why would it run on Colab but not on the server?
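One thing that may help narrow down the Colab-versus-server difference is comparing installed package versions in both environments. Below is a minimal sketch, not a definitive diagnosis: the helper name `collect_versions` and the list of packages are my own assumptions, and it only uses the standard library's `importlib.metadata`. Running it in both places and comparing the output would show any version mismatch.

```python
from importlib import metadata
import platform


def collect_versions(packages):
    """Return a dict mapping each package name to its installed version.

    `collect_versions` is a hypothetical helper written for this post.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is missing from this environment
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Assumed list of packages relevant to a Trainer-based setup
    relevant = ["torch", "transformers", "evaluate", "datasets", "accelerate"]
    for name, version in collect_versions(relevant).items():
        print(f"{name}: {version}")
```

If the versions differ between the two environments, pinning the server to the Colab versions would be one way to test whether the mismatch matters.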
I have tried CUDA_LAUNCH_BLOCKING=1, but it did not print out anything (even after 7 hours).
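For reference, this is roughly how the variable can be set from inside the script rather than on the command line. It is only a sketch under the assumption that the variable has to be in the process environment before CUDA is initialized, so it is placed before any `import torch`. The helper `launch_blocking_enabled` is a hypothetical name used just to confirm the setting.

```python
import os

# Set this before CUDA is initialized, i.e. before the first `import torch`,
# so that kernel launches run synchronously and errors point at the failing call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def launch_blocking_enabled():
    """Hypothetical helper: report whether the variable is set in this process."""
    return os.environ.get("CUDA_LAUNCH_BLOCKING") == "1"


print("CUDA_LAUNCH_BLOCKING enabled:", launch_blocking_enabled())

# The rest of the training script (imports of torch, transformers, etc.)
# would follow from here.
```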