I’m running into this error when training “bert-base-uncased”.
The metric I’m using is
metric = evaluate.combine(["accuracy", "f1", "precision", "recall"])
and the training code is
fold_trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets.select(train_index),
    eval_dataset=tokenized_datasets.select(test_index),
    compute_metrics=compute_metrics,
)
fold_trainer.train()
My dataset has 700 texts. The code works fine on Google Colab (though it is slow), so I switched to a server, working from the terminal in a virtual environment with all the required packages installed. However, I keep running into this problem there. How can I solve it? Why would it run on Colab but not on the server?
I have tried setting CUDA_LAUNCH_BLOCKING=1, but it did not print anything (even after 7 hours).
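
One way to narrow down the Colab-versus-server difference might be to print the installed package versions in both environments and compare them. Below is a minimal sketch using only the standard library. The package list is an assumption about what the training script depends on, and `collect_versions` and `diff_versions` are hypothetical helper names, not part of any library.

```python
import json
import platform
from importlib import metadata

# Assumed package list; adjust it to whatever the training script imports.
PACKAGES = ["torch", "transformers", "datasets", "evaluate", "accelerate", "numpy"]


def collect_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(a, b):
    """Return {name: (version_in_a, version_in_b)} for every entry that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Run this once on Colab and once on the server, then compare the two outputs.
    print(json.dumps(collect_versions(), indent=2))
```

Running it in each environment and passing the two resulting dictionaries to `diff_versions` would list only the packages whose versions differ, which may point at a mismatch worth investigating.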