I am vectorizing a set of documents using a torch.nn module that is part of the huggingface transformers library and getting an error that asks me to set CUDA_LAUNCH_BLOCKING and TORCH_USE_CUDA_DSA in order to get a coherent stack trace.
I cannot find documentation on where and when exactly these variables must be set in order to enable debugging. Can CUDA_LAUNCH_BLOCKING be set after torch installation but before import?
What value must TORCH_USE_CUDA_DSA be set to and can it also be done post-installation?
Thank you in advance for your help.
These are environment variables, which should be set before you launch your Python script. You could also set them inside your code via the
os package, but I would strongly recommend setting them in your terminal via
CUDA_LAUNCH_BLOCKING=1 python script.py args.
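If you do want to set it from inside the script, a minimal sketch would be the following. The key assumption is that the variable must be set before torch is imported (or at least before any CUDA work happens), since the CUDA runtime reads it when the context is initialized:

```python
import os

# Must run before importing torch / before any CUDA kernel is launched,
# because the CUDA runtime reads this variable at context initialization.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # safe to import torch only after the variable is set
```

Setting it in the terminal is still the safer option, since another module could import torch before this line runs.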
Thanks for your reply. Am I understanding correctly that TORCH_USE_CUDA_DSA can be set after the CUDA and torch installations, i.e. just before the Python process starts?
Also, should it be set to 1 or true?
TORCH_USE_CUDA_DSA is an env variable used during the build and is irrelevant at runtime. To enable it you would need to build PyTorch from source with this env variable set.
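As a rough sketch, enabling it during a source build might look like this (assuming a local clone of the pytorch repository with the build dependencies already installed; the exact build steps are described in the PyTorch source README):

```shell
# Assumption: pytorch repo cloned and build prerequisites installed.
cd pytorch
# Set the build-time flag for this build invocation only.
TORCH_USE_CUDA_DSA=1 python setup.py develop
```

A pip-installed wheel was built without this flag, so setting the variable at runtime against such a wheel has no effect.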
Okay, makes sense. Thank you.
Last question: should the env var be set to 1 or some other value?
Both env variables should be set to 1 to enable them.
From my previous post:
CUDA_LAUNCH_BLOCKING=1 python script.py args