After running CUDA_LAUNCH_BLOCKING=1 python HAN.py, I get:
/home/quoniammm/anaconda3/lib/python3.6/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
THCudaCheck FAIL file=/pytorch/torch/lib/THC/THCGeneral.c line=70 error=30 : unknown error
Traceback (most recent call last):
File "HAN.py", line 264, in <module>
word_attn.cuda()
File "/home/quoniammm/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 147, in cuda
return self._apply(lambda t: t.cuda(device_id))
File "/home/quoniammm/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 118, in _apply
module._apply(fn)
File "/home/quoniammm/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 124, in _apply
param.data = fn(param.data)
File "/home/quoniammm/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 147, in <lambda>
return self._apply(lambda t: t.cuda(device_id))
File "/home/quoniammm/anaconda3/lib/python3.6/site-packages/torch/_utils.py", line 66, in _cuda
return new_type(self.size()).copy_(self, async)
File "/home/quoniammm/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 266, in _lazy_new
_lazy_init()
File "/home/quoniammm/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 85, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /pytorch/torch/lib/THC/THCGeneral.c:70
The result is the same as in the notebook. What is CUDA_LAUNCH_BLOCKING=1 actually for?
I am still confused about this CUDA runtime error. How can I debug it?
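
One thing I can try is to take HAN.py out of the picture. The traceback ends in torch._C._cuda_init(), so the failure seems to happen while CUDA is being initialized, before any kernel is launched. That may be why CUDA_LAUNCH_BLOCKING=1 makes no difference to the output. Below is a minimal sketch of a standalone check. The helper name probe_cuda is my own and hypothetical; it only uses torch.cuda.is_available(), torch.cuda.device_count(), and a tiny .cuda() call to trigger the same lazy initialization.

```python
import os


def probe_cuda():
    """Return a small report on whether CUDA can be initialized at all.

    probe_cuda is a hypothetical helper for this post, not a PyTorch API.
    """
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}

    try:
        import torch
    except Exception as exc:  # torch missing or its native libraries are broken
        report["torch"] = "not importable: {}".format(exc)
        return report
    report["torch"] = torch.__version__

    try:
        report["is_available"] = torch.cuda.is_available()
        report["device_count"] = torch.cuda.device_count()
        if report["is_available"]:
            # Moving a one-element tensor to the GPU forces the same lazy
            # CUDA initialization that word_attn.cuda() goes through.
            torch.zeros(1).cuda()
            report["init"] = "ok"
        else:
            report["init"] = "skipped (CUDA not reported as available)"
    except Exception as exc:
        report["init"] = "failed: {}".format(exc)
    return report


if __name__ == "__main__":
    for key, value in probe_cuda().items():
        print("{}: {}".format(key, value))
```

If this small script also fails at the init step, the problem is probably in the driver or CUDA setup on the machine and not in the model code.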