I’m receiving the same error as in this post, but the suggested solution (rebooting) isn’t resolving the issue in my case.
To diagnose the issue, I followed the directions here to check whether PyTorch is using the GPU at all, and hit the error again in the session below:
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
4
>>> torch.cuda.device(0)
<torch.cuda.device object at 0x7f49ef3623c8>
>>> torch.cuda.get_device_name(0)
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCTensorRandom.cu line=25 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/philippe/miniconda3/envs/cv-dl/lib/python3.6/site-packages/torch/cuda/__init__.py", line 272, in get_device_name
    return get_device_properties(device).name
  File "/home/philippe/miniconda3/envs/cv-dl/lib/python3.6/site-packages/torch/cuda/__init__.py", line 290, in get_device_properties
    init() # will define _get_device_properties and _CudaDeviceProperties
  File "/home/philippe/miniconda3/envs/cv-dl/lib/python3.6/site-packages/torch/cuda/__init__.py", line 143, in init
    _lazy_init()
  File "/home/philippe/miniconda3/envs/cv-dl/lib/python3.6/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCTensorRandom.cu:25
>>> torch.cuda.current_device()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/philippe/miniconda3/envs/cv-dl/lib/python3.6/site-packages/torch/cuda/__init__.py", line 332, in current_device
    _lazy_init()
  File "/home/philippe/miniconda3/envs/cv-dl/lib/python3.6/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCGeneral.cpp:844
>>>
I’ve tried rebooting, uninstalling and reinstalling CUDA 8 and the corresponding drivers, uninstalling and reinstalling torch, and verifying that my torch version was built for CUDA 8, but nothing seems to resolve the issue.
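For anyone who wants to reproduce the checks above in one go, here is a minimal sketch of a diagnostic script. It is only an illustration: the function name `cuda_report` is made up for this post, and it assumes `nvidia-smi` is the right tool to confirm the driver can see the GPUs independently of PyTorch. It records the CUDA version the torch build targets (`torch.version.cuda`), the `CUDA_VISIBLE_DEVICES` setting, and whatever exception CUDA initialisation raises, without crashing.

```python
import os
import shutil
import subprocess


def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup (hypothetical helper)."""
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}

    # Check whether the NVIDIA driver is reachable at all, independent of PyTorch.
    smi = shutil.which("nvidia-smi")
    if smi is None:
        report["nvidia_smi"] = "not found on PATH"
    else:
        try:
            proc = subprocess.run([smi, "-L"], stdout=subprocess.PIPE,
                                  stderr=subprocess.STDOUT, timeout=30)
            report["nvidia_smi"] = proc.stdout.decode(errors="replace").strip()
        except (OSError, subprocess.SubprocessError) as exc:
            report["nvidia_smi"] = "failed: {}".format(exc)

    try:
        import torch
    except (ImportError, OSError):
        report["torch_installed"] = False
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["built_for_cuda"] = torch.version.cuda  # None for CPU-only builds

    # Force CUDA initialisation, the step that fails in the traceback above,
    # and keep the error text instead of letting it propagate.
    try:
        report["device_count"] = torch.cuda.device_count()
        report["device_0"] = torch.cuda.get_device_name(0)
    except Exception as exc:
        report["cuda_init_error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print("{}: {}".format(key, value))
```

If `nvidia-smi -L` lists the GPUs but `cuda_init_error` is still populated, that would suggest the problem sits between the driver and the CUDA runtime bundled with torch rather than in the hardware, though this sketch cannot confirm the cause on its own.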