[resolved] Cuda Runtime Error(30)

When I run torch.cuda.is_available(), I get the following error:

THCudaCheck FAIL file=torch/csrc/cuda/Module.cpp line=109 error=30 : unknown error
Traceback (most recent call last):
  File "trainer.py", line 13, in <module>
    if torch.cuda.is_available():
  File "/usr/local/lib/python2.7/dist-packages/torch/cuda/__init__.py", line 30, in is_available
    return torch._C._cuda_getDeviceCount() > 0
RuntimeError: cuda runtime error (30) : unknown error at torch/csrc/cuda/Module.cpp:109

There must be something wrong with your driver. Maybe try rebooting?


OK, I have found the problem. After I updated the Linux system, the driver stopped working, so I will reinstall it. Thank you for your reply.

I get this when I put my laptop to sleep in the middle of training. My script stops and I get this error:

THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generated/../THCReduceAll.cuh line=334 error=4 : unspecified launch failure
Traceback (most recent call last):
  File "trytry.py", line 111, in <module>
    loss = network.loss(prediction, label_batch) + 10*torch.mean(cheat_amount)
  File "trytry.py", line 73, in loss
    union = 1e-5 + prediction.sum() + label.sum()
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 437, in sum
    return Sum(dim)(self)
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/reduce.py", line 16, in forward
    return input.new((fn(),))
RuntimeError: cuda runtime error (4) : unspecified launch failure at /b/wheel/pytorch-src/torch/lib/THC/generated/../THCReduceAll.cuh:334

And afterward I get this:

THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/THCGeneral.c line=66 error=30 : unknown error
Traceback (most recent call last):
  File "trytry.py", line 77, in <module>
    network = Net()
  File "trytry.py", line 57, in __init__
    self.squeezenet = models.squeezenet1_1(pretrained=True).features.cuda() 
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 147, in cuda
    return self._apply(lambda t: t.cuda(device_id))
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 118, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 124, in _apply
    param.data = fn(param.data)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 147, in <lambda>
    return self._apply(lambda t: t.cuda(device_id))
  File "/usr/local/lib/python3.5/dist-packages/torch/_utils.py", line 65, in _cuda
    return new_type(self.size()).copy_(self, async)
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 272, in __new__
    _lazy_init()
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 85, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /b/wheel/pytorch-src/torch/lib/THC/THCGeneral.c:66

A reboot fixed the problem. This is with CUDA 8.0 and an NVIDIA 1060.


I also have the same issue after the laptop wakes up. I think this is a bug. TensorFlow seems to work fine in such situations.


I just had the same failure after a sleep/wake cycle on a desktop. PyTorch 0.2, Ubuntu 16.04.


Same here! GTX 1050 Ti, Ubuntu 16.04. A reboot fixes it, but one short sleep and wake breaks it!

Same here. Ubuntu 16.04, CUDA 9, PyTorch 0.2.

Has anyone solved this problem? Thanks.

This also happens on my system

  • Ubuntu 16.04
  • Nvidia GeForce 940MX
  • PyTorch 0.3.1 running on Python 3.6
  • Cuda 8.0
  • CUDNN 7

Any clues? I don’t see why this thread is marked as resolved, if the solution is to restart the laptop every time.


Same problem for me too

Ubuntu 16.04 running on Dell desktop
Nvidia GeForce 1050ti
PyTorch 0.3.1.post2 running on Python 3.6
Cuda 9.1
CUDNN 7.1

I ran into the same problem:

Ubuntu 16.04
Titan Xp
CUDA 9.1
PyTorch 0.3.1 running on Python 2.7

A reboot fixes the problem.


Same problem, plus some strange behavior after wake-up (desktop, Ubuntu 16.04, CUDA 8, GTX 1080).

Maybe running with sudo can solve this problem.
I reinstalled my driver and CUDA after my Linux system was updated, and the same problem happened.

Same problem here. After the laptop goes to sleep and wakes up, I get this error when calling torch.cuda.current_device():

RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/THCGeneral.cpp:70

Ubuntu 18.04, PyTorch 0.4.1, CUDA 9.2
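
For anyone who wants to check the driver state before starting a long training run, here is a minimal sketch that tries to initialise CUDA in a separate Python process and reports whether it worked. The helper name cuda_probe and the 60-second timeout are my own choices, not anything from PyTorch, and it assumes torch is importable by the same interpreter that runs the script.

```python
import subprocess
import sys

# One-liner executed in a fresh interpreter: initialise CUDA, print the GPU count.
PROBE = "import torch; torch.cuda.init(); print(torch.cuda.device_count())"


def cuda_probe(timeout=60):
    """Hypothetical helper: return (ok, detail) for a CUDA init attempt in a child process."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", PROBE],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "probe timed out"
    if proc.returncode != 0:
        # The last stderr line is normally the exception message,
        # e.g. "RuntimeError: cuda runtime error (30) : unknown error ...".
        lines = proc.stderr.strip().splitlines()
        return False, lines[-1] if lines else "unknown failure"
    return True, proc.stdout.strip()


ok, detail = cuda_probe()
print("CUDA usable:", ok, "|", detail)
```

Because the attempt happens in a child process, the calling script never touches CUDA itself, so it can be rerun after a reboot or a driver fix to see whether the error has gone away.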


I have an hp-1000 laptop without a GPU. How can I handle this error?

THCudaCheck FAIL file=torch/csrc/cuda/Module.cpp line=32 error=30 : unknown error
Traceback (most recent call last):
  File "train_nli.py", line 62, in <module>
    torch.cuda.set_device(params.gpu_id)
  File "/home/farhatullah/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 262, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (30) : unknown error at torch/csrc/cuda/Module.cpp:32

I am training the InferSent sentence embedding model. Is there a CPU version of this code: https://github.com/facebookresearch/InferSent/blob/master/train_nli.py ?
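
On a machine without a GPU, the traceback above comes from calling torch.cuda.set_device unconditionally. One possible workaround, as a rough sketch only, is to guard that call and fall back to the CPU. The helper pick_device below is hypothetical and is not part of InferSent or PyTorch; the rest of the script would still need its .cuda() calls made conditional in the same way.

```python
try:
    import torch  # optional, so the sketch also runs where PyTorch is not installed
except ImportError:
    torch = None


def pick_device(gpu_id=0):
    """Hypothetical helper: return "cuda:<gpu_id>" if CUDA initialises cleanly, else "cpu"."""
    if torch is None:
        return "cpu"
    try:
        if torch.cuda.is_available() and gpu_id < torch.cuda.device_count():
            torch.cuda.set_device(gpu_id)
            return "cuda:%d" % gpu_id
    except RuntimeError as exc:
        # Driver failures such as error 30 surface as RuntimeError.
        print("CUDA unavailable, falling back to CPU:", exc)
    return "cpu"


device = pick_device(gpu_id=0)
print("Using device:", device)
```

With PyTorch 0.4 or later, the returned string can be passed to model.to(device) and tensor.to(device). On older versions, the equivalent is to skip the .cuda() calls whenever the result is "cpu".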

The solution can be found here. Basically, run the following commands in the terminal:

sudo rmmod nvidia_uvm
sudo rmmod nvidia
sudo modprobe nvidia
sudo modprobe nvidia_uvm
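
If you want to script this, here is a small Python sketch that wraps the same four commands. The function name reload_nvidia_modules and the dry_run flag are my own, added for illustration. It only prints the commands by default; with dry_run=False it really runs them, which needs sudo rights. Note that rmmod refuses to unload a module that is still in use, so anything using the GPU has to be stopped first.

```python
import subprocess

# Same order as the commands above: unload nvidia_uvm, then nvidia,
# then load them back in the reverse order.
RELOAD_COMMANDS = [
    ["sudo", "rmmod", "nvidia_uvm"],
    ["sudo", "rmmod", "nvidia"],
    ["sudo", "modprobe", "nvidia"],
    ["sudo", "modprobe", "nvidia_uvm"],
]


def reload_nvidia_modules(dry_run=True):
    """Hypothetical helper: print each command and run it only when dry_run is False."""
    for cmd in RELOAD_COMMANDS:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError at the first failing command.
            subprocess.run(cmd, check=True)
    return RELOAD_COMMANDS


reload_nvidia_modules(dry_run=True)
```

Stopping at the first failure is deliberate: if an rmmod fails, the later commands are skipped so the error is not hidden.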

Go to NVIDIA Nsight Options and set ‘Enable Crash Detection And Handling = True’.

Did the trick for me.

This always works for me (Win10, CUDA 10.1, Python 3.7.2, PyTorch 1.0.1, NVIDIA GTX 1050 Ti):

import torch
torch.cuda.current_device()

but this always fails for me:

import torch
torch.cuda.is_available()
torch.cuda.current_device()  # fails here

@Mohamed_Ghadban, how can I access the NVIDIA Nsight Options? Thanks in advance.