[resolved] CUDA Runtime Error (30)

(Ycszen) #1

When I run torch.cuda.is_available(), I get the error below:

THCudaCheck FAIL file=torch/csrc/cuda/Module.cpp line=109 error=30 : unknown error
Traceback (most recent call last):
  File "trainer.py", line 13, in <module>
    if torch.cuda.is_available():
  File "/usr/local/lib/python2.7/dist-packages/torch/cuda/__init__.py", line 30, in is_available
    return torch._C._cuda_getDeviceCount() > 0
RuntimeError: cuda runtime error (30) : unknown error at torch/csrc/cuda/Module.cpp:109
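
As the traceback shows, torch.cuda.is_available() can itself raise a RuntimeError when the driver is in a bad state, instead of returning False. A minimal sketch of a guarded check follows. The helper name `cuda_usable` and its `probe` parameter are hypothetical and exist only for illustration; this does not repair the driver, it only lets a script fall back to the CPU.

```python
def cuda_usable(probe=None):
    """Return True only if the CUDA probe succeeds and reports a device.

    `probe` is a hypothetical hook for illustration; when omitted, the
    function uses torch.cuda.is_available (if torch can be imported).
    """
    if probe is None:
        try:
            import torch
        except ImportError:
            return False
        probe = torch.cuda.is_available
    try:
        return bool(probe())
    except RuntimeError as exc:
        # For example: "cuda runtime error (30) : unknown error at ..."
        print("CUDA probe failed, falling back to CPU:", exc)
        return False
```

A script could then choose its device with something like `use_gpu = cuda_usable()` and only call `.cuda()` when the result is True.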

(Adam Paszke) #2

There must be something wrong with your driver. Maybe try rebooting?

(Ycszen) #3

OK, I found the problem. After I updated my Linux system, the driver stopped working, so I will reinstall it. Thank you for your reply.

(Chris Anderson) #4

I get this when I put my laptop to sleep in the middle of training. My script stops and I get this error:

THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generated/../THCReduceAll.cuh line=334 error=4 : unspecified launch failure
Traceback (most recent call last):
  File "trytry.py", line 111, in <module>
    loss = network.loss(prediction, label_batch) + 10*torch.mean(cheat_amount)
  File "trytry.py", line 73, in loss
    union = 1e-5 + prediction.sum() + label.sum()
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 437, in sum
    return Sum(dim)(self)
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/reduce.py", line 16, in forward
    return input.new((fn(),))
RuntimeError: cuda runtime error (4) : unspecified launch failure at /b/wheel/pytorch-src/torch/lib/THC/generated/../THCReduceAll.cuh:334

And afterward I get this:

THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/THCGeneral.c line=66 error=30 : unknown error
Traceback (most recent call last):
  File "trytry.py", line 77, in <module>
    network = Net()
  File "trytry.py", line 57, in __init__
    self.squeezenet = models.squeezenet1_1(pretrained=True).features.cuda() 
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 147, in cuda
    return self._apply(lambda t: t.cuda(device_id))
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 118, in _apply
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 124, in _apply
    param.data = fn(param.data)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 147, in <lambda>
    return self._apply(lambda t: t.cuda(device_id))
  File "/usr/local/lib/python3.5/dist-packages/torch/_utils.py", line 65, in _cuda
    return new_type(self.size()).copy_(self, async)
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 272, in __new__
  File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 85, in _lazy_init
RuntimeError: cuda runtime error (30) : unknown error at /b/wheel/pytorch-src/torch/lib/THC/THCGeneral.c:66

A reboot fixed the problem. This is with CUDA 8.0 and an NVIDIA 1060.
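
The two tracebacks above carry different runtime error codes: 4 (unspecified launch failure) when the running job dies, then 30 (unknown error) on every later attempt to initialize CUDA. A minimal sketch of pulling the code out of the exception text, so a script can log or branch on it, might look like this. It assumes the message format seen in the logs in this thread; other PyTorch versions may format the message differently, and the function name `parse_cuda_runtime_error` is hypothetical.

```python
import re

# Pattern assumed from the messages quoted in this thread, e.g.
# "cuda runtime error (30) : unknown error at /path/to/file.c:66"
_CUDA_ERROR_RE = re.compile(r"cuda runtime error \((\d+)\)\s*:\s*(.*?) at ")


def parse_cuda_runtime_error(message):
    """Return (code, description) from a CUDA runtime error message, or None."""
    match = _CUDA_ERROR_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()
```

In this thread, a code of 30 after a suspend/resume cycle went away only after a reboot, so a script that sees it at startup has little to gain from retrying.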

(Kaiyin Zhong) #5

I have the same issue after my laptop wakes up. I think this is a bug. TensorFlow seems to work fine in such situations.

(Alex Rogozhnikov) #6

Just had the same failure after a sleep/wake cycle on a desktop. PyTorch 0.2, Ubuntu 16.04.

(Hannes Vasyura Bathke) #7

Same here! GTX 1050 Ti, Ubuntu 16.04. A reboot fixes it, but one short sleep and wake breaks it again!

(Psavine42) #8

Same here. Ubuntu 16.04, CUDA 9, PyTorch 0.2.

(Ke Bai) #9

Has anyone solved this problem? Thanks.

(Dana Kianfar) #10

This also happens on my system:

  • Ubuntu 16.04
  • NVIDIA GeForce 940MX
  • PyTorch 0.3.1 running on Python 3.6
  • CUDA 8.0
  • cuDNN 7

Any clues? I don’t see why this thread is marked as resolved, if the solution is to restart the laptop every time.

(Ste Millington) #11

Same problem for me too:

Ubuntu 16.04 running on a Dell desktop
NVIDIA GeForce 1050 Ti
PyTorch 0.3.1.post2 running on Python 3.6
CUDA 9.1

(Adam Harrison) #12

Ran into the same problem:

Ubuntu 16.04
CUDA 9.1
PyTorch 0.3.1 running on Python 2.7

(Jimmy Xiaoke Shen) #13

A reboot fixes the problem.

(Закиров Марат) #14

Same problem: strange behavior after wake-up (desktop, Ubuntu 16.04, CUDA 8, GTX 1080).