RuntimeError: cuda runtime error (4) : unspecified launch failure

I ran into this error today when moving the model to the GPU.

  File "C:\Users\Neda\Anaconda3\lib\site-packages\torch\nn\modules\", line 377, in convert
    return, dtype if t.is_floating_point() else None, non_blocking)

RuntimeError: cuda runtime error (4) : unspecified launch failure at c:\programdata\miniconda3\conda-bld\pytorch_1533100020551\work\aten\src\thc\generic/THCTensorCopy.cpp:20

PyTorch version: 0.4.1
I installed it using Anaconda on Windows.
My CUDA version is 9.2.148.
The GPU is a Quadro M5000, and print(torch.cuda.get_device_capability(0)) returns (5, 2).
I also tried another GPU, a GeForce GTX 1080 Ti (6, 1), but it fails with a different error: RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED.

I ran a lot of experiments on the same GPU before and they were fine.
I also restarted my computer, since I read on this forum that the GPU doesn’t work too well after the computer has been asleep, but that didn’t resolve the issue.
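
CUDA errors are reported asynchronously, so the line shown in the traceback is not always the call that actually failed. A minimal sketch for narrowing this down, assuming a standard PyTorch install: setting CUDA_LAUNCH_BLOCKING=1 before CUDA is initialised makes kernel launches synchronous, so the error should surface at the faulting call. The helper describe_gpu is a hypothetical name (not part of PyTorch) that gathers the version and device details worth posting with the error.

```python
import os

# Must be set before CUDA is initialised, so do it before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def describe_gpu():
    """Return PyTorch/CUDA version and device details as a dict.

    Hypothetical helper for illustration; returns {"torch": None}
    when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return {"torch": None}

    info = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # CUDA version PyTorch was built with
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device"] = torch.cuda.get_device_name(0)
        info["capability"] = torch.cuda.get_device_capability(0)
    return info


print(describe_gpu())
```

Running the failing script with this variable set is slower, but the traceback should then point at the operation that triggered the launch failure.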


I just encountered the same issue as you today. It happened randomly while I was running some of my experiments. Maybe the batch size you are using is too big? This happened to me when I was using a batch size of 128 with 256x256 images. I still haven’t fixed the issue though!
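
To put a number on the batch-size guess, here is a rough estimate of how much memory the input batch alone takes. It assumes 3-channel float32 images, which the post does not state, and batch_bytes is a hypothetical helper. Activations and gradients inside the model usually need far more than the input, so treat this as a lower bound only.

```python
def batch_bytes(batch_size, height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold one dense batch of images.

    channels=3 and bytes_per_value=4 (float32) are assumptions.
    """
    return batch_size * channels * height * width * bytes_per_value


size = batch_bytes(128, 256, 256)
print(f"{size} bytes = {size / 2**20:.0f} MiB")
```

Under those assumptions the input batch is about 96 MiB, so if memory is the problem it is more likely the intermediate activations than the images themselves.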

I am having the same error:

Traceback (most recent call last):
  File "", line 472, in <module>
  File "", line 463, in main
  File "", line 206, in train
    inputsBGR = Variable(inputsBGR.float().cuda())
RuntimeError: cuda runtime error (4) : unspecified launch failure at c:\new-builder_3\win-wheel\pytorch\aten\src\thc\generic/THCTensorCopy.cpp:20

I am running PyTorch 0.4.1 on Windows 10 with an NVIDIA Quadro P3200 with Max-Q Design.

The algorithm that I am trying to train is retrieved from the following repository:

My batch contains about 2000 images.
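
If all 2000 images are moved to the GPU in a single call, one thing worth trying is transferring and processing them in smaller slices. A rough sketch, assuming the data supports len() and slicing (Python lists and PyTorch tensors both do). The name iter_chunks and the chunk size of 64 are hypothetical choices for illustration, not taken from the repository.

```python
def iter_chunks(data, chunk_size):
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


# Stand-in for a batch of 2000 images. With real tensors, each chunk
# would be moved inside the loop, e.g. chunk.float().cuda().
batch = list(range(2000))
sizes = [len(chunk) for chunk in iter_chunks(batch, 64)]
print(len(sizes), sizes[-1])
```

This does not explain the launch failure by itself, but if the error disappears with smaller chunks it points toward memory pressure rather than a driver or hardware fault.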