RuntimeError: cuda runtime error (4) : unspecified launch failure

Neda · April 30, 2019, 2:13pm

I hit this error today when moving the model to the GPU.


  File "C:\Users\Neda\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 377, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)

RuntimeError: cuda runtime error (4) : unspecified launch failure at c:\programdata\miniconda3\conda-bld\pytorch_1533100020551\work\aten\src\thc\generic/THCTensorCopy.cpp:20

PyTorch version: 0.4.1
I installed it using Anaconda on Windows.
My CUDA version is 9.2.148.
The GPU is a Quadro M5000, and print(torch.cuda.get_device_capability(0)) prints (5, 2).
I also tried another GPU, a GeForce GTX 1080 Ti (6, 1), but it gives a different error: RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED.

I ran a lot of experiments on the same GPU before and they were fine.
I also restarted my computer, since I read in this forum that the GPU doesn’t work too well after the computer has been asleep, but that didn’t resolve the issue.
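
One thing that may help narrow this down: CUDA kernels run asynchronously, so the Python line in the traceback is not necessarily the operation that actually failed. A minimal sketch, assuming the standard CUDA_LAUNCH_BLOCKING environment variable is honoured by your install; the helper name describe_gpu is made up for illustration:

```python
import os

# Assumption: this must be set before CUDA is initialised, so set it before
# importing torch (or set it in the shell before starting Python).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def describe_gpu():
    """Return a short description of GPU 0 (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "CUDA is not available"
    name = torch.cuda.get_device_name(0)
    capability = torch.cuda.get_device_capability(0)
    return f"{name}, compute capability {capability}"


print(describe_gpu())
```

With blocking launches the run is slower, but the traceback should point at the call that triggered the failure rather than at a later copy or synchronisation.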

ArturoDeza · July 2, 2019, 3:45pm

I just encountered the same issue today. It happened randomly while I was running some of my experiments. Maybe the batch size you are using is too big? This happened to me with a batch size of 128 and images that were 256x256 in size. I still haven’t fixed it, though!
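
To put a rough number on the batch-size idea, here is a back-of-the-envelope sketch of how much memory the input batch alone takes. It assumes 3-channel float32 images, which is not stated above, and the function name batch_bytes is made up for illustration:

```python
def batch_bytes(batch_size, height, width, channels=3, bytes_per_element=4):
    """Size in bytes of one dense input batch.

    Assumptions: 3 channels and 4 bytes per element (float32).
    """
    return batch_size * channels * height * width * bytes_per_element


size = batch_bytes(128, 256, 256)
print(size / 2**20, "MiB")  # 96.0 MiB
```

This only counts the input tensor. Activations, gradients and optimizer state are not included and usually take much more than the inputs, so the estimate is a lower bound rather than a prediction of whether the batch fits.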

ntelo007 · December 15, 2019, 11:15am

I am having the same error:

Traceback (most recent call last):
  File "train_mtl.py", line 472, in <module>
    main()
  File "train_mtl.py", line 463, in main
    train(epoch)
  File "train_mtl.py", line 206, in train
    inputsBGR = Variable(inputsBGR.float().cuda())
RuntimeError: cuda runtime error (4) : unspecified launch failure at c:\new-builder_3\win-wheel\pytorch\aten\src\thc\generic/THCTensorCopy.cpp:20

I am running PyTorch 0.4.1 on Windows 10 with an NVIDIA Quadro P3200 with Max-Q Design.

The algorithm that I am trying to train is retrieved from the following repository:

My batch contains about 2000 images.
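
If "batch" here means that roughly 2000 images go to the GPU in one go, one thing to try is feeding them in smaller chunks. A minimal sketch in plain Python; the helper name iter_chunks and the chunk size of 64 are arbitrary choices for illustration, not values from the repository:

```python
def iter_chunks(items, chunk_size):
    """Yield consecutive slices of at most chunk_size elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# Stand-in for about 2000 images; real code would slice a tensor or dataset.
images = list(range(2000))
chunks = list(iter_chunks(images, 64))
print(len(chunks), len(chunks[-1]))  # 32 16
```

The same slicing works along the first dimension of a tensor, so each chunk can be moved to the GPU separately instead of the whole set at once.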