An error when using GPU

Thanks! But I want to know how to solve this problem with PyTorch 1.0.0, CUDA 9.0, and an RTX 2080. Must I change to CUDA 10.0?

I don’t have RTX 2080 cards, but chances are that the driver shipped with CUDA 9.0 is not fully compatible with the RTX 2080. I installed CUDA 10.1 at first. After that, I downgraded CUDA to 10.0 without changing the driver. Hope this helps.


Hi,

I see the same issue with PyTorch 1.0.1.post2, CUDA 10.0, and an RTX 2080 Ti. I can run on other GPUs (tried a Titan V and a 1080 Ti), but if I run on the 2080 Ti with benchmark=True, I get this error message:

THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
Traceback (most recent call last):
  File "", line 330, in <module>
    train(epoch)
  File "", line 173, in train
    stereo_out, theta, right_transformed = model(left,right)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "", line 139, in forward
    right_img_transformed, theta = self.stn(right_img)
  File "", line 127, in stn
    x,theta1 = stn(x, self.theta(x), mode=self.stn_mode)
  File "", line 131, in theta
    xs = self.localization(x)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:405

Is there already a solution for that?

Thanks, Yotam

I’ve got the same hardware (an RTX 2080 Ti), and this fixed it for me. I had to update PyTorch to use CUDA 10.

Thanks a lot. It works. But why does it work?

Thanks very much, this works for me! Phew!

I’ve got the same error, even after updating CUDA to 10.
I happened to find a way to remove it. Now my training code works.

  • CUDA: 10.0
  • Python: 3.7
  • PyTorch: 1.0
  • cuDNN: 7
  • GPU: 2080 Ti

However, another problem came along: when running with torch.no_grad(), the outputs are all NaNs.
Does anyone know about this?
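For debugging the NaN issue above, here is a minimal sketch (the model and input are hypothetical placeholders, not the poster's code) that checks whether outputs produced under torch.no_grad() contain NaNs:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real model and batch; swap in your own.
model = nn.Linear(4, 2)
if torch.cuda.is_available():
    model = model.cuda()
x = torch.randn(8, 4, device=next(model.parameters()).device)

model.eval()
with torch.no_grad():  # disables autograd bookkeeping for inference
    out = model(x)

# torch.isnan flags NaNs element-wise; .any() collapses them to one bool.
print("contains NaNs:", torch.isnan(out).any().item())
```

If this prints True on the GPU but False on the CPU for the same weights and inputs, that points at the CUDA/cuDNN setup rather than the model itself.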

Hi, I still have the problem with CUDA 10 and a 2080 Ti. Could you share your solution, please? @janehu

Setting torch.backends.cudnn.benchmark = True worked for me.
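For reference, the flag mentioned above is set like this (a minimal sketch; torch.backends.cudnn is the standard PyTorch module):

```python
import torch

# cuDNN's auto-tuner benchmarks several convolution algorithms on the
# first forward pass and caches the fastest one. It helps when input
# shapes are fixed, but several posters in this thread found that
# toggling it one way or the other avoids the runtime error.
torch.backends.cudnn.benchmark = True

# If benchmark mode itself triggers the error on your setup,
# try the opposite value instead:
# torch.backends.cudnn.benchmark = False
```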


Thanks, I have met the same error when updating PyTorch 1.0 to 1.1 with an RTX 2080 Ti. Setting cudnn.benchmark = False helped to avoid this error, but in PyTorch 1.0, cudnn.benchmark = True was no problem. :sweat_smile:

Could you post a small reproducible code snippet and print the PyTorch, CUDA and cudnn version so that we can have a look?
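A short snippet that prints the requested versions (all of these attributes are part of the public PyTorch API):

```python
import torch

print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)      # None for CPU-only builds
print("cuDNN:", torch.backends.cudnn.version())  # None if cuDNN is unavailable
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```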

FYI, I’m getting this in some venvs and not others, both with torch 1.1.0 on an RTX 2080, so it looks like it’s environmental / dependency related.

Are you using CUDA 10 for the RTX 2080?


I met the same error:

 RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:383

with environment:

PyTorch: 1.1.0
CUDA: 9.0.176
GPU: RTX 2080 Ti
Driver: 418.67

After changing the CUDA version to 10.0.130, the issue was solved.

Interesting, I get this error on CUDA release 10.2 as well (V10.2.89).

Edit: Got it fixed by following ptrblck’s solution from here.


I also met this error.
My GPU is a GeForce RTX 2080 Ti.
After I upgraded CUDA from 8.0 to 10.1 and PyTorch from 0.3.0 to 1.4.0, the error was fixed.


This also works for me, many thanks!

Same error here:

THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=47 error=804

CUDA: release 10.1, V10.1.243
PyTorch: 1.5.0+cu101
torchvision: 0.6.0+cu101
GPU: 2080 Ti
Docker: 19.03.8

Tried every solution mentioned above, but nothing worked.

It depends on your GPU type; see https://en.wikipedia.org/wiki/CUDA
The GeForce RTX 2080 uses the Turing microarchitecture.
I got the error when using CUDA 9.0 because my GPU, a Quadro RTX 5000, also uses the Turing microarchitecture, which is not compatible with CUDA 9.2 and below, only with CUDA 10 and above.

The versions of your CUDA driver, nvcc, and PyTorch should be consistent, and the compatibility should follow this table:
https://docs.nvidia.com/deploy/cuda-compatibility/#use-the-right-compat-package

In my case I have:

  • CUDA Version: 11.2
  • gcc/g++: 10.3.0
  • nvcc: 11.1, V11.1.105
  • torch.__version__ >>> ‘1.8.0+cu111’
  • torch.version.cuda >>> ‘11.1’

This setup works fine for me. :smiley:
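Building on the checklist above, here is a quick sanity check that the installed wheel matches its CUDA build (the +cuXXX suffix convention applies to pip wheels; conda builds may lack the suffix):

```python
import torch

build = torch.__version__   # e.g. '1.8.0+cu111' for pip wheels
cuda = torch.version.cuda   # e.g. '11.1'; None for CPU-only builds
print(f"PyTorch {build} was built against CUDA {cuda}")

if torch.cuda.is_available():
    # Turing cards (RTX 2080 / 2080 Ti, Quadro RTX 5000) report compute
    # capability 7.5, which requires CUDA 10.0 or newer.
    major, minor = torch.cuda.get_device_capability(0)
    print(f"GPU compute capability: {major}.{minor}")
```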