After installing PyTorch this way: pip install -U https://download.pytorch.org/whl/cu100/torch-1.0.0-cp36-cp36m-linux_x86_64.whl, the errors disappear even when using 'torch.backends.cudnn.benchmark = True'.
Thanks! But I want to know how to solve this problem on PyTorch 1.0.0, CUDA 9.0, RTX 2080. Must I change to CUDA 10.0?
I don’t have RTX 2080 cards, and chances are that the driver shipped with CUDA 9.0 is not fully compatible with the RTX 2080. I installed CUDA 10.1 at first, then downgraded to CUDA 10.0 without changing the driver. Hope this helps you.
I see the same issue with PyTorch 1.0.1.post2, CUDA 10.0, RTX 2080 Ti. I can run on other GPUs (tried a Titan V and a 1080 Ti), but if running on the 2080 Ti with benchmark=True, I get this error message:
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
Traceback (most recent call last):
  File "", line 330, in <module>
    train(epoch)
  File "", line 173, in train
    stereo_out, theta, right_transformed = model(left,right)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "", line 139, in forward
    right_img_transformed, theta = self.stn(right_img)
  File "", line 127, in stn
    x,theta1 = stn(x, self.theta(x), mode=self.stn_mode)
  File "", line 131, in theta
    xs = self.localization(x)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yotamg/PycharmProjects/PSMNet/venv3/local/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:405
Is there already a solution for that?
I’ve got the same hardware (RTX 2080 Ti) and this fixed it for me. I had to update PyTorch to use CUDA 10.
Thanks a lot. It works. But why does it work?
Thanks very much, this works for me! Phew!
I’ve got the same error, even after updating CUDA to 10.
I happened to find a way to remove it. Now my training code works.
- cuda: 10.0
- python: 3.7
- pytorch: 1.0
- cudnn: 7
- GPU: 2080ti
However, another problem came along when running inside 'with torch.no_grad()': the outputs are all NaNs.
Does anyone know about this?
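A quick way to narrow down a problem like this is to check tensors for NaNs directly. This is a generic sketch (not tied to the model above) showing the check with torch.isnan; the helper name has_nan is made up for illustration:

```python
import torch

def has_nan(t):
    """Return True if the tensor contains any NaN values."""
    return bool(torch.isnan(t).any())

# Toy CPU example: NaNs commonly arise from operations like 0/0.
x = torch.tensor([1.0, 0.0])
y = x / x          # second element is 0/0 -> NaN
print(has_nan(x))  # False
print(has_nan(y))  # True
```

Running this check on intermediate activations inside and outside the no_grad block can show where the NaNs first appear.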
Hi, I still have the problem with CUDA 10 and a 2080 Ti. Could you share your solution, please? @janehu
Setting torch.backends.cudnn.benchmark = True worked for me.
Thanks, I have met the same error when updating PyTorch 1.0 to 1.1 with an RTX 2080 Ti. Setting cudnn.benchmark = False avoids this error, but on PyTorch 1.0 cudnn.benchmark = True was no problem.
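For reference, the flag in question is a single boolean on torch.backends.cudnn. Which value avoids the error apparently differs between setups in this thread, so treat the value below as something to experiment with, not a fix:

```python
import torch

# cudnn.benchmark tells cuDNN to auto-tune (benchmark) convolution
# algorithms for the observed input sizes. Some GPU/driver/CUDA
# combinations reported here only work with one of the two settings.
torch.backends.cudnn.benchmark = False  # try flipping this if you hit the error
print(torch.backends.cudnn.benchmark)
```

Note that benchmark mode mainly helps when input sizes are fixed; with varying input sizes it can actually slow things down, so False is a safe default while debugging.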
Could you post a small reproducible code snippet and print the PyTorch, CUDA and cudnn version so that we can have a look?
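For anyone posting their setup, something like the following prints the versions asked for (these are standard PyTorch attributes; the GPU name line only runs when a CUDA device is visible):

```python
import torch

print("PyTorch:", torch.__version__)
print("CUDA (compiled against):", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```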
FYI, I’m getting this in some venvs and not others, both with torch 1.1.0 on an RTX 2080, so it looks like it’s environmental / dependency related.
Are you using CUDA 10 for the RTX 2080?