I am running into this as well.
I am getting the same error as @shamoons on the same example. Setting CUDA_LAUNCH_BLOCKING=1 and CUDA_VISIBLE_DEVICES=0 made no difference. Running on CPU works fine.
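For anyone who wants to rule out the environment variables being set too late: as far as I understand, both are read when CUDA is initialized, so they need to be in place before the process first touches the GPU. Below is a minimal sketch that runs a snippet in a fresh interpreter with both variables set. The helper name `run_in_fresh_process` and the constants `DEBUG_ENV` and `REPRO` are my own placeholders, not anything from PyTorch.

```python
import os
import subprocess
import sys

# Assumption: these are the two settings mentioned above.
DEBUG_ENV = {"CUDA_LAUNCH_BLOCKING": "1", "CUDA_VISIBLE_DEVICES": "0"}


def run_in_fresh_process(snippet, extra_env=DEBUG_ENV):
    """Run a Python snippet in a new interpreter with extra_env applied."""
    env = dict(os.environ)   # start from the current environment
    env.update(extra_env)    # overlay the debugging variables
    return subprocess.run(
        [sys.executable, "-c", snippet],
        env=env,
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,
    )


# The failing Linear case, written as a one-line snippet (needs torch and a GPU).
REPRO = (
    "import torch; "
    "l = torch.nn.Linear(1, 1).cuda(); "
    "print(l(torch.tensor([1.]).cuda()))"
)
```

Calling `result = run_in_fresh_process(REPRO)` and then looking at `result.returncode` and `result.stderr` should show whether the error still appears when the variables are guaranteed to be set from process start.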
This operation succeeds:
>>> a = torch.tensor([1]).cuda()
>>> b = torch.rand([1]).cuda()
>>> c = a + b
>>> print(c)
tensor([2], device='cuda:0')
The following raises the error from the original post:
>>> l = torch.nn.Linear(1, 1).cuda()
>>> a = torch.tensor([1.]).cuda()
>>> l(a)
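To make the two cases above easier to compare across machines, here is a rough sketch that runs each one and records either "ok" or the exception text. `run_checks` and `torch_checks` are hypothetical helper names. I use `.to(device)` instead of `.cuda()` so the same checks can also run on CPU, and I copy results back with `.cpu()` so that any asynchronous CUDA error should surface inside the check that caused it.

```python
def run_checks(checks):
    """Call each zero-argument check and record 'ok' or the error text."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # keep going so every check gets reported
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


def torch_checks(device="cuda"):
    """Build the two checks from this post; torch is imported only when called."""
    import torch

    def elementwise_add():
        a = torch.tensor([1]).to(device)
        b = torch.rand([1]).to(device)
        return (a + b).cpu()  # copy back to force synchronization

    def linear_forward():
        layer = torch.nn.Linear(1, 1).to(device)
        x = torch.tensor([1.0]).to(device)
        return layer(x).cpu()  # copy back to force synchronization

    return {"elementwise_add": elementwise_add, "linear_forward": linear_forward}
```

Running `print(run_checks(torch_checks("cuda")))` next to `print(run_checks(torch_checks("cpu")))` gives a compact report to paste into the thread.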
My specs:
PyTorch version: 1.8.0
CUDA version: 11.0
Driver version: 450.102.04
NVIDIA-SMI: 450.102.04
GPU: NVIDIA GeForce RTX 2080 SUPER
OS: Ubuntu 20.04.2
Kernel: 5.8.0-44