Unable to use GPU

I'm trying to set up PyTorch 1.7.0 on a Windows 10 machine with two RTX 2080 Ti GPUs and CUDA 10.2.

It installs correctly, and at first everything looks fine: `torch.cuda.is_available()` returns `True`, `device_count()` returns 2, `get_device_name()` returns `'GeForce RTX 2080 Ti'`, and `get_device_properties().total_memory` shows 11 GB for each card.

However, if I create a tensor of size 1 and try to put it on either GPU, I get `RuntimeError: CUDA error: out of memory`. If I then try to put a tensor on GPU 0 after having tried GPU 1, the error changes to `RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable`.

I've noticed that the first time I call any CUDA function (including `is_available()`), the display goes blank for about 10 to 15 seconds. The Windows event log shows that the NVIDIA driver crashed and was restarted during this time. `is_available()` still returns `True` when this happens, and subsequent calls don't trigger the driver crash again.

I have tried different versions of PyTorch and CUDA with similar results, and I have updated both Windows and the NVIDIA drivers. Another machine with identical hardware, also running Windows 10, runs PyTorch with no problems.

Any ideas?

:thinking: :thinking: