I can't reproduce this reliably, but sometimes after waking from suspend, torch.cuda.is_available() returns False. If I run nvidia-smi, it still detects all GPUs on the system. If I reboot, torch.cuda.is_available() returns True again.
Is there any way to make torch detect CUDA again without rebooting?
I see a similar issue, and I cannot reset or reload the driver because Xorg on my Ubuntu machine is using it to drive the monitor.
```
5.11.0-37-generic #41~20.04.2-Ubuntu SMP
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   35C    P8     3W /  N/A |    378MiB /  7974MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1007      G   /usr/lib/xorg/Xorg                 35MiB |
|    0   N/A  N/A      2032      G   /usr/lib/xorg/Xorg                191MiB |
|    0   N/A  N/A      2167      G   /usr/bin/gnome-shell               30MiB |
|    0   N/A  N/A     17865      G   ...AAAAAAAAA= --shared-files       76MiB |
|    0   N/A  N/A     22353      G   ...AAAAAAAAA= --shared-files       30MiB |
+-----------------------------------------------------------------------------+
```
My GPU is also used to visualize the desktop and the linked commands work fine.
However, I'm not familiar with your setup, so a restart might be unavoidable.
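For anyone finding this later: the linked commands are not reproduced in this thread, so the following is only a sketch of one commonly suggested workaround, and I am assuming it is close to what was meant. The idea is to reload just the `nvidia_uvm` kernel module (the part of the driver CUDA relies on) and leave the main `nvidia` module, which Xorg holds, untouched. The helper names `loaded_modules`, `uvm_reload_commands`, and `reload_uvm` are made up for illustration, and the script defaults to a dry run that only prints the commands.

```python
import subprocess
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def uvm_reload_commands(modules):
    """Build the commands that reload nvidia_uvm (assumed workaround).

    The module is only removed if it is currently loaded; it is always
    (re)inserted with modprobe afterwards.
    """
    commands = []
    if "nvidia_uvm" in modules:
        commands.append(["sudo", "rmmod", "nvidia_uvm"])
    commands.append(["sudo", "modprobe", "nvidia_uvm"])
    return commands


def reload_uvm(dry_run=True):
    """Print the reload commands, or run them when dry_run is False."""
    modules = loaded_modules(Path("/proc/modules").read_text())
    commands = uvm_reload_commands(modules)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if rmmod/modprobe fails,
            # for example when a process still holds a CUDA context.
            subprocess.run(command, check=True)
    return commands
```

Call `reload_uvm()` first to see what would run, then `reload_uvm(dry_run=False)` once you have closed every process that uses CUDA. Afterwards, check `torch.cuda.is_available()` in a fresh Python process rather than the one that was open before the suspend. If `rmmod` reports that the module is in use, something is still holding it, and a restart might be unavoidable after all.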