Hello everyone,
I have been using PyTorch + CUDA for almost a year now and just upgraded my GPU from a GTX 1050 Ti to an RTX 3060 Ti.
I am having difficulties transferring tensors and models to the GPU with torch.device(0) or similar methods.
I have noticed that when I type
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.get_device_name(0)
u'NVIDIA GeForce RTX 3060 Ti'
There is a u prefix here (does this mean incompatible?). Also, the device count returns this:
>>> torch.cuda.device_count()
1L
Could you describe your issue in more detail, please? Do you see a runtime error, or what exactly is failing?
There is no error message; the call just hangs for a very long time, and I eventually have to terminate it with Ctrl+Z.
>>> device = torch.device('cuda')
>>> device
device(type='cuda')
>>> device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
>>> if device.type == 'cuda':
...     print(torch.cuda.get_device_name(0))
...     print('Memory Usage:')
...     print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3, 1), 'GB')
...     print('Cached:   ', round(torch.cuda.memory_reserved(0)/1024**3, 1), 'GB')
...
NVIDIA GeForce RTX 3060 Ti
Memory Usage:
('Allocated:', 0.0, 'GB')
('Cached: ', 0.0, 'GB')
>>> z = torch.tensor([5])
>>> k = z.to(device)
Python gets stuck on that last line while transferring the tensor to the CUDA device.
The nvidia-smi output looks like this:
Mon Mar 7 11:55:56 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 0% 46C P8 17W / 200W | 649MiB / 8192MiB | 3% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1015 G /usr/lib/xorg/Xorg 182MiB |
| 0 N/A N/A 1236 G /usr/bin/gnome-shell 98MiB |
| 0 N/A N/A 1649 G /usr/lib/firefox/firefox 146MiB |
| 0 N/A N/A 2008 C python 213MiB |
| 0 N/A N/A 2956 G /usr/lib/firefox/firefox 2MiB |
+-----------------------------------------------------------------------------+
Also, my versions are:
PyTorch: 1.4.0
CUDA: 11.6
NVIDIA driver: 510
OS: Ubuntu 18.04
After 10 minutes or so, the tensor finally arrives on the device:
>>> k
tensor([5], device='cuda:0')
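One aside before the main point: the u prefix on the device name, the 1L from device_count(), and the print calls coming out as tuples like ('Allocated:', 0.0, 'GB') are not compatibility warnings. They are all telltale signs that this interpreter is Python 2, where strings, integers, and print behave differently. A minimal sketch of the difference, runnable under Python 3:

```python
import io
from contextlib import redirect_stdout

# In Python 3 every str is unicode, so the u prefix is accepted but redundant:
assert u'NVIDIA GeForce RTX 3060 Ti' == 'NVIDIA GeForce RTX 3060 Ti'

# Python 3 has a single int type; Python 2's separate long type
# (whose repr carries the L suffix, as in 1L) is gone:
assert isinstance(2**64, int)

# print is a function in Python 3, so this emits space-separated values.
# In Python 2, print is a statement, and print('Allocated:', 0.0, 'GB')
# prints the tuple ('Allocated:', 0.0, 'GB') instead.
buf = io.StringIO()
with redirect_stdout(buf):
    print('Allocated:', 0.0, 'GB')
print(buf.getvalue(), end='')  # Allocated: 0.0 GB
```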
This PyTorch release (1.4.0) predates and isn't compatible with your Ampere device, so update to the latest release and select the CUDA 11.3 or 11.5 runtime. Most likely the binaries lack kernels for your architecture (sm_86), so CUDA is JIT-compiling them on first use, which takes a long time.
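If it helps, here are two common ways to pull a recent build with the CUDA 11.3 runtime (exact package versions vary; check the install selector on pytorch.org for the current commands):

```shell
# Conda: install PyTorch built against the CUDA 11.3 runtime
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

# Pip alternative: pull wheels from the cu113 index
pip install torch torchvision torchaudio \
    --extra-index-url https://download.pytorch.org/whl/cu113
```

The CUDA runtime ships inside these binaries, so your system-wide CUDA 11.6 toolkit doesn't need to change; only the driver matters, and 510 is new enough.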
Hello,
Thanks for the feedback. Building with the second option, CUDA 11.3 with conda, did work, and I am observing better performance now. I have also upgraded the device driver to 510. Just a note: the 'u' at the start of the device name and the '1L' in the device count are back to normal (it's now plain 1).
The new nvidia-smi output looks like this:
Mon Mar 7 23:37:41 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | N/A |
| 0% 47C P8 16W / 200W | 683MiB / 8192MiB | 2% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 986 G /usr/lib/xorg/Xorg 18MiB |
| 0 N/A N/A 1084 G /usr/bin/gnome-shell 70MiB |
| 0 N/A N/A 1361 G /usr/lib/xorg/Xorg 208MiB |
| 0 N/A N/A 1507 G /usr/bin/gnome-shell 71MiB |
| 0 N/A N/A 2290 G /usr/lib/firefox/firefox 301MiB |
| 0 N/A N/A 2456 G /usr/lib/firefox/firefox 2MiB |
| 0 N/A N/A 2594 G /usr/lib/firefox/firefox 2MiB |
| 0 N/A N/A 2697 G /usr/lib/firefox/firefox 2MiB |
+-----------------------------------------------------------------------------+
Tomorrow I will have the opportunity to run this build with my previously trained network; I will report back here soon.