Hello,
I am trying to install PyTorch on an AWS EC2 instance but am not able to access the GPU.
EC2 Instance Details:
Instance Name: Deep Learning AMI GPU CUDA 11.4.3 (Amazon Linux 2)
Instance Type: t2.xlarge
NVIDIA driver version: 510.47.03
CUDA version: 11.4
(base) [ec2-user@ip-XXXXX ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_Oct_11_21:27:02_PDT_2021
Cuda compilation tools, release 11.4, V11.4.152
Build cuda_11.4.r11.4/compiler.30521435_0
GPU Details:
(base) [ec2-user@ip-XXXXX ~]$ lspci | grep VGA
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
(test) [ec2-user@ip-XXXXX ~]$ python
Python 3.7.13 (default, Mar 29 2022, 02:18:16)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
False
Thanks in advance for any help.
How did you install PyTorch?
As no CUDA runtime is available, I would guess you've installed the CPU-only binaries?
I am installing PyTorch using the command below:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
What is torch.version.cuda returning?
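A minimal sketch for collecting the values asked about here in one go. The helper name describe_torch_build is only illustrative and not part of PyTorch; the attributes it reads (torch.__version__, torch.version.cuda, torch.cuda.is_available()) are the standard ones. A CPU-only build reports None for torch.version.cuda, while a CUDA build reports the CUDA version it was compiled against.

```python
def describe_torch_build():
    """Summarize the installed PyTorch build (hypothetical helper, not a PyTorch API)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        return {"installed": False}
    return {
        "installed": True,
        "torch_version": torch.__version__,
        # None means the CPU-only binaries were installed.
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and a working driver are both present.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If built_with_cuda is set but cuda_available is False, the binaries are probably fine and the problem is more likely the driver or the hardware.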
In that case, it seems your setup has trouble communicating with the GPU, so maybe try a plain NVIDIA CUDA Docker container, install the binaries there, and see if it can find the GPU(s).
Alternatively, try running any other CUDA application in your current setup to see whether the device can be used.
I have not installed CUDA myself; it is installed by default by AWS.
When I run nvidia-smi, I get this response:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
This does indeed sound like a setup issue, as the driver seems to be in a bad state.
Could you restart the node or lease another one to check whether this solves the issue? Once nvidia-smi is able to communicate with the driver again, try to run any CUDA sample and then a PyTorch application on the GPU.
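Before restarting, it may also be worth confirming that the instance has an NVIDIA device at all. The lspci output in the original post lists only a Cirrus Logic controller, and as far as I know T2 instance types are general-purpose and do not come with a GPU, so that could explain the nvidia-smi failure. A rough sketch of the check, assuming lspci and nvidia-smi are on the PATH; has_nvidia_device and run_quietly are hypothetical helper names:

```python
import shutil
import subprocess


def has_nvidia_device(lspci_output: str) -> bool:
    """Return True if any line of `lspci` output mentions an NVIDIA device."""
    return any("nvidia" in line.lower() for line in lspci_output.splitlines())


def run_quietly(command):
    """Run a command and return (exit_code, output); (None, reason) if it is missing."""
    if shutil.which(command[0]) is None:
        return None, f"{command[0]} not found on PATH"
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    # Step 1: is there an NVIDIA device on the PCI bus?
    code, output = run_quietly(["lspci"])
    if code == 0:
        print("NVIDIA device on PCI bus:", has_nvidia_device(output))
    else:
        print("lspci check skipped:", output.strip())

    # Step 2: can nvidia-smi talk to the driver? A non-zero exit code means no.
    code, output = run_quietly(["nvidia-smi"])
    print("nvidia-smi exit code:", code)
    if code != 0:
        print(output.strip())
```

If the first check prints False, restarting will probably not help, and a GPU instance type would be needed before the driver and PyTorch checks are meaningful.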