Cannot access GPU (NVIDIA Quadro K5200) even after building from source

I am trying to use PyTorch on a system with an NVIDIA Quadro K5200 and am unable to use the GPU, even after building PyTorch from source.

Following is my current output after running `python -m torch.utils.collect_env`:

```
Collecting environment information...
PyTorch version: 1.11.0a0+git4aade95
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.19.6
Libc version: glibc-2.31

Python version: 3.9.5 (default, Jun  4 2021, 12:28:51)  [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.0-97-generic-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: 11.4.152
GPU models and configuration: GPU 0: Quadro K5200
Nvidia driver version: Could not collect
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.2
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.22.1
[pip3] pytorch-lightning==1.5.8
[pip3] torch==1.11.0a0+git4aade95
[pip3] torchmetrics==0.7.0
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.3.1               h2bc3f7f_2  
[conda] magma-cuda110             2.5.2                         1    pytorch
[conda] magma-cuda113             2.5.2                         1    pytorch
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-include               2022.0.1           h06a4308_117  
[conda] mkl-service               2.4.0            py39h7f8727e_0  
[conda] mkl_fft                   1.3.1            py39hd3c417c_0  
[conda] mkl_random                1.2.2            py39h51133e4_0  
[conda] numpy                     1.22.1                   pypi_0    pypi
[conda] numpy-base                1.21.2           py39h79a1101_0  
[conda] pytorch-lightning         1.5.8                    pypi_0    pypi
[conda] torch                     1.11.0a0+git4aade95           dev_0    <develop>
[conda] torchmetrics              0.7.0                    pypi_0    pypi
```

Which issue are you seeing? Did the build log show that CUDA was found, and did you see nvcc compiling CUDA kernels?

I realized that for some reason the `USE_CUDA` flag was set to 0, so I cleaned things up and tried to build torch again, this time explicitly setting `USE_CUDA=1`. This time the build failed with the following error.
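For reference, the clean rebuild I mean looks roughly like this (a sketch, run from the root of the PyTorch checkout; the `CUDA_HOME` path is an assumption and should point at your actual toolkit):

```shell
# Remove stale CMake caches so USE_CUDA is re-detected
# instead of being reused from the previous configure run.
python setup.py clean

# Force the CUDA build on. CUDA_HOME is illustrative --
# set it to wherever your CUDA toolkit is installed.
export USE_CUDA=1
export CUDA_HOME=/usr/local/cuda

# Build in develop mode and keep a log to inspect later.
python setup.py develop 2>&1 | tee build.log

# Confirm CMake actually enabled CUDA before the long compile:
grep -i "USE_CUDA" build.log | head
```

Checking the `USE_CUDA` line in the log before the hours-long compile starts saves a lot of time if CMake silently falls back to a CPU-only build.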

```
FAILED: bin/cuda_dlconvertor_test 
: && /usr/bin/c++ -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG -rdynamic caffe2/CMakeFiles/cuda_dlconvertor_test.dir/__/aten/src/ATen/test/cuda_dlconvertor_test.cpp.o -o bin/cuda_dlconvertor_test  -Wl,-rpath,/home/mehulr/miniconda3/lib:/usr/local/cuda/lib64:/home/mehulr/arpit/pytorch/build/lib:  /usr/local/cuda/lib64/libcudart.so  lib/libgtest_main.a  -Wl,--no-as-needed,"/home/mehulr/arpit/pytorch/build/lib/libtorch.so" -Wl,--as-needed  -Wl,--no-as-needed,"/home/mehulr/arpit/pytorch/build/lib/libtorch_cpu.so" -Wl,--as-needed  lib/libprotobuf.a  /home/mehulr/miniconda3/lib/libmkl_intel_lp64.so  /home/mehulr/miniconda3/lib/libmkl_gnu_thread.so  /home/mehulr/miniconda3/lib/libmkl_core.so  -fopenmp  /usr/lib/x86_64-linux-gnu/libpthread.so  -lm  /usr/lib/x86_64-linux-gnu/libdl.so  lib/libdnnl.a  -ldl  -Wl,--no-as-needed,"/home/mehulr/arpit/pytorch/build/lib/libtorch_cuda.so" -Wl,--as-needed  lib/libc10_cuda.so  lib/libc10.so  /usr/local/cuda/lib64/libcudart.so  /home/mehulr/miniconda3/lib/libnvToolsExt.so  /usr/local/cuda/lib64/libcufft.so  /usr/local/cuda/lib64/libcurand.so  
/usr/local/cuda/lib64/libcublas.so  /usr/lib/x86_64-linux-gnu/libcudnn.so  lib/libgtest.a  -pthread && :
/usr/bin/ld: /usr/local/cuda/lib64/libcublas.so: undefined reference to `cublasLtGetStatusString@libcublasLt.so.11'
/usr/bin/ld: /usr/local/cuda/lib64/libcublas.so: undefined reference to `cublasLtGetStatusName@libcublasLt.so.11'
collect2: error: ld returned 1 exit status
[6372/6765] Building CXX object caffe2/CMakeFiles/op_registration_test.dir/__/aten/src/ATen/core/op_registration/op_registration_test.cpp.o
ninja: build stopped: subcommand failed.
```
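Undefined `cublasLt*` references at link time usually mean the linker is resolving `libcublas` from one CUDA toolkit and `libcublasLt` from another. A few commands that can help confirm a mixed installation (a diagnostic sketch; adjust paths to your system):

```shell
# Show every libcublas/libcublasLt the dynamic linker knows about;
# mixed version suffixes here point at a toolkit mismatch.
ldconfig -p | grep -E "libcublas(Lt)?"

# Check which toolkit version /usr/local/cuda actually resolves to.
readlink -f /usr/local/cuda

# Look for the missing symbol in the libcublasLt being linked;
# if grep prints nothing, that library is too old for this libcublas.
nm -D /usr/local/cuda/lib64/libcublasLt.so | grep cublasLtGetStatusString
```

If the symbol is absent from the installed `libcublasLt.so`, the two libraries come from different CUDA releases, which matches the reinstall fix below.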

It turned out there was a version mismatch in my CUDA installation, so I reinstalled the drivers and the CUDA toolkit, and everything worked fine :smiley: It took me days to figure out, but I finally got there :+1:.
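For anyone hitting the same thing, a minimal sanity check after the rebuild (assumes the freshly built torch is the one on your Python path):

```python
# Confirm that the rebuilt PyTorch sees the CUDA runtime and the GPU.
import torch

print(torch.__version__)          # should show the +git source-build tag
print(torch.version.cuda)         # CUDA version torch was compiled against
print(torch.cuda.is_available())  # should be True after the fix

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. "Quadro K5200"
    # Launch one kernel to be sure the driver/toolkit pair really works:
    x = torch.randn(2, 2, device="cuda")
    print((x @ x).device)
```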