CUDA not available after following DCGAN tutorial?

Hey, I am really impressed with how intuitive both of PyTorch’s APIs (Python and C++) are and want to use it at work, where we mainly do C++ development, but I am struggling to get the GAN demo going because of a weird issue.

  1. Setup: FastAI Paperspace Ubuntu instance with the latest version of PyTorch.
  2. When I open up the Python interpreter and run torch.cuda.is_available() -> TRUE
  3. When I do the same thing from C++ (exactly following this tutorial https://github.com/pytorch/examples/tree/master/cpp/dcgan) I get FALSE from torch::cuda::is_available() (a simplified sketch of the check is below) :?
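To rule out anything in my own files: the check behind this is essentially the same device selection the tutorial does. A simplified sketch of what I am running (the full code is in the repo linked below):

#include <torch/torch.h>
#include <iostream>

int main() {
  // Pick CUDA if libtorch reports it as available, otherwise fall back to the CPU.
  torch::Device device(torch::cuda::is_available() ? torch::kCUDA : torch::kCPU);
  std::cout << "Running on device: " << device << std::endl;  // prints "cpu" for me
  return 0;
}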

My CMakeLists.txt looks like this:

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(train-gan)

find_package(Torch REQUIRED)
find_package(CUDA 10.1 REQUIRED)

add_executable(train-gan train_gan.cpp dis.cpp gen.cpp)

target_link_libraries(train-gan "${TORCH_LIBRARIES}")
target_link_libraries(train-gan "${CUDA_LIBRARIES}")

set_property(TARGET train-gan PROPERTY CXX_STANDARD 11)

You can find the full project on my GitHub -> https://github.com/skalaydzhiyski/cpp-gan

I am new to CMake (as is probably visible from the repo) and just want to get Torch to use the GPU from C++.

Let me know if you need any more information and thanks in advance.

Are you able to run the DCGAN example on the GPU or is it using your CPU?

I just cloned your repo and tried to run it.
It looks like my GPU was detected:

Number of colour channels: 4
Running on device: cuda

However, I get an error after these lines:

terminate called after throwing an instance of 'c10::Error'
  what():  Error opening images file at ./mnist/train-images-idx3-ubyte (read_images at /pytorch/torch/csrc/api/src/data/datasets/mnist.cpp:66)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7fa06a35bb91 in /home/pbialecki/libs/libtorch_nightly/libtorch/lib/libc10.so)
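For context, the dataset is loaded via a relative path, roughly like the sketch below (the folder name is taken from the error message above, the rest follows the upstream example), so this failure just means the MNIST files are missing from ./mnist relative to where I ran the binary. It is unrelated to the CUDA question.

#include <torch/torch.h>

int main() {
  // The MNIST reader resolves this path relative to the current working directory,
  // so the raw idx files have to be downloaded there first.
  auto dataset = torch::data::datasets::MNIST("./mnist")
                     .map(torch::data::transforms::Stack<>());
  return 0;
}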

Hey, thank you so much for the quick reply, I have literally been biting my nails :smiley:

I always get “Running on device: cpu”, and that is on a machine that gives me TRUE when I run cuda.is_available() from the Python interpreter AND is also recognized by nvidia-smi. It seems I have everything set up, but the application just doesn’t pick up that I have CUDA installed :?

Let me know if you need any screenshots/info/output whatever… I have been struggling to get this running for a week now.

And thanks again, of course.

Can you please share your setup … libraries, install directories, versions, anything?

I don’t care to run the DCGAN demo per se, I just need to get the C++ Torch API to find my CUDA and run on the GPU.

I have run multiple Python models on this setup and they all run fine on the GPU with little to no effort from my end.

Sure!

  • CUDA version: 10.1
  • CUDA driver: 418.56
  • TITAN V
  • nvcc in /usr/local/cuda-10.1/bin/nvcc
  • libtorch unzipped in ~/libs

Does cmake find your CUDA install at all?

Yes, CMake finds CUDA successfully … I think my CUDA driver is 4.10 though, do you think that might be an issue?

I have an Nvidia Quadro P5000 and the driver is 4.10… do you think it might have something to do with the issue?

Might be the reason.
This table gives you the compatible driver versions.
For CUDA 10.1, >=418.39 is recommended.

Could you try to update the driver and run the example again?

[screenshot of the CUDA toolkit / driver version compatibility table]
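If you want to double check what your system reports independently of libtorch, a small query against the CUDA runtime like the sketch below should print the driver and runtime versions (it only needs the CUDA runtime you are already linking via CUDA_LIBRARIES):

#include <cuda_runtime_api.h>
#include <iostream>

int main() {
  int driver_version = 0;
  int runtime_version = 0;
  // Highest CUDA version the installed driver supports, encoded as 1000*major + 10*minor
  // (e.g. 10010 for CUDA 10.1).
  cudaDriverGetVersion(&driver_version);
  // Version of the CUDA runtime this binary was built against, same encoding.
  cudaRuntimeGetVersion(&runtime_version);
  std::cout << "Driver supports up to CUDA: " << driver_version << "\n"
            << "CUDA runtime version:       " << runtime_version << std::endl;
  return 0;
}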

:X … I am planning on decommissioning my machine and requesting a new one to set up from scratch. Do you mind sharing the resources you used to set up your environment? It seems the issue is with how the libraries are linked, I think…

Thanks again for your time and the quick responses, I am desperate at this point.

[screenshots of the code]

That is the only code that I run…

Your Python install is not really related to this issue in libtorch.
E.g. I used my base conda environment without PyTorch installed, and could successfully build the C++ example.

Have you built PyTorch from source before?
If so, did you see any issues?

Maybe Peter Goldsborough (one of the PyTorch core devs) has an idea about this specific issue. CC @goldsborough

Hey @ptrblck,

Sorry to bother you again with this, but I have just got a completely new, clean machine and want to follow your steps for installing:

  1. Conda.
  2. PyTorch from source (as you mentioned)

Do you mind providing me with resource links where I can follow the procedure you used? I can find 50 different ones online and am not sure which one of them will work out.

Thanks in advance.

I’m not sure if I’m the right person to ask, as I’m installing everything in a pragmatic (and thus maybe not the best?) way.
Anyway,

  • Download the latest Conda package (use Python 3.7)
  • Select the NVIDIA driver from “Software & Updates -> Additional Drivers” on Ubuntu (I’m using 418)
  • Download and install CUDA

Let me know if you get stuck somewhere.

Hey,

I have followed EXACTLY what you said, but I still don’t get the “cuda” device showing up when running my code…

Should I build PyTorch from source?
And what environment variables should I set to make it work?

Thanks again for your time and for your patience.

Did you install some PyTorch binaries and are you able to create CUDATensors using Python?

Yes, I can create and run anything I want through Python and it is working like a charm, but when I run the example GAN C++ project I get false for cuda::is_available()…
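For reference, this is the kind of minimal check I mean (a sketch; the device_count and cudnn_is_available calls are just extra diagnostics, my actual project only checks is_available):

#include <torch/torch.h>
#include <iostream>

int main() {
  // All three should report a working CUDA backend if libtorch was built with
  // CUDA support and can see the driver; booleans print as 1/0.
  std::cout << "torch::cuda::is_available():       " << torch::cuda::is_available() << "\n"
            << "torch::cuda::device_count():       " << torch::cuda::device_count() << "\n"
            << "torch::cuda::cudnn_is_available(): " << torch::cuda::cudnn_is_available() << std::endl;
  return 0;
}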

That is so bizarre. I have tried multiple environments with multiple versions of CUDA / cuDNN / Torch on multiple machines - they ALL work fine with the Python installs and not with C++…