Compiling C++ Build in Windows with Cuda

BRev · January 9, 2019, 11:45am

Hi,
I’ve been trying to compile PyTorch on Windows for the last couple of days and have succeeded with the CPU Release build. Unfortunately, the CUDA 9.0 build doesn’t work: it compiles, but even a simple program crashes as soon as .cuda() is called.
I noticed that the lib folder is missing a few DLLs that are present in the downloadable build:
cublas64_90.dll
cudart64_90.dll
cudnn64_7.dll
cufft64_90.dll
cufftw64_90.dll
curand64_90.dll
cusparse64_90.dll
libiomp5md.dll
libiompstubs5md.dll
nvcuda.dll
nvfatbinaryLoader.dll
nvrtc64_90.dll
nvrtc-builtins64_90.dll
nvToolsExt64_1.dll
I added them manually from the latest build so that the program would start up, but as I said, it crashes on calls to CUDA.
I’d appreciate it if someone could tell me how to get the CUDA build working.
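
Editor's sketch: the comparison of the two lib folders described above can be automated. The snippet below is a minimal example, not part of the original post. It uses the DLL names from the list above; the function name missing_dlls and the lib_dir parameter are hypothetical names chosen for illustration.

```python
from pathlib import Path

# DLL names copied from the list above (CUDA 9.0 prebuilt package).
EXPECTED_DLLS = [
    "cublas64_90.dll", "cudart64_90.dll", "cudnn64_7.dll",
    "cufft64_90.dll", "cufftw64_90.dll", "curand64_90.dll",
    "cusparse64_90.dll", "libiomp5md.dll", "libiompstubs5md.dll",
    "nvcuda.dll", "nvfatbinaryLoader.dll", "nvrtc64_90.dll",
    "nvrtc-builtins64_90.dll", "nvToolsExt64_1.dll",
]


def missing_dlls(lib_dir, expected=EXPECTED_DLLS):
    """Return the expected DLL names that are not present in lib_dir.

    File names are compared case-insensitively, as on Windows.
    """
    present = {p.name.lower() for p in Path(lib_dir).iterdir() if p.is_file()}
    return [name for name in expected if name.lower() not in present]
```

Calling missing_dlls on the lib folder of a self-compiled build returns the names that would have to be copied over. It only checks for presence; it cannot tell whether a copied DLL matches the CUDA version the build was compiled against.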

best regards

peterjc123 · January 9, 2019, 3:32pm

Please check if your NVIDIA driver is correctly set up.

BRev · January 10, 2019, 10:05am

Hi,
well, the thing is that it works with the headers and binaries in:
https://download.pytorch.org/libtorch/cu90/libtorch-win-shared-with-deps-latest.zip
so I’m fairly certain that the driver is set up correctly. I also cannot see any error messages in the console window (although, with so many warnings, I may be missing some).
I’m not sure how else I could check the drivers.
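
Editor's sketch: when errors are buried under warnings, one option is to redirect the build output to a file and filter it. The snippet below is a minimal example and not part of the original post. It assumes the usual MSVC ("error C1234", "fatal error C1234", "error LNK1234") and CMake ("CMake Error") message formats; the names ERROR_PATTERN and error_lines are hypothetical, and the pattern may need extending for other tools such as nvcc.

```python
import re

# Matches MSVC compiler/linker errors and CMake errors; warnings are ignored.
# Assumption: other tools in the build may use formats not covered here.
ERROR_PATTERN = re.compile(r"(?:fatal )?error (?:C|LNK)\d+|CMake Error")


def error_lines(log_text):
    """Return only the lines of a build log that look like errors."""
    return [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]
```

With the build output saved to a file (for example by appending "> build.log" to the build command, where build.log is a hypothetical file name), the text of that file can be passed to error_lines to list the errors without the surrounding warnings.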

Is there maybe a documented way to reproduce the zip folder in the link above without a docker container? Or can I get access to the logs so I can compare them to mine?
best regards

peterjc123 · January 12, 2019, 5:13am

According to another user’s report, the libtorch you referenced is not linked against cuDNN; this is now fixed in the nightlies. Could you please try again with the nightly build? BTW, may I ask which GPU you are using?

BRev · January 14, 2019, 3:00pm

Hi,
thanks for the reply! Over the last couple of days I tested various configurations:
CUDA 9.0 vs. CUDA 10.0
1.0 stable vs. nightly
compiling with the Python script vs. running CMake manually
different PCs
Sadly, I still didn’t get it to work.
I also noticed that the current nightly prebuilt binaries with CUDA 10 do not work either (stable does).

My GPU is an MSI NVIDIA GeForce GTX 1060 with 6 GB.

best regards

BRev · January 16, 2019, 11:27am

Okay, I finally got it to work with PyTorch version 1.0 as a Release build with CUDA 10.0.
Sadly, the master version doesn’t seem to work, which is unfortunate because it seems you have committed some changes since 1.0 to make debug builds work. Do you know which commit might be a good one to use for a working debug build?

peterjc123 · January 16, 2019, 11:42am

You could try RelWithDebInfo builds first. I tried them in CI and it should work with this PR https://github.com/pytorch/pytorch/pull/16008 merged. The debug version could work with this PR, but you’ll need a newer MSVC (>= 15.8). Newer MSVC won’t work with CUDA 10. So you’ll need both the newer version and the legacy version of MSVC.

BRev · January 16, 2019, 1:19pm

Yes, I already tried RelWithDebInfo and that works. I installed VS just recently (just for pytorch) so I should have the newest version, but does that mean I should change this line (as stated in the readme):
call "%VS150COMNTOOLS%\vcvarsall.bat" x64 -vcvars_ver=14.11
to:
call "%VS150COMNTOOLS%\vcvarsall.bat" x64
while building?

best regards

peterjc123 · January 16, 2019, 1:21pm

set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Auxiliary\Build"
set CMAKE_GENERATOR=Visual Studio 15 2017 Win64
set DISTUTILS_USE_SDK=1
set "CUDAHOSTCXX=%VS140COMNTOOLS%…\VC\bin\amd64\cl.exe" (Point this one to your legacy VS 2015/2017 x64 compiler)
call "%VS150COMNTOOLS%\vcvarsall.bat" x64
python setup.py install

BRev · January 17, 2019, 11:29am

Okay, I tried that last night, but unfortunately it didn’t work. I tried this on 1.0 and a more current commit:
set "VS150COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Auxiliary\Build"
set CMAKE_GENERATOR=Visual Studio 15 2017 Win64
set DISTUTILS_USE_SDK=1
set REL_WITH_DEB_INFO=1
rem set DEBUG=1
set NO_TEST=1
set "CUDAHOSTCXX=C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.11.25503\bin\HostX64\x64\cl.exe"
rem set "CUDAHOSTCXX=C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\cl.exe"
call "%VS150COMNTOOLS%\vcvarsall.bat" x64
python setup.py build > buildoutput.txt

And I tried to use both the older toolset in VS2017 and VS2015 (with the patch linked in the readme).

Every time, I get this error:
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.16.27023\include\yvals_core.h(298): fatal error C1189: #error: STL1001: Unexpected compiler version, expected MSVC 19.15 or newer. [C:\libs\pytorch\build\caffe2\caffe2_gpu.vcxproj]
CMake Error at caffe2_gpu_generated_THCReduceApplyUtils.cu.obj.RelWithDebInfo.cmake:219 (message):
Error generating
C:/libs/pytorch/build/caffe2/CMakeFiles/caffe2_gpu.dir/__/aten/src/THC/RelWithDebInfo/caffe2_gpu_generated_THCReduceApplyUtils.cu.obj
(If you want, I can upload the entire output, about 3 MB.)
The problem seems to be that CUDAHOSTCXX is used to compile files that use the standard library, but the standard library headers from the newer toolset (14.16.27023 in the error above) are picked up instead of those matching the older compiler.
Do you know how this might be fixed?
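
Editor's sketch: the mismatch described above can be made visible by comparing the toolset version embedded in the host compiler path with the one embedded in the header path from the error message. The snippet below is a minimal example and not part of the original post. It assumes the VS 2017 directory layout "...\VC\Tools\MSVC\<version>\..." seen in the paths above; the names toolset_version and same_toolset are hypothetical.

```python
import re

# Assumes the VS 2017 layout ...\VC\Tools\MSVC\<major>.<minor>.<build>\...
TOOLSET_RE = re.compile(r"[\\/]MSVC[\\/](\d+)\.(\d+)\.(\d+)[\\/]")


def toolset_version(path):
    """Extract the MSVC toolset version from a path, or None if absent."""
    match = TOOLSET_RE.search(path)
    return tuple(int(part) for part in match.groups()) if match else None


def same_toolset(compiler_path, header_path):
    """True only if both paths contain the same toolset version."""
    compiler = toolset_version(compiler_path)
    headers = toolset_version(header_path)
    return compiler is not None and compiler == headers
```

For the two paths in this post, toolset_version gives (14, 11, 25503) for the CUDAHOSTCXX compiler and (14, 16, 27023) for yvals_core.h, so same_toolset returns False. A VS 2015 compiler path has no "MSVC\<version>" component, so the check reports a mismatch for it as well.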
best regards

peterjc123 · January 17, 2019, 12:21pm

https://blogs.msdn.microsoft.com/vcblog/2017/12/19/c17-progress-in-vs-2017-15-5-and-15-6/ This page introduces a flag, _ALLOW_COMPILER_AND_STL_VERSION_MISMATCH. You could check whether that works.

BRev · January 17, 2019, 1:23pm

Hm okay. They also say “mixing a newer STL with an older compiler is a recipe for doom”, but I’ll try it out^^

BRev · January 22, 2019, 12:49pm

Alright! I finally got both Debug and Release working! At least the mnist example works.
Thank you peterjc123, especially for your work on PyTorch to make it Windows-compatible!

To make it work on Debug, I had to use
python tools/build_libtorch.py
instead of
python setup.py build
because I couldn’t figure out a way to disable Python generation with setup.py. With Python generation enabled, the build couldn’t find the debug version of Python, which cannot be installed in Anaconda.

I also had to fix a small bug in torch/csrc/byte_order.cpp. Should I make a pull request for this?

best regards

peterjc123 · January 22, 2019, 12:51pm

@BRev Sure. Feel free to do that.

BRev · January 27, 2019, 1:49pm

Okay, here it is: