Hello. I’m trying to install PyTorch v1.4.0 from source and am experiencing the following error:
[4070/4411] Building CXX object test_api/CMakeFiles/test_api.dir/modules.cpp.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "setup.py", line 743, in <module>
    build_deps()
  File "setup.py", line 316, in build_deps
    cmake=cmake)
  File "/home/seokwon/pytorch/tools/build_pytorch_libs.py", line 62, in build_caffe2
    cmake.build(my_env)
  File "/home/seokwon/pytorch/tools/setup_helpers/cmake.py", line 339, in build
    self.run(build_args, my_env)
  File "/home/seokwon/pytorch/tools/setup_helpers/cmake.py", line 141, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/home/seokwon/anaconda3/envs/seanconda/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '20']' returned non-zero exit status 1.
I followed all of the steps specified in the GitHub README and also made sure the CUDA version was appropriate. Does anybody know how to solve this issue? I’ve tried looking elsewhere online but was unsuccessful. Thanks in advance!
Sorry, but how do I check if I’m running out of memory?
Also, on a side note: the server I’m using has two versions of CUDA installed, 9.0 and 10.1. In the /usr/local/ directory there are both cuda-9.0/ and cuda-10.1/, and the cuda symlink points to /usr/local/cuda-9.0/ (I’m not sure if this is the correct terminology). I’m trying to use CUDA 10.1; could this be the cause of the problem? If I install PyTorch with conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, it installs fine, but torch.cuda.is_available() returns False.
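For reference, the build can be pointed at a specific toolkit instead of whatever /usr/local/cuda links to; a sketch, assuming the paths from the post and that the build scripts honor CUDA_HOME (adjust as needed for your server):

```shell
# Select CUDA 10.1 explicitly rather than relying on the /usr/local/cuda
# symlink (paths taken from the post; CUDA_HOME is the variable the
# PyTorch build helpers typically consult).
export CUDA_HOME=/usr/local/cuda-10.1
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:${LD_LIBRARY_PATH:-}"
# nvcc --version   # should now report release 10.1 on the server
```

Running `ls -l /usr/local/cuda` also shows which toolkit the symlink currently selects.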
So I managed to solve my immediate issue, but I haven’t been able to fix the build error itself.
I wasn’t aware that the installed driver didn’t match the CUDA toolkit version. I assumed that my server administrator had taken care of it, but perhaps there was an error in the process. Regardless, installing PyTorch with the method specified on PyTorch’s homepage now works.
Good to hear you figured out the (binary?) installation issue.
However, that doesn’t solve the build from source failure.
Could you clean the build (python setup.py clean), update all submodules (git submodule update --init --recursive) and rebuild it?
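The sequence above can be wrapped in a small helper to run from inside the checkout; a sketch (the `git submodule sync` step and the `MAX_JOBS` cap are my additions, not part of the reply):

```shell
# Clean rebuild sequence for a pytorch source checkout. Call
# `rebuild_pytorch` from the repository root. MAX_JOBS limits parallel
# compile jobs, which also caps peak RAM use during the build.
rebuild_pytorch() {
    python setup.py clean
    git submodule sync                      # assumption: refresh submodule URLs
    git submodule update --init --recursive
    MAX_JOBS=8 python setup.py install      # assumption: 8 jobs fits in RAM
}
```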
While the build is running, you could use htop to observe the system RAM.
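If htop isn’t available on the server, free memory can also be polled programmatically; a minimal Linux-only sketch using the standard library (no PyTorch involved):

```python
import os

def available_ram_gib():
    """Return currently free physical memory in GiB (Linux sysconf)."""
    page_size = os.sysconf("SC_PAGE_SIZE")        # bytes per page
    free_pages = os.sysconf("SC_AVPHYS_PAGES")    # physical pages currently free
    return page_size * free_pages / 1024**3

print(f"free RAM: {available_ram_gib():.1f} GiB")
```

Running this periodically during the build shows whether free RAM dips toward zero right before ninja fails.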
I made a new virtual environment and did as you suggested, but it doesn’t seem to be a memory problem. There’s plenty of memory left even while the build is running.
I got the same errors when installing PyTorch from source for versions 1.6 and 1.7 on Ubuntu 18.04. I installed the NVIDIA drivers, CUDA 10.2, and cuDNN 8.0.3 for 2080 Ti GPUs. I followed the instructions to install PyTorch from source and still got the ninja error from above, also at “random” points of the build. I didn’t have these problems on my local machine running Ubuntu 20.04 with the same GPU and the same versions. Any help is much appreciated.
The previous error was likely caused by an old GLIBCXX version:
../../lib/libc10.so: error: undefined reference to 'std::thread::_State::~_State()', version 'GLIBCXX_3.4.22'
../../lib/libc10.so: error: undefined reference to 'typeinfo for std::thread::_State', version 'GLIBCXX_3.4.22'
../../lib/libc10.so: error: undefined reference to 'std::runtime_error::runtime_error(char const*)', version 'GLIBCXX_3.4.21'
Could you check the install log for errors?
Note that the build might continue for a few more steps after the failure, so the actual error might appear in earlier lines of the log.
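As a cross-check for undefined-reference errors like the ones above, you can list which GLIBCXX symbol versions the system libstdc++ actually exports; a sketch, assuming a typical Linux library layout (the search paths are my guesses, not from the thread):

```python
import glob
import re

# Locate the system libstdc++ (common Ubuntu and Fedora paths first,
# then a broader fallback search).
candidates = (
    glob.glob("/usr/lib/x86_64-linux-gnu/libstdc++.so.6*")
    or glob.glob("/usr/lib64/libstdc++.so.6*")
    or glob.glob("/usr/lib/**/libstdc++.so.6*", recursive=True)
)

versions = set()
if candidates:
    with open(candidates[0], "rb") as f:
        data = f.read()
    # Version tags such as GLIBCXX_3.4.22 are embedded as plain strings.
    versions = {m.decode() for m in re.findall(rb"GLIBCXX_[0-9.]+", data)}

print(sorted(versions))  # the link errors above need GLIBCXX_3.4.21 / 3.4.22
```

If 3.4.21 or 3.4.22 is missing from the output, the linker is picking up a libstdc++ that is too old for the compiler that built libc10.so (e.g. an older copy shipped inside a conda environment).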