Ninja error when installing PyTorch

Hello. I’m trying to install PyTorch v1.4.0 from source and am experiencing the following error:

[4070/4411] Building CXX object test_api/CMakeFiles/test_api.dir/modules.cpp.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "setup.py", line 743, in <module>
    build_deps()
  File "setup.py", line 316, in build_deps
    cmake=cmake)
  File "/home/seokwon/pytorch/tools/build_pytorch_libs.py", line 62, in build_caffe2
    cmake.build(my_env)
  File "/home/seokwon/pytorch/tools/setup_helpers/cmake.py", line 339, in build
    self.run(build_args, my_env)
  File "/home/seokwon/pytorch/tools/setup_helpers/cmake.py", line 141, in run
    check_call(command, cwd=self.build_dir, env=env)
  File "/home/seokwon/anaconda3/envs/seanconda/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '20']' returned non-zero exit status 1.

I followed all of the steps specified in the GitHub README and made sure that the CUDA version was appropriate. Does anybody know how to solve this issue? I’ve tried looking elsewhere online but was unsuccessful. Thanks in advance!

The error message is not very helpful.
Could you check whether you are running out of memory?

Sorry, but how do I check if I’m running out of memory?

Also, on a side note, the server I’m using has two versions of CUDA installed: 9.0 and 10.1. In the /usr/local/ directory there are both cuda-10.1/ and cuda-9.0/, and the cuda file is pointing to /usr/local/cuda-9.0/ (I’m not sure if this is the correct terminology). I’m trying to use CUDA 10.1; could this be the cause of the problem? If I install PyTorch using conda install pytorch torchvision cudatoolkit=10.1 -c pytorch, it installs fine, but torch.cuda.is_available() returns False.
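For anyone unsure which toolkit /usr/local/cuda actually resolves to, here is a minimal Python sketch. The helper name resolve_cuda_link is made up for illustration. Printing CUDA_HOME assumes the source build reads that environment variable, which should be checked against the README for your PyTorch version.

```python
import os


def resolve_cuda_link(path="/usr/local/cuda"):
    """Return the real directory that `path` points to, or None if it is missing.

    `resolve_cuda_link` is a hypothetical helper, not part of PyTorch.
    """
    if not os.path.exists(path):
        return None
    # realpath follows symlinks, e.g. /usr/local/cuda -> /usr/local/cuda-9.0
    return os.path.realpath(path)


if __name__ == "__main__":
    print("/usr/local/cuda ->", resolve_cuda_link())
    # Assumption: the build honours CUDA_HOME when it is set.
    print("CUDA_HOME =", os.environ.get("CUDA_HOME"))
```

If the link resolves to cuda-9.0 while you intend to build against 10.1, that mismatch is worth ruling out before looking elsewhere.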

So I managed to solve my issue, but I haven’t been able to solve this error per se.

I wasn’t aware that the CUDA driver didn’t match the CUDA version. I assumed that my server administrator had taken care of it, but perhaps there was an error in the process. Regardless, installing PyTorch with the method specified on PyTorch’s homepage now works.

Good to hear you figured out the (binary?) installation issue.

However, that doesn’t solve the build from source failure.
Could you clean the build (python setup.py clean), update all submodules (git submodule update --init --recursive) and rebuild it?

While the build is running, you could use htop to observe the system RAM.
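If htop is not available, a rough alternative on Linux is to poll /proc/meminfo from a second terminal while the build runs. The following is only a sketch: the function names are made up for illustration and the 5-second interval is arbitrary.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in KiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def available_gib(info):
    """Return available memory in GiB, falling back to MemFree if needed."""
    kib = info.get("MemAvailable", info.get("MemFree", 0))
    return kib / (1024 ** 2)


if __name__ == "__main__":
    import time

    # Poll until interrupted with Ctrl+C.
    while True:
        with open("/proc/meminfo") as f:
            info = parse_meminfo(f.read())
        swap_gib = info.get("SwapFree", 0) / (1024 ** 2)
        print(f"available: {available_gib(info):.1f} GiB, swap free: {swap_gib:.1f} GiB")
        time.sleep(5)
```

If the available figure drops close to zero shortly before ninja stops, memory pressure from the parallel compile jobs is a plausible cause.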

I made a new virtual environment and did as you suggested, but it doesn’t seem like it’s a memory problem. There’s plenty of memory left even when the process is taking place.

Is it always throwing this error at the same step, and did the clean build not change anything?

It’s not always at the same step (e.g. this time it stopped at [3804/4411]), but the error message is the same.

Do you see any other errors before this step in the log?
Since the build uses multiple workers, the actual error might be further up in the log.
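One way to find the earlier error is to save the build output to a file and search it for the first failing lines. The sketch below assumes the log was captured to a file such as build.log (a hypothetical name). It also assumes that ninja's FAILED: prefix and compiler or linker error: markers are the lines of interest. The function name and the pattern are illustrative.

```python
import re
import sys

# Lines that usually mark the real failure in a ninja/CMake build log.
ERROR_PATTERN = re.compile(r"^FAILED: |\berror:|\bfatal error\b", re.IGNORECASE)


def find_error_lines(log_text, limit=10):
    """Return up to `limit` (line_number, line) pairs that look like errors."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.strip()))
            if len(hits) >= limit:
                break
    return hits


if __name__ == "__main__":
    # Usage: python find_errors.py build.log
    with open(sys.argv[1], errors="replace") as f:
        for number, line in find_error_lines(f.read()):
            print(f"{number}: {line}")
```

The first hit is usually the most informative one, since the final "ninja: build stopped" line only reports that some earlier job failed.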

I am facing the same issue when building PyTorch 1.4.0 from source.

Here is the build log: https://gist.github.com/YingleiZhang/fc98c94d493ac39506daacefa0298b38#file-gistfile1-txt-L11

I don’t have this problem when building 1.2.0. Any help is appreciated.

I got the same errors when installing PyTorch from source for versions 1.6 and 1.7 on Ubuntu 18.04. I installed the NVIDIA drivers, CUDA 10.2 and cuDNN 8.0.3 for 2080 Ti GPUs. I followed the instructions to install PyTorch from source and still got the ninja error from above, also at “random” points of the installation. I didn’t have these problems on my local machine with Ubuntu 20.04, the same GPU and the same versions. Any help is much appreciated.

The error in the previously linked log was likely caused by an old version of GLIBCXX:

../../lib/libc10.so: error: undefined reference to 'std::thread::_State::~_State()', version 'GLIBCXX_3.4.22'
../../lib/libc10.so: error: undefined reference to 'typeinfo for std::thread::_State', version 'GLIBCXX_3.4.22'
../../lib/libc10.so: error: undefined reference to 'std::runtime_error::runtime_error(char const*)', version 'GLIBCXX_3.4.21'

Could you check the install log for errors?
Note that the build might continue for some steps after the failure, so the actual error might appear earlier in the log.
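To check whether a given libstdc++ provides the GLIBCXX versions named in the linker errors above, you can scan the library file for its version strings. This is only a sketch: the helper names are made up for illustration. The path to libstdc++.so.6 differs between systems and conda environments, so it is passed on the command line rather than assumed.

```python
import re
import sys


def glibcxx_versions(data):
    """Return the sorted GLIBCXX version tuples found in raw library bytes."""
    found = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", data))
    return sorted(tuple(int(p) for p in v.decode().split(".")) for v in found)


def provides(data, required):
    """True if the version string `required` (e.g. "3.4.22") is present."""
    return tuple(int(p) for p in required.split(".")) in glibcxx_versions(data)


if __name__ == "__main__":
    # Usage: python check_glibcxx.py /path/to/libstdc++.so.6
    with open(sys.argv[1], "rb") as f:
        data = f.read()
    for needed in ("3.4.21", "3.4.22"):
        status = "found" if provides(data, needed) else "MISSING"
        print(f"GLIBCXX_{needed}: {status}")
```

If a version is reported as missing, the toolchain being picked up is probably older than the one the build expects. The library to check is whichever one the linker actually uses.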