Build aborts when trying to build from source: "ninja: build stopped: subcommand failed."

Hello,

I’m trying to install PyTorch from source, but the build aborts shortly after it starts with the message: “ninja: build stopped: subcommand failed.”

Here are the specs and versions I’m using:

hostnamectl:

  • Operating System: Fedora Linux 35 (KDE Plasma)
  • Kernel: Linux 5.16.11-200.fc35.x86_64
  • Architecture: x86-64

gcc --version

  • gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)

cmake -version

  • cmake version 3.22.1

nvidia-smi

  • NVIDIA-SMI 510.47.03
  • Driver Version: 510.47.03
  • CUDA Version: 11.6
  • NVIDIA GeForce GTX-1070-ti

nvcc --version

  • Cuda compilation tools, release 11.6, V11.6.112
  • Build cuda_11.6.r11.6/compiler.30978841_0

(The Version I downloaded)

  • CuDNN Version 8.3.2.44

python --version

  • Python 3.9.7

conda --version

  • conda 4.11.0
    The installation was done inside a conda environment.

The PyTorch version is the latest. It was cloned from git today, directly before the attempt to build from source. This is the output I got when running “python setup.py install”:

Building wheel torch-1.12.0a0+git1a8bd1a
-- Building version 1.12.0a0+git1a8bd1a
cmake --build . --target install --config Release
[1/2010] Building CXX object third_party/breakpad/CMakeF...kpad.dir/src/client/linux/handler/exception_handler.cc.o
FAILED: third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o
/usr/bin/c++ -DHAVE_A_OUT_H -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -I/home/ipbv/Programme/pytorch/cmake/../third_party/benchmark/include -I/home/ipbv/Programme/pytorch/cmake/../third_party/cudnn_frontend/include -I/home/ipbv/Programme/pytorch/third_party/onnx -I/home/ipbv/Programme/pytorch/build/third_party/onnx -I/home/ipbv/Programme/pytorch/third_party/foxi -I/home/ipbv/Programme/pytorch/build/third_party/foxi -I/home/ipbv/Programme/pytorch/third_party/breakpad/src -I/home/ipbv/Programme/pytorch/third_party/breakpad/src/third_party/linux/include -isystem /home/ipbv/Programme/pytorch/build/third_party/gloo -isystem /home/ipbv/Programme/pytorch/cmake/../third_party/gloo -isystem /home/ipbv/Programme/pytorch/cmake/../third_party/googletest/googlemock/include -isystem /home/ipbv/Programme/pytorch/cmake/../third_party/googletest/googletest/include -isystem /home/ipbv/Programme/pytorch/third_party/protobuf/src -isystem /home/ipbv/.conda/envs/boxinst-source/include -isystem /home/ipbv/Programme/pytorch/third_party/gemmlowp -isystem /home/ipbv/Programme/pytorch/third_party/neon2sse -isystem /home/ipbv/Programme/pytorch/third_party/XNNPACK/include -isystem /home/ipbv/Programme/pytorch/cmake/../third_party/eigen -isystem /usr/local/cuda/include -isystem /home/ipbv/Programme/pytorch/third_party/ideep/mkl-dnn/third_party/oneDNN/include -isystem /home/ipbv/Programme/pytorch/third_party/ideep/include -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -O3 -DNDEBUG -DNDEBUG -fPIC -DCAFFE2_USE_GLOO -DTH_HAVE_THREAD -std=gnu++14 -MD -MT third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o -MF third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o.d -o 
third_party/breakpad/CMakeFiles/breakpad.dir/src/client/linux/handler/exception_handler.cc.o -c /home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc
/home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc: In function ‘void google_breakpad::{anonymous}::InstallAlternateStackLocked()’:
/home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: error: no matching function for call to ‘max(int, long int)’
  141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
      |                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/11/bits/char_traits.h:39,
                 from /usr/include/c++/11/string:40,
                 from /home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.h:38,
                 from /home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:66:
/usr/include/c++/11/bits/stl_algobase.h:254:5: note: candidate: ‘template<class _Tp> constexpr const _Tp& std::max(const _Tp&, const _Tp&)’
  254 |     max(const _Tp& __a, const _Tp& __b)
      |     ^~~
/usr/include/c++/11/bits/stl_algobase.h:254:5: note:   template argument deduction/substitution failed:
/home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   deduced conflicting types for parameter ‘const _Tp’ (‘int’ and ‘long int’)
  141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
      |                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/11/bits/char_traits.h:39,
                 from /usr/include/c++/11/string:40,
                 from /home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.h:38,
                 from /home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:66:
/usr/include/c++/11/bits/stl_algobase.h:300:5: note: candidate: ‘template<class _Tp, class _Compare> constexpr const _Tp& std::max(const _Tp&, const _Tp&, _Compare)’
  300 |     max(const _Tp& __a, const _Tp& __b, _Compare __comp)
      |     ^~~
/usr/include/c++/11/bits/stl_algobase.h:300:5: note:   template argument deduction/substitution failed:
/home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   deduced conflicting types for parameter ‘const _Tp’ (‘int’ and ‘long int’)
  141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
      |                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/11/algorithm:62,
                 from /home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:85:
/usr/include/c++/11/bits/stl_algo.h:3461:5: note: candidate: ‘template<class _Tp> constexpr _Tp std::max(std::initializer_list<_Tp>)’
 3461 |     max(initializer_list<_Tp> __l)
      |     ^~~
/usr/include/c++/11/bits/stl_algo.h:3461:5: note:   template argument deduction/substitution failed:
/home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   mismatched types ‘std::initializer_list<_Tp>’ and ‘int’
  141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
      |                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/11/algorithm:62,
                 from /home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:85:
/usr/include/c++/11/bits/stl_algo.h:3467:5: note: candidate: ‘template<class _Tp, class _Compare> constexpr _Tp std::max(std::initializer_list<_Tp>, _Compare)’
 3467 |     max(initializer_list<_Tp> __l, _Compare __comp)
      |     ^~~
/usr/include/c++/11/bits/stl_algo.h:3467:5: note:   template argument deduction/substitution failed:
/home/ipbv/Programme/pytorch/third_party/breakpad/src/client/linux/handler/exception_handler.cc:141:49: note:   mismatched types ‘std::initializer_list<_Tp>’ and ‘int’
  141 |   static const unsigned kSigStackSize = std::max(16384, SIGSTKSZ);
      |                                         ~~~~~~~~^~~~~~~~~~~~~~~~~
[10/2010] Building CXX object c10/CMakeFiles/c10.dir/core/TensorImpl.cpp.o
ninja: build stopped: subcommand failed.

Previous attempts showed similar output, except for the first one, which got to step [4547/6599]. All following attempts stopped after about ten steps, and each further attempt reduces the number of steps left to do. Does anyone know what causes these aborts?

Thank you in advance!

It seems you are hitting the issue with GCC 11.2; you could try applying this workaround.
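
For context, the failing line in the log is `std::max(16384, SIGSTKSZ)`: the compiler reports that one argument is `int` and the other is `long int`, so it cannot deduce a single template type. Here is a minimal C++ sketch of that mismatch and one possible way around it, namely giving `std::max` an explicit type. This is not necessarily what the linked workaround does; `kFakeSigStkSz` and `pick_stack_size` are made-up names for illustration, and the value 8192 is only an assumption.

```cpp
#include <algorithm>

// Hypothetical stand-in for SIGSTKSZ. The only thing taken from the log is
// that the macro yields a `long int` rather than an `int`.
constexpr long kFakeSigStkSz = 8192;  // assumed value, for illustration only

// std::max(16384, kFakeSigStkSz) would fail to compile, because template
// argument deduction sees `int` and `long` and cannot settle on one type.
// Naming the type explicitly converts both arguments before they are compared.
constexpr unsigned pick_stack_size(long sigstksz) {
    return std::max<unsigned>(16384, sigstksz);
}
```

With this, `pick_stack_size(kFakeSigStkSz)` gives 16384, and any larger input is returned unchanged. The conversion to `unsigned` assumes the value is non-negative.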

Thank you for your quick response. I will have a look at this.