Cannot build PyTorch on either Ubuntu or CentOS

I have never been able to build it by following the instructions at: GitHub - pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
For Ubuntu 18.04, here are the last few lines of the build failure (the CentOS 8 build failure log follows further down):


Get:2 http://ppa.launchpad.net/deadsnakes/ppa/ubuntu bionic/main amd64 python3.9-distutils all 3.9.11-1+bionic1 [190 kB]
Fetched 315 kB in 2s (194 kB/s)              
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package python3.9-lib2to3.
(Reading database ... 14757 files and directories currently installed.)
Preparing to unpack .../python3.9-lib2to3_3.9.11-1+bionic1_all.deb ...
Unpacking python3.9-lib2to3 (3.9.11-1+bionic1) ...
Selecting previously unselected package python3.9-distutils.
Preparing to unpack .../python3.9-distutils_3.9.11-1+bionic1_all.deb ...
Unpacking python3.9-distutils (3.9.11-1+bionic1) ...
Setting up python3.9-lib2to3 (3.9.11-1+bionic1) ...
Setting up python3.9-distutils (3.9.11-1+bionic1) ...
root@ixt-hq-178:/pytorch# python3 setup.py install
Traceback (most recent call last):
  File "/pytorch/setup.py", line 219, in <module>
    from setuptools import setup, Extension, find_packages
  File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 14, in <module>
    from setuptools.dist import Distribution, Feature
  File "/usr/lib/python3/dist-packages/setuptools/dist.py", line 24, in <module>
    from setuptools.depends import Require
  File "/usr/lib/python3/dist-packages/setuptools/depends.py", line 7, in <module>
    from .py33compat import Bytecode
  File "/usr/lib/python3/dist-packages/setuptools/py33compat.py", line 54, in <module>
    unescape = getattr(html, 'unescape', html_parser.HTMLParser().unescape)
AttributeError: 'HTMLParser' object has no attribute 'unescape'
root@ixt-hq-178:/pytorch# sudo apt install python3-distutils
sudo: unable to resolve host ixt-hq-178
Reading package lists... Done
Building dependency tree       
Reading state information... Done
python3-distutils is already the newest version (3.6.9-1~18.04).
python3-distutils set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
root@ixt-hq-178:/pytorch#

For CentOS 8.x:


[root@centos pytorch]# !123
python setup.py build --cmake-only
Traceback (most recent call last):
  File "setup.py", line 219, in <module>
    from setuptools import setup, Extension, find_packages
  File "/usr/local/lib/python3.8/site-packages/setuptools/__init__.py", line 18, in <module>
    from setuptools.dist import Distribution
  File "/usr/local/lib/python3.8/site-packages/setuptools/dist.py", line 34, in <module>
    from setuptools import windows_support
  File "/usr/local/lib/python3.8/site-packages/setuptools/windows_support.py", line 2, in <module>
    import ctypes
  File "/usr/local/lib/python3.8/ctypes/__init__.py", line 7, in <module>
    from _ctypes import Union, Structure, Array
ModuleNotFoundError: No module named '_ctypes'

In both cases it seems you are running into issues caused by older Python packages.
E.g., searching for the first error message yields this result.
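To narrow this down, it can help to ask the exact interpreter used for the build which modules it can import and where they load from. The script below is only a rough sketch: `probe` is a hypothetical helper name, and the module list is taken from the three errors in this thread. For background, `HTMLParser.unescape` was removed in Python 3.9, so an old setuptools under `/usr/lib/python3/dist-packages` fails on import there, and a missing `_ctypes` usually means Python was compiled without the libffi development headers.

```python
import importlib
import sys


def probe(module_name):
    """Try to import a module; return (ok, detail).

    detail is the file the module was loaded from, or the error text.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # old setuptools raises AttributeError, not ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__file__", None) or "<built-in>"


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:    ", sys.version.split()[0])
    # Modules named in the tracebacks posted in this thread.
    for name in ("setuptools", "_ctypes", "yaml"):
        ok, detail = probe(name)
        print(f"{name:12s} {'ok' if ok else 'FAILED'}  {detail}")
```

Run it with the same command used for the build (for example `python3` on Ubuntu, `python` on CentOS) so that the result describes the interpreter that actually executes `setup.py`.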

No, it did not help. I already tried it and it still fails.


/root/pt/Python-3.9.10/pytorch/torch/nn/functional.pyi.in -> /root/pt/Python-3.9.10/pytorch/torch/nn/functional.pyi.in skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/timeit_template.cpp -> /root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/timeit_template.cpp skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp -> /root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp -> /root/pt/Python-3.9.10/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp skipped
/root/pt/Python-3.9.10/pytorch/torch/utils/data/datapipes/datapipe.pyi.in -> /root/pt/Python-3.9.10/pytorch/torch/utils/data/datapipes/datapipe.pyi.in skipped
Traceback (most recent call last):
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 424, in check_pydep
Building wheel torch-1.12.0a0+git23383b1
-- Building version 1.12.0a0+git23383b1
    importlib.import_module(importname)
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'yaml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 906, in <module>
    build_deps()
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 370, in build_deps
    check_pydep('yaml', 'pyyaml')
  File "/root/pt/Python-3.9.10/pytorch/setup.py", line 426, in check_pydep
    raise RuntimeError(missing_pydep.format(importname=importname, module=module))
RuntimeError: Missing build dependency: Unable to `import yaml`.
Please install it via `conda install pyyaml` or `pip install pyyaml`

The current issue points to:

ModuleNotFoundError: No module named 'yaml'
...
Please install it via `conda install pyyaml` or `pip install pyyaml`

So, did you try to install the missing package?

It is already installed. That is why I am asking here why the build requests a package that is already installed.

[root@slurm-0 /]# pip install pyyaml
Collecting pyyaml
Downloading PyYAML-6.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (661 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 661.8/661.8 KB 9.8 MB/s eta 0:00:00
Installing collected packages: pyyaml
Successfully installed pyyaml-6.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[root@slurm-0 /]# pip3 install pyyaml
Requirement already satisfied: pyyaml in /usr/local/lib/python3.9/site-packages (6.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[root@slurm-0 /]#
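One common reason a package looks installed while `import` still fails is that `pip` and `python` resolve to different interpreters on `PATH`. The following is a minimal sketch for checking that, assuming the commands of interest are `python` and `python3`; `interpreter_info` and `PROBE` are illustrative names, not part of PyTorch or pip.

```python
import shutil
import subprocess

# Small program run inside each interpreter: print its identity and
# whether PyYAML can be imported there.
PROBE = (
    "import sys\n"
    "print(sys.executable)\n"
    "print(sys.version.split()[0])\n"
    "try:\n"
    "    import yaml\n"
    "    print(yaml.__file__)\n"
    "except ImportError as exc:\n"
    "    print('missing: ' + str(exc))\n"
)


def interpreter_info(command):
    """Describe the interpreter `command` resolves to, or None if it is not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    out = subprocess.run([path, "-c", PROBE], capture_output=True, text=True, check=True)
    executable, version, yaml_status = out.stdout.splitlines()[:3]
    return {"command": path, "executable": executable,
            "version": version, "yaml": yaml_status}


if __name__ == "__main__":
    for command in ("python", "python3"):
        print(command, "->", interpreter_info(command))
```

If the two commands report different versions or different `yaml` results, installing with `python -m pip install pyyaml`, using the same `python` that runs `setup.py`, ties the package to the right interpreter.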

Actually, this time the build somehow moves forward. Now I am getting the following:

(last lines of build)

/root/pt/pytorch/torch/csrc/utils/variadic.cpp -> /root/pt/pytorch/torch/csrc/utils/variadic.cpp skipped
/root/pt/pytorch/torch/csrc/utils/variadic.h -> /root/pt/pytorch/torch/csrc/utils/variadic.h skipped
/root/pt/pytorch/torch/lib/libshm/alloc_info.h -> /root/pt/pytorch/torch/lib/libshm/alloc_info.h skipped
/root/pt/pytorch/torch/lib/libshm/core.cpp -> /root/pt/pytorch/torch/lib/libshm/core.cpp skipped
/root/pt/pytorch/torch/lib/libshm/err.h -> /root/pt/pytorch/torch/lib/libshm/err.h skipped
/root/pt/pytorch/torch/lib/libshm/libshm.h -> /root/pt/pytorch/torch/lib/libshm/libshm.h skipped
/root/pt/pytorch/torch/lib/libshm/manager.cpp -> /root/pt/pytorch/torch/lib/libshm/manager.cpp skipped
/root/pt/pytorch/torch/lib/libshm/socket.h -> /root/pt/pytorch/torch/lib/libshm/socket.h skipped
/root/pt/pytorch/torch/lib/libshm_windows/core.cpp -> /root/pt/pytorch/torch/lib/libshm_windows/core.cpp skipped
/root/pt/pytorch/torch/lib/libshm_windows/libshm.h -> /root/pt/pytorch/torch/lib/libshm_windows/libshm.h skipped
/root/pt/pytorch/torch/nn/functional.pyi.in -> /root/pt/pytorch/torch/nn/functional.pyi.in skipped
/root/pt/pytorch/torch/utils/benchmark/utils/timeit_template.cpp -> /root/pt/pytorch/torch/utils/benchmark/utils/timeit_template.cpp skipped
/root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp -> /root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/compat_bindings.cpp skipped
/root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp -> /root/pt/pytorch/torch/utils/benchmark/utils/valgrind_wrapper/timer_callgrind_template.cpp skipped
/root/pt/pytorch/torch/utils/data/datapipes/datapipe.pyi.in -> /root/pt/pytorch/torch/utils/data/datapipes/datapipe.pyi.in skipped
gmake: Makefile: No such file or directory
gmake: *** No rule to make target 'Makefile'. Stop.
Building wheel torch-1.12.0a0+gitebeea9e
-- Building version 1.12.0a0+gitebeea9e
cmake3 --build . --target install --config Release -- -j 56

This happens when building for an AMD GPU with:
PYTORCH_ROCM_ARCH=gfx90? python setup.py install

Sorry, I’m not deeply familiar with AMD builds, so let’s wait for an AMD expert :)

On a platform with an NVIDIA GPU, it also fails spectacularly, with the following error:

root@nonroot-MS-7B22:/build-scripts/pt/pytorch# python setup.py install  | tee build-pt.log
CMake Error: File /usr/local/bin/cmake/Modules/CMakeSystem.cmake.in does not exist.
CMake Error at /usr/local/share/cmake-3.16/Modules/CMakeDetermineSystem.cmake:185 (configure_file):
  configure_file Problem configuring file
Call Stack (most recent call first):
  CMakeLists.txt:27 (project)


CMake Error at /usr/local/share/cmake-3.16/Modules/CMakeDetermineCXXCompiler.cmake:23 (include):
  include could not find load file:

    /usr/local/bin/cmake/Modules/CMakeDetermineCompiler.cmake
Call Stack (most recent call first):
  CMakeLists.txt:27 (project)


CMake Error at /usr/local/share/cmake-3.16/Modules/CMakeDetermineCXXCompiler.cmake:64 (_cmake_find_compiler):
  Unknown CMake command "_cmake_find_compiler".
Call Stack (most recent call first):
  CMakeLists.txt:27 (project)


CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
See also "/build-scripts/pt/pytorch/build/CMakeFiles/CMakeOutput.log".
Building wheel torch-1.12.0a0+git5375b2e
-- Building version 1.12.0a0+git5375b2e
cmake -GNinja -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/build-scripts/pt/pytorch/torch -DCMAKE_PREFIX_PATH=/usr/local/lib/python3.9/site-packages;/anaconda3 -DCMAKE_ROOT=/usr/local/bin/cmake -DPYTHON_EXECUTABLE=/usr/bin/python -DPYTHON_INCLUDE_DIR=/usr/local/include/python3.9 -DPYTHON_LIBRARY=/usr/local/lib/libpython3.9.a -DTORCH_BUILD_VERSION=1.12.0a0+git5375b2e -DUSE_NUMPY=True /build-scripts/pt/pytorch
root@nonroot-MS-7B22:/build-scripts/pt/pytorch#

I don’t know what’s causing this, but CMake ships with this file in this location, so it seems your CMake setup might be broken?
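One thing that stands out in the log is `-DCMAKE_ROOT=/usr/local/bin/cmake` on the generated command line, while the modules that actually load come from `/usr/local/share/cmake-3.16/Modules`. A possible explanation, stated here as an assumption and not confirmed in this thread, is that a `CMAKE_ROOT` environment variable is set in the shell and gets forwarded to CMake by the build script. A quick sketch to check for that (both function names are hypothetical):

```python
import os


def cmake_env_vars(environ):
    """Return the CMAKE_* variables found in an environment mapping."""
    return {k: v for k, v in environ.items() if k.startswith("CMAKE_")}


def cmake_root_looks_valid(path):
    """A usable CMake root contains a Modules directory (see the paths in the log)."""
    return os.path.isdir(os.path.join(path, "Modules"))


if __name__ == "__main__":
    for name, value in sorted(cmake_env_vars(os.environ).items()):
        print(f"{name}={value}")
    root = os.environ.get("CMAKE_ROOT")
    if root is not None and not cmake_root_looks_valid(root):
        print("CMAKE_ROOT has no Modules directory:", root)
```

If `CMAKE_ROOT` shows up pointing at the `cmake` binary, unsetting it and deleting the stale `build` directory before rerunning `python setup.py install` would be a cheap experiment.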

Nah, other builds that use CMake work just fine, so based on the data points I have, I am going to point the finger at PyTorch itself being broken rather than CMake. After all, I followed exactly what the PyTorch build instructions say, and those instructions seem incomplete at best.