Hi, I just got my hands on an H100. Could someone guide me on how to correctly build PyTorch from source?
Have you taken a look at the build from source instructions: pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration (github.com)?
Note that if you are building from source only to get H100/sm90 support, this is not needed, as any PyTorch built with CUDA 11.8 or higher already has it (e.g., the current stable release, which can be installed via
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118).
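To confirm that an installed build covers your GPU, you can compare the device's compute capability with the architectures the binary was compiled for. Here is a minimal sketch of that check. The helper names `arch_supported` and `check_installed_build` are hypothetical, and the exact-match test is a simplification of PyTorch's own compatibility logic; only `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are real PyTorch calls.

```python
def arch_supported(capability, arch_list):
    """Return True if arch_list contains an sm_XY entry matching capability.

    capability is a (major, minor) tuple such as (9, 0) for H100.
    arch_list is a list of strings such as ["sm_80", "sm_90"].
    This is a simplified exact-match check, not PyTorch's full logic.
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def check_installed_build(device_index=0):
    """Query the installed PyTorch build; needs a CUDA build and a visible GPU."""
    import torch  # imported here so arch_supported works without PyTorch

    capability = torch.cuda.get_device_capability(device_index)
    arch_list = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("architectures in this build:", arch_list)
    return arch_supported(capability, arch_list)
```

On a machine with the GPU attached, calling `check_installed_build()` prints both values and returns whether the simplified check passes.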
I found this post: Installing CUDA 12.1.1 + PyTorch nightly + Python 3.10 on Ubuntu 22.10 · GitHub, which is very helpful. However, it shows:
NVIDIA H100 PCIe with CUDA capability sm_90 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_89.
If you want to use the NVIDIA H100 PCIe GPU with PyTorch, please check the instructions at Start Locally | PyTorch
Right, you are getting that error message because that guide is specific to sm89/Ada, as there is
TORCH_CUDA_ARCH_LIST=8.9 in the build command. If you change that to
TORCH_CUDA_ARCH_LIST=9.0, it will work with sm90/H100. The Start Locally instructions are linked (they give the same install command I posted) because the prebuilt binaries are compiled with support for a wide range of compute capabilities, including both sm89 and sm90.
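To make the link between the build variable and the error message concrete, here is a small sketch that maps a TORCH_CUDA_ARCH_LIST value to the sm_XY names that show up in the warning. The helpers `arch_list_to_sm` and `build_env` are hypothetical names for illustration, and the parser only handles numeric entries (optionally with a +PTX suffix), not named architectures.

```python
import os


def arch_list_to_sm(value):
    """Convert a value like "8.0;9.0+PTX" into ["sm_80", "sm_90"].

    Entries may be separated by semicolons or spaces; a "+PTX" suffix
    is dropped. Only numeric "major.minor" entries are handled.
    """
    names = []
    for entry in value.replace(";", " ").split():
        version = entry.split("+")[0]  # strip an optional "+PTX" suffix
        major, minor = version.split(".")
        names.append(f"sm_{major}{minor}")
    return names


def build_env(arch="9.0"):
    """Return a copy of the current environment with the arch list set."""
    env = dict(os.environ)
    env["TORCH_CUDA_ARCH_LIST"] = arch
    return env
```

With the guide's setting, `arch_list_to_sm("8.9")` gives only `sm_89`, which is why the H100 (sm_90) is reported as unsupported. The dictionary from `build_env()` could be passed as the `env` argument of `subprocess.run` when launching the build command from the guide.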
Awesome, thanks a lot!