Building PyTorch from source with the Docker image doesn't include MPI

Hey everyone,

I'm trying to build PyTorch from source in a Docker image so that I can include MPI, as described in the documentation.

However, the resulting build doesn't seem to have MPI enabled.

I used these commands to build:

git clone https://github.com/pytorch/pytorch
cd pytorch
sudo make -f docker.Makefile CMAKE_VARS="USE_MPI=1 MAX_JOBS=15"

But when I test the resulting image, I get this:

root@88ef37db4d2a:/workspace# python
Python 3.11.11 | packaged by conda-forge | (main, Dec 5 2024, 14:17:24) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print("MPI support:", torch.distributed.is_mpi_available())
MPI support: False

And printing the PyTorch build configuration:

root@88ef37db4d2a:/workspace# python
Python 3.11.11 | packaged by conda-forge | (main, Dec 5 2024, 14:17:24) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__config__.show()
'PyTorch built with:\n - GCC 11.4\n - C++ Version: 201703\n - Intel(R) MKL-DNN v3.5.3 (Git Hash 66f0cb9eb66affd2da3bf5f8d897376f04aae6af)\n - OpenMP 201511 (a.k.a. OpenMP 4.5)\n - NNPACK is enabled\n - CPU capability usage: AVX512\n - Build settings: BUILD_TYPE=Release, COMMIT_SHA=eeb57394f93d720bca498c3fa9d167fc7b9cca46, CUDA_VERSION=12.1, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=1 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.7.0, USE_CUDA=ON, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=OFF, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF, \n'

The output shows USE_MPI=OFF, so the flag I passed did not take effect.
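
For anyone who wants to check a single flag without scanning that whole string by eye, the build-settings line can be split into a dictionary. This is only a minimal sketch: `parse_build_settings` is a hypothetical helper name (not part of PyTorch), and it assumes the "Build settings:" line keeps the comma-separated KEY=VALUE layout shown in the output above. The sample text is a shortened excerpt of that output.

```python
def parse_build_settings(config_text: str) -> dict:
    """Return the KEY=VALUE pairs found on the 'Build settings:' line.

    Assumes the layout printed by torch.__config__.show(): one line that
    contains 'Build settings:' followed by comma-separated KEY=VALUE items.
    """
    marker = "Build settings:"
    for line in config_text.splitlines():
        if marker in line:
            settings_text = line.split(marker, 1)[1]
            break
    else:
        # No build-settings line found at all.
        return {}

    settings = {}
    for item in settings_text.split(","):
        item = item.strip()
        if "=" in item:
            # Split on the first '=' only, so values such as CXX_FLAGS
            # that contain '=' themselves stay intact.
            key, value = item.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


# Shortened excerpt of the output quoted in this post.
SAMPLE = (
    "PyTorch built with:\n"
    "  - GCC 11.4\n"
    "  - Build settings: BUILD_TYPE=Release, USE_GLOO=ON, "
    "USE_MPI=OFF, USE_NCCL=ON, \n"
)

print("USE_MPI =", parse_build_settings(SAMPLE).get("USE_MPI"))
```

Inside the container, the same helper could be called as parse_build_settings(torch.__config__.show()) and compared against torch.distributed.is_mpi_available().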

Why does this happen, and how do I solve it? Any help would be much appreciated.

You can find my image at
https://hub.docker.com/repository/docker/kvmhowest/pytorch/general

Update:

I got it working by manually editing the Dockerfile so that openmpi is installed into the conda environment, like this:

# Manually invoke bash on miniconda script per https://github.com/conda/conda/issues/10431
RUN chmod +x ~/miniconda.sh && \
    bash ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda install -y python=${PYTHON_VERSION} openmpi cmake conda-build pyyaml numpy ipython && \
    /opt/conda/bin/python -mpip install -r requirements.txt && \
    /opt/conda/bin/conda clean -ya
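
Since the fix was to make an MPI implementation available at build time, it may be worth checking for the MPI tools before starting the long compile. The following is a rough sketch, not something the PyTorch build runs itself: the helper names are hypothetical, and the tool list (mpicc, mpicxx, mpirun) is my assumption about what the openmpi package puts on PATH.

```python
import shutil


def find_mpi_tools(tools=("mpicc", "mpicxx", "mpirun"), path=None):
    """Map each tool name to its resolved location, or None if not found.

    `path` is passed through to shutil.which; None means "use PATH".
    """
    return {tool: shutil.which(tool, path=path) for tool in tools}


def missing_mpi_tools(found):
    """Return the sorted names of the tools that could not be located."""
    return sorted(name for name, location in found.items() if location is None)


found = find_mpi_tools()
for name, location in found.items():
    print(f"{name}: {location or 'not found'}")

missing = missing_mpi_tools(found)
if missing:
    print("MPI tools missing from PATH:", ", ".join(missing))
```

Running this inside the build stage (for example with /opt/conda/bin/python) before compiling should show whether the openmpi install is visible; if tools are reported missing, a build with USE_MPI=1 will most likely end up with USE_MPI=OFF again.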