Can't build PyTorch using the Dockerfile from the repo

I cloned the PyTorch repo and ran DOCKER_BUILDKIT=1 docker build -t pytorchtest .

Sadly, there were errors:

#26 31.35 See also "/opt/pytorch/build/CMakeFiles/CMakeOutput.log".
#26 31.35 See also "/opt/pytorch/build/CMakeFiles/CMakeError.log".
#26 31.40 Building wheel torch-1.13.0a0+gitd83ca9e
#26 31.40 -- Building version 1.13.0a0+gitd83ca9e
#26 31.40 cmake -DBUILD_PYTHON=True -DBUILD_TEST=True -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/opt/pytorch/torch -DCMAKE_PREFIX_PATH=/opt/conda/lib/python3.8/site-packages;/opt/conda/bin/../ -DNUMPY_INCLUDE_DIR=/opt/conda/lib/python3.8/site-packages/numpy/core/include -DPYTHON_EXECUTABLE=/opt/conda/bin/python -DPYTHON_INCLUDE_DIR=/opt/conda/include/python3.8 -DPYTHON_LIBRARY=/opt/conda/lib/libpython3.8.so.1.0 -DTORCH_BUILD_VERSION=1.13.0a0+gitd83ca9e -DUSE_NUMPY=True /opt/pytorch
------
executor failed running [/bin/sh -c TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0+PTX 8.0" TORCH_NVCC_FLAGS="-Xfatbin -compress-all"     CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"     python setup.py install]: exit code: 1

Am I doing something wrong? I’d like to build it like that in order to have a proper dev container for making changes to .cpp files and rebuilding, etc. I’d like to avoid installing the dependencies directly on the host.

Could you check the install log and post the actual error message, please?

I assume this log is written to the filesystem of the temporary container used to build the image, so it is already gone by the time the build exits and I see the error, right?

Hi, I’ve made some progress, but now I get the following error:

#20 30.24 Change Dir: /opt/pytorch/build/CMakeFiles/CMakeTmp
#20 30.24
#20 30.24 Run Build Command(s):/usr/bin/make -f Makefile cmTC_26ede/fast && /usr/bin/make  -f CMakeFiles/cmTC_26ede.dir/build.make CMakeFiles/cmTC_26ede.dir/build
#20 30.24 make[1]: Entering directory '/opt/pytorch/build/CMakeFiles/CMakeTmp'
#20 30.24 Building CXX object CMakeFiles/cmTC_26ede.dir/src.cxx.o
#20 30.24 /usr/bin/c++ -DHAS_WERROR_CAST_FUNCTION_TYPE  -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format  -fPIE   -Werror=cast-function-type -o CMakeFiles/cmTC_26ede.dir/src.cxx.o -c /opt/pytorch/build/CMakeFiles/CMakeTmp/src.cxx
#20 30.24 cc1plus: error: -Werror=cast-function-type: no option -Wcast-function-type
#20 30.24 CMakeFiles/cmTC_26ede.dir/build.make:77: recipe for target 'CMakeFiles/cmTC_26ede.dir/src.cxx.o' failed
#20 30.24 make[1]: *** [CMakeFiles/cmTC_26ede.dir/src.cxx.o] Error 1
#20 30.24 make[1]: Leaving directory '/opt/pytorch/build/CMakeFiles/CMakeTmp'
#20 30.24 Makefile:127: recipe for target 'cmTC_26ede/fast' failed
#20 30.24 make: *** [cmTC_26ede/fast] Error 2
#20 30.24
#20 30.24
#20 30.24 Source file was:
#20 30.24 int main() { return 0; }
#20 DONE 30.4s

#21 exporting to image
#21 sha256:e8c613e07b0b7ff33893b694f7759a10d42e180f2b4dc349fb57dc6b71dcab00
#21 exporting layers
#21 exporting layers 37.2s done
#21 writing image sha256:e83f8332b241e435dfee227578206fe3c75fe92ee1eaddaaffe183b7c0762b1d done
#21 naming to docker.io/library/pytorchchanged done
#21 DONE 37.2

I’ve commented out the last part of the Dockerfile. Here’s what I’m using:

...
ARG BASE_IMAGE=ubuntu:18.04
ARG PYTHON_VERSION=3.8

FROM ${BASE_IMAGE} as dev-base
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        ccache \
        # cmake \
        curl \
        git \
        libjpeg-dev \
        libpng-dev && \
    rm -rf /var/lib/apt/lists/*
RUN /usr/sbin/update-ccache-symlinks
RUN mkdir /opt/ccache && ccache --set-config=cache_dir=/opt/ccache
ENV PATH /opt/conda/bin:$PATH

FROM dev-base as conda
ARG PYTHON_VERSION=3.8
# Automatically set by buildx
ARG TARGETPLATFORM
# translating Docker's TARGETPLATFORM into miniconda arches
RUN MINICONDA_ARCH=x86_64 && \
    curl -fsSL -v -o ~/miniconda.sh -O  "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-${MINICONDA_ARCH}.sh"
COPY requirements.txt .
RUN chmod +x ~/miniconda.sh && \
    ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda install -y python=${PYTHON_VERSION} cmake conda-build pyyaml numpy ipython && \
    /opt/conda/bin/python -mpip install -r requirements.txt && \
    /opt/conda/bin/conda clean -ya

FROM dev-base as submodule-update
WORKDIR /opt/pytorch
COPY . .
RUN git submodule update --init --recursive --jobs 0

FROM conda as build
WORKDIR /opt/pytorch
COPY --from=conda /opt/conda /opt/conda
COPY --from=submodule-update /opt/pytorch /opt/pytorch
RUN --mount=type=cache,target=/opt/ccache \
    TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0+PTX 8.0" TORCH_NVCC_FLAGS="-Xfatbin -compress-all" \
    CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" \
    python setup.py install || cat /opt/pytorch/build/CMakeFiles/CMakeError.log
cc1plus: error: -Werror=cast-function-type: no option -Wcast-function-type

Based on this thread, your GCC might be too old.
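
For context, -Wcast-function-type was added in GCC 8, while ubuntu:18.04 installs GCC 7 by default, as far as I know. Also note that the failing compile above uses a trivial source file (int main() { return 0; }), which suggests it is a CMake feature probe, so it may not be the fatal error at all. A minimal sketch for checking the compiler inside the container, assuming c++ is on the PATH and accepts -dumpversion as GCC does (the helper names are hypothetical, not part of PyTorch):

```python
import re
import subprocess

# -Wcast-function-type first appeared in GCC 8.
MIN_GCC_MAJOR = 8


def gcc_major(version_text):
    """Extract the major version from text such as '7.5.0' or '9'."""
    match = re.match(r"\s*(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_text!r}")
    return int(match.group(1))


def supports_cast_function_type(version_text):
    """True if a GCC with this version string knows -Wcast-function-type."""
    return gcc_major(version_text) >= MIN_GCC_MAJOR


def compiler_version(compiler="c++"):
    """Ask the compiler for its version (assumes a GCC-style -dumpversion)."""
    result = subprocess.run(
        [compiler, "-dumpversion"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Running supports_cast_function_type(compiler_version()) inside the dev container should tell you whether the probe is expected to fail with the installed compiler.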

I guess this Dockerfile is used to build (some) Docker images, but I don’t know whether it needs an update or whether env vars need to be overridden. @seemethere might know who is contributing to this file.

It seems the final issue was with the COPY --from=submodule-update /opt/pytorch /opt/pytorch instruction: some .bzl files were not getting copied. More precisely, they were never added to the Docker build context because of a .dockerignore file. I’ve added the following line to the end of the .dockerignore and now it works:

!*.bzl

As far as I understand, this is a bug. These files are committed to the repo, so they should get copied.
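
To illustrate why appending that line helps: in a .dockerignore, the last pattern that matches a file decides whether it is excluded, and a leading ! marks an exception. Below is a simplified sketch of that rule for root-level file names only. It uses Python's fnmatch rather than Docker's real matcher, and the file name and the first pattern are hypothetical, since I have not checked which existing rule actually caught the .bzl files.

```python
from fnmatch import fnmatchcase


def is_excluded(filename, patterns):
    """Return True if a root-level file would be left out of the build context.

    Simplified model: patterns are tested in order and the last one that
    matches wins; a leading '!' turns the pattern into an exception.
    """
    excluded = False
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if fnmatchcase(filename, pattern):
            excluded = not negated
    return excluded


# Hypothetical ignore rule that happens to catch a committed .bzl file.
before = ["example_*"]
# The same rules with the exception appended at the end.
after = before + ["!*.bzl"]

print(is_excluded("example_rules.bzl", before))  # True
print(is_excluded("example_rules.bzl", after))   # False
```

Because order matters, the exception has to come after the rule that excludes the files, which is why adding it at the end of the file works.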
