[Solved] Conda install of PyTorch 0.2.0 from the soumith channel not working with Docker

OS: Ubuntu 16.04
Package manager: conda
Python: 2.7.13, 3.5, and 3.6
CUDA: 8.0

I’m trying to build a Docker image that installs the latest version of PyTorch with conda, using conda install pytorch torchvision cuda80 -c soumith. I’ve built the image three separate times, once with each of the Python versions noted above.

This keeps resulting in the same error:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 228, in _error_catcher
    yield
  File "/opt/conda/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 310, in read
    data = self._fp.read(amt)
  File "/opt/conda/lib/python3.5/http/client.py", line 448, in read
    n = self.readinto(b)
  File "/opt/conda/lib/python3.5/http/client.py", line 488, in readinto
    n = self.fp.readinto(b)
  File "/opt/conda/lib/python3.5/socket.py", line 575, in readinto
    return self._sock.recv_into(b)
  File "/opt/conda/lib/python3.5/ssl.py", line 929, in recv_into
    return self.read(nbytes, buffer)
  File "/opt/conda/lib/python3.5/ssl.py", line 791, in read
    return self._sslobj.read(len, buffer)
  File "/opt/conda/lib/python3.5/ssl.py", line 575, in read
    v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/pytorch_conda_env/bin/conda", line 6, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.5/site-packages/conda/cli/main.py", line 120, in main
    exit_code = args_func(args, p)
  File "/opt/conda/lib/python3.5/site-packages/conda/cli/main.py", line 130, in args_func
    exit_code = args.func(args, p)
  File "/opt/conda/lib/python3.5/site-packages/conda/cli/main_install.py", line 78, in execute
    install(args, parser, 'install')
  File "/opt/conda/lib/python3.5/site-packages/conda/cli/install.py", line 407, in install
    execute_actions(actions, index, verbose=not args.quiet)
  File "/opt/conda/lib/python3.5/site-packages/conda/plan.py", line 599, in execute_actions
    inst.execute_instructions(plan, index, verbose)
  File "/opt/conda/lib/python3.5/site-packages/conda/instructions.py", line 135, in execute_instructions
    cmd(state, arg)
  File "/opt/conda/lib/python3.5/site-packages/conda/instructions.py", line 48, in FETCH_CMD
    fetch_pkg(state['index'][arg + '.tar.bz2'])
  File "/opt/conda/lib/python3.5/site-packages/conda/fetch.py", line 338, in fetch_pkg
    download(url, path, session=session, md5=info['md5'], urlstxt=True)
  File "/opt/conda/lib/python3.5/site-packages/conda/fetch.py", line 423, in download
    chunk = resp.raw.read(2**14)
  File "/opt/conda/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 320, in read
    flush_decoder = True
  File "/opt/conda/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/opt/conda/lib/python3.5/site-packages/requests/packages/urllib3/response.py", line 246, in _error_catcher
    raise ProtocolError('Connection broken: %r' % e, e)
requests.packages.urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))

I don’t have any problems installing PyTorch 0.1.12, though, using conda install pytorch torchvision in a Python 2.7.13 conda environment.

Dockerfile:

from nvidia/cuda:8.0-cudnn5-devel
from continuumio/miniconda3

# Create conda env
#RUN conda create -n pytorch_conda_env python=2.7.13
#RUN conda create -n pytorch_conda_env python=3.6
RUN conda create -n pytorch_conda_env python=3.5
RUN /bin/bash -c "source activate pytorch_conda_env \
    && conda install jupyter \
    && conda install tqdm \
    && conda install matplotlib \
    && conda install pytorch torchvision cuda80 -c soumith"
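
The traceback above ends in a ConnectionResetError while conda is downloading a package, not in a dependency or solver error, so a transient network failure is one plausible cause and retrying the install is a cheap thing to try. The sketch below is only an illustration under that assumption: run_with_retries is a hypothetical helper (not part of conda), and the attempt count and delay are arbitrary choices.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, attempts=3, delay_seconds=5.0):
    """Run cmd, retrying while it exits non-zero; return the last exit code."""
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.call(cmd)
        if returncode == 0:
            return 0
        print("attempt %d/%d failed with exit code %d"
              % (attempt, attempts, returncode), file=sys.stderr)
        if attempt < attempts:
            # Linear backoff: wait a little longer after each failure.
            time.sleep(delay_seconds * attempt)
    return returncode


# Example call (hypothetical usage; "-y" is added here so conda does not
# wait for confirmation, it is not in the original command):
#   run_with_retries(["conda", "install", "-y", "pytorch", "torchvision",
#                     "cuda80", "-c", "soumith"])
```

If every attempt fails in the same way, the cause is probably not transient, and checking the network path is the more useful next step.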

Instead of the cuDNN 5 image, try the cuDNN 7 image as your base:
8.0-cudnn7-runtime-ubuntu16.04

Also make sure your DNS and firewall are not misbehaving. I had similar errors when using secure / open DNS servers.
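
One quick way to check the DNS side is to resolve the download hosts from inside the container. This is a rough sketch, not a full diagnosis: resolve is a hypothetical helper, and conda.anaconda.org is my assumption for the host that serves the soumith channel (download.pytorch.org is the host used by the pip wheel URL in this thread).

```python
import socket


def resolve(hostname, port=443):
    """Return the sorted unique IP addresses for hostname, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        print("could not resolve %s: %s" % (hostname, exc))
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


# Example (hypothetical usage):
#   for host in ("conda.anaconda.org", "download.pytorch.org"):
#       print(host, resolve(host))
```

An empty list points at DNS. If the names resolve but downloads are still reset, a firewall or proxy between the container and the host is the more likely suspect.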


Thanks for the suggestion. I incorporated it, but still couldn’t get it working with conda. I switched to pip instead, and now things are working well.

from nvidia/cuda:8.0-cudnn7-runtime-ubuntu16.04

# from digital genius
# https://github.com/DigitalGenius/docker-pytorch/blob/master/Dockerfile

# Pick up some TF dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        curl \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        libssl-dev \
        pkg-config \
        python3 \
        python3-dev \
        rsync \
        software-properties-common \
        unzip \
        && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Setup python3 as default
RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 10

# Install pip for python3
RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py

RUN pip --no-cache-dir install \
        numpy \
        scipy \
        scikit-learn \
        scikit-image \
        matplotlib \
        jupyter \
        bokeh \
        tqdm

# Updated for pytorch 0.2.0
RUN pip install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl
RUN pip install torchvision
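
To confirm that the pip install actually produced a working setup, a small check like the following can be run inside the container. It is a minimal sketch: torch_report is a hypothetical helper name, and it only uses torch.__version__ and torch.cuda.is_available().

```python
import importlib.util


def torch_report():
    """Summarise whether torch is importable and whether it can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_report())
```

Note that cuda_available will most likely be False during docker build, because the GPU is normally only exposed when the container is run through the NVIDIA runtime, so this check is more meaningful in a running container.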

Interesting. Can you post the output of the Docker build that uses conda install? Also, --no-install-recommends can be risky, since apt then skips recommended packages that some software expects, so you may want to try without that switch.

Glad pip is working, though.