A weird PyTorch Docker problem: /opt/conda/bin disappears after pip install

A very simple Dockerfile:

FROM pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime
RUN pip install request
ENTRYPOINT bash

Build: docker build . -t test
Run: docker run -it --rm test bash

When I'm inside the container, I can't use Python, and the /opt/conda/bin directory has disappeared.
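
For reference, this is the kind of quick check that shows the problem from inside the container (illustrative commands; exact output omitted):

echo $PATH
which python
ls /opt/conda/bin

In the broken image the last command fails, because the directory is simply gone.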

I just tested the container and it seems to work for me:

ptrblck@...:~$ nvidia-docker run -it --ipc=host pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime
Unable to find image 'pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime' locally
1.5-cuda10.1-cudnn7-runtime: Pulling from pytorch/pytorch
...
root@c083513e9eba:/workspace# python
Python 3.7.7 (default, Mar 23 2020, 22:36:06)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> x = torch.randn(1, device='cuda')
>>> x
tensor([-0.5422], device='cuda:0')
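
If you just want a quick non-interactive smoke test, something like this should also work (assuming the base image's default entrypoint, which lets you pass a command directly):

nvidia-docker run --rm pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime python -c "import torch; print(torch.__version__)"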

Thanks. Yes, the original Docker image works fine.

But when I build a new image:

FROM pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime
RUN pip install request
ENTRYPOINT bash

and run the new image,

Python can no longer be used in it.

I cannot reproduce the issue.

My Dockerfile:

FROM pytorch/pytorch:1.5-cuda10.1-cudnn7-runtime
RUN pip install request
ENTRYPOINT bash

Building the container via:

docker build -t tmp_build .

Running via:

nvidia-docker run -it --ipc=host tmp_build

Inside the container:

root@...:/workspace# python -c "import torch; print(torch.randn(1, device='cuda'))"
tensor([0.4954], device='cuda:0')

Could you check what the difference in your workflow might be?
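
For example, comparing the Docker setup on both machines might narrow it down; something along these lines (illustrative; the Docker version and the storage driver are the first fields I would compare, since storage-driver issues are a plausible cause when files vanish from built images):

docker version
docker info | grep -i "storage driver"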

The Docker version was 18.09.8. I switched to a server running Docker version 19, and everything works fine now :joy:.
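
For anyone verifying the same fix, a minimal sketch after rebuilding on the newer Docker host (the --entrypoint override is needed here because the image sets ENTRYPOINT bash):

docker build . -t test
docker run --rm --entrypoint ls test /opt/conda/bin
docker run --rm --entrypoint python test --version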