I used Docker to build an environment to reproduce the experiment. The contents of the Dockerfile are as follows.
FROM nvidia/cuda:9.2-cudnn7-devel-ubuntu18.04

RUN apt-get update
RUN apt-get install -y python3 python3-pip

# install PyTorch == 1.2.0
RUN pip3 install torch==1.2.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html

# install Pillow to install torchvision
RUN python3 -m pip install --upgrade pip
RUN python3 -m pip install --upgrade Pillow

# install torchvision == 0.4.0
RUN pip3 install torchvision==0.4.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html

RUN apt-get install -y vim

WORKDIR /workspace
ENV LIBRARY_PATH /usr/local/cuda/lib64/stubs
I was able to build the environment, but the code froze when I ran it. While trying to narrow down the cause, I noticed that the .cuda() call seemed to be the problem. I checked torch.cuda.is_available() and it returns True, yet .cuda() still does not work. I am a beginner with PyTorch, so I don't know what is causing this.
Here is the actual debugging session, run inside the Docker container.
$ cat /usr/local/cuda/version.txt
CUDA Version 9.2.148
$ python3
Python 3.6.9 (default, Dec  8 2021, 21:08:43)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.__version__
'1.2.0+cu92'
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name()
'A100-PCIE-40GB'
>>> torch.cuda.current_device()
0
>>> torch.version.cuda
'9.2.148'
>>> import torchvision
>>> torchvision.__version__
'0.4.0+cu92'
>>> T = torch.tensor([[1,2],[3,4]])
>>> T = T.cuda()
It freezes when T = T.cuda() is executed.
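One additional check that might be worth running (a sketch on my part, not part of the session above): print the GPU's compute capability with torch.cuda.get_device_capability(). A given PyTorch wheel only bundles kernels for certain GPU architectures, so the reported capability can be compared against what the torch==1.2.0+cu92 build supports. The import guard is only there so the snippet degrades gracefully outside the container.

```python
import importlib.util

# Guarded so the snippet also runs where torch or a GPU is absent.
if importlib.util.find_spec("torch") is None:
    result = "torch not installed"
else:
    import torch
    if torch.cuda.is_available():
        # (major, minor) architecture tuple of device 0, e.g. (7, 0) for V100
        major, minor = torch.cuda.get_device_capability(0)
        result = f"compute capability: {major}.{minor}"
    else:
        result = "no CUDA device visible"

print(result)
```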