I first had trouble installing NeMo in my Dockerfile. After some experimentation I got it to install, but now a new problem has appeared: PyTorch does not see my GPUs (they definitely exist and work properly). I'm using the base image pytorch/pytorch:1.11.0-cuda11.3-cudnn8-runtime, and this is the only image I can use. My guess is that the problem is caused by a version incompatibility. Is there a solution to this?
Dockerfile example:
FROM pytorch/pytorch:1.11.0-cuda11.3-cudnn8-runtime
WORKDIR /app
ARG TIMEZONE=UTC
RUN apt update && apt install tzdata -y
ENV TZ="America/New_York"
RUN apt-get update && \
    apt-get install -y git && \
    apt-get install -y python3.10 && \
    apt-get -y install curl && \
    apt-get install -y wget && \
    apt-get install -y libsndfile1 ffmpeg && \
    apt-get install --no-install-recommends --yes build-essential && \
    apt-get install -y module-init-tools
# I have also tried other driver versions here
ARG nvidia_binary_version="470.223.02"
ARG nvidia_binary="NVIDIA-Linux-x86_64-${nvidia_binary_version}.run"
RUN wget -q https://us.download.nvidia.com/XFree86/Linux-x86_64/${nvidia_binary_version}/${nvidia_binary} && \
    chmod +x ${nvidia_binary} && \
    ./${nvidia_binary} --accept-license --ui=none --no-kernel-module --no-questions && \
    rm -rf ${nvidia_binary}
RUN pip install --upgrade pip
ADD requirements.txt requirements.txt
RUN pip install -r requirements.txt
RUN pip install Cython
RUN pip install "nemo_toolkit[all]"
# Install fasttext (assuming it requires building)
RUN git clone https://github.com/facebookresearch/fastText.git && \
    cd fastText && \
    pip install .
ADD . .
ADD download.py .
# This is the step where I get the error
RUN python3 download.py
EXPOSE 8000
CMD python3 -u app.py
download.py:
from service_utils import *
from panns_inference import AudioTagging

checkpoints_asr_model = 'checkpoints/stt_en_conformer_transducer_xxlarge.nemo'
checkpoints_text_PC_model = 'checkpoints/punctuation_en_bert.nemo'


def download_model():
    denoiser = get_model_master64('https://dl.fbaipublicfiles.com/adiyoss/denoiser/master64-8a5dfb4bb92753dd.th', pretrained=True)
    denoiser = denoiser.to('cuda')
    at = AudioTagging(checkpoint_path='checkpoints/ASC.pth', device='cuda')


if __name__ == "__main__":
    download_model()
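To show exactly what PyTorch sees at the point where download.py fails, a small check like the one below could be run in the same environment. This is only a sketch: the function name cuda_report is illustrative and not part of the project, and it only uses standard PyTorch attributes (torch.version.cuda, torch.cuda.is_available, torch.cuda.device_count).

```python
import importlib


def cuda_report():
    """Collect basic facts about what PyTorch can see; returns a plain dict."""
    report = {"torch_installed": False}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # Nothing more can be reported without PyTorch itself.
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Posting this output from both the failing build step and a running container would make it clear whether the GPUs are invisible everywhere or only in one of the two.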
nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07    Driver Version: 515.48.07    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A4000    On   | 00000000:01:00.0 Off |                  Off |
| 41%   37C    P8    13W / 140W |  12719MiB / 16376MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A5000    On   | 00000000:02:00.0 Off |                  Off |
| 30%   27C    P8    15W / 230W |   7017MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA RTX A4000    On   | 00000000:2D:00.0 Off |                  Off |
| 41%   36C    P8    14W / 140W |      2MiB / 16376MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA RTX A4000    On   | 00000000:41:00.0 Off |                  Off |
| 69%   87C    P2   138W / 140W |  15517MiB / 16376MiB |     97%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  NVIDIA RTX A4000    On   | 00000000:42:00.0 Off |                  Off |
| 41%   33C    P8    16W / 140W |   2688MiB / 16376MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  NVIDIA RTX A4000    On   | 00000000:61:00.0 Off |                  Off |
| 41%   35C    P8    16W / 140W |  13554MiB / 16376MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  NVIDIA RTX A4000    On   | 00000000:62:00.0 Off |                  Off |
| 41%   40C    P8    17W / 140W |  11906MiB / 16376MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
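Since my guess is a version incompatibility, here is a rough way to compare the two CUDA versions involved. The "CUDA Version" field in the nvidia-smi header is the newest CUDA runtime the installed driver supports, so it can be compared against the 11.3 runtime in the base image tag. The function names below are made up for illustration, and the check is a sketch that only covers the driver/runtime pairing.

```python
import re


def parse_version(text):
    """Turn '11.7' into (11, 7) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_cuda_version(smi_header):
    """Extract the 'CUDA Version' field from the nvidia-smi header line."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", smi_header)
    if match is None:
        raise ValueError("no 'CUDA Version' field found")
    return parse_version(match.group(1))


def driver_supports(smi_header, runtime_version):
    """True if the driver's maximum CUDA version covers the runtime's."""
    return driver_cuda_version(smi_header) >= parse_version(runtime_version)


# Header line taken from the nvidia-smi output above; 11.3 is from the image tag.
header = "| NVIDIA-SMI 515.48.07 Driver Version: 515.48.07 CUDA Version: 11.7 |"
print(driver_supports(header, "11.3"))  # True
```

By this comparison the 515.48.07 driver (CUDA 11.7) is new enough for the CUDA 11.3 runtime in the image. The check says nothing about whether the build step or the container has actually been given access to the GPUs.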
Thanks!