CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start

Could you please help me with this GitHub issue and suggest how I should fix the problem?

(torchenc) mona@goku:~$ python test_torch_encoding.py 
/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at  /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  return torch._C._cuda_getDeviceCount() > 0
Traceback (most recent call last):
  File "test_torch_encoding.py", line 5, in <module>
    model = encoding.models.get_model('DeepLab_ResNeSt269_PContext', pretrained=True).cuda()
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 463, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 359, in _apply
    module._apply(fn)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 381, in _apply
    param_applied = fn(param)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 463, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/cuda/__init__.py", line 172, in _lazy_init
    torch._C._cuda_init()
RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
(torchenc) mona@goku:~$ echo $CUDA_VISIBLE_DEVICES

(torchenc) mona@goku:~$ nvidia-smi
Sat Jan 30 00:44:47 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 165...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P8     3W /  N/A |   1695MiB /  3911MiB |      7%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1155      G   /usr/lib/xorg/Xorg                133MiB |
|    0   N/A  N/A      1773      G   /usr/lib/xorg/Xorg               1124MiB |
|    0   N/A  N/A      1952      G   /usr/bin/gnome-shell              130MiB |
|    0   N/A  N/A      2329      G   ...gAAAAAAAAA --shared-files      167MiB |
|    0   N/A  N/A      2735      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A      2923      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      3204      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      3324      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      3456      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      3501      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      3572      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      3623      G   /usr/lib/firefox/firefox            3MiB |
|    0   N/A  N/A     21024      G   ...f_3837.log --shared-files       99MiB |
+-----------------------------------------------------------------------------+
(torchenc) mona@goku:~$ 

The code, as given on your website:

import torch
import encoding

# Get the model
model = encoding.models.get_model('DeepLab_ResNeSt269_PContext', pretrained=True).cuda()
model.eval()

# Prepare the image
url = 'https://github.com/zhanghang1989/image-data/blob/master/' + \
      'encoding/segmentation/pcontext/2010_001829_org.jpg?raw=true'
filename = 'example.jpg'
img = encoding.utils.load_image(encoding.utils.download(url, filename)).cuda().unsqueeze(0)

# Make prediction
output = model.evaluate(img)
predict = torch.max(output, 1)[1].cpu().numpy() + 1

# Get color palette for visualization
mask = encoding.utils.get_mask_pallete(predict, 'pascal_voc')
mask.save('output.png')

Is your PyTorch version compatible with CUDA 11.2?


If you’ve recently updated some drivers (such as the NVIDIA driver or your local CUDA toolkit), make sure to restart the machine as otherwise the devices might not be detected properly.
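
Both points above (version compatibility and whether the device is visible again after a reboot) can be checked with a short script. This is a minimal sketch: `cuda_report` is a hypothetical helper name, not part of PyTorch, and it only uses `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()` and `torch.cuda.get_device_name()`. Note that the "CUDA Version" printed by nvidia-smi is the highest version the driver supports; the PyTorch binaries ship their own CUDA runtime, so a binary built with an older CUDA release generally works with a newer driver.

```python
def cuda_report():
    """Collect the facts that matter when CUDA fails to initialize."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None}

    available = torch.cuda.is_available()
    report = {
        "torch": torch.__version__,
        # CUDA runtime the PyTorch binary was built with (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": available,
        "device_count": torch.cuda.device_count(),
    }
    if available:
        report["device_0"] = torch.cuda.get_device_name(0)
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If `cuda_available` is still False after a reboot while nvidia-smi lists the GPU, the driver installation itself is the next thing to look at.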


Thanks a lot @ptrblck, you are right. Due to a specific package installation for OpenGL, I had to update Ubuntu.

So, given that PyTorch Encoding is released by the same team as PyTorch, I was wondering if you know how I could change the batch_size?

(torchenc) mona@goku:~$ python test_torch_encoding.py 
Traceback (most recent call last):
  File "test_torch_encoding.py", line 15, in <module>
    output = model.evaluate(img)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/sseg/base.py", line 101, in evaluate
    pred = self.forward(x)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/sseg/deeplab.py", line 47, in forward
    c1, c2, c3, c4 = self.base_forward(x)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/sseg/base.py", line 96, in base_forward
    c3 = self.pretrained.layer3(c2)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/backbone/resnet.py", line 106, in forward
    out = self.bn3(out)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 131, in forward
    return F.batch_norm(
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/functional.py", line 2056, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 12.00 MiB (GPU 0; 3.82 GiB total capacity; 1.91 GiB already allocated; 13.69 MiB free; 2.00 GiB reserved in total by PyTorch)

Even if I pass batch_size of 1, I still get an error:

(torchenc) mona@goku:~$ python test_torch_encoding.py  --batch_size 1
Traceback (most recent call last):
  File "test_torch_encoding.py", line 15, in <module>
    output = model.evaluate(img)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/sseg/base.py", line 101, in evaluate
    pred = self.forward(x)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/sseg/deeplab.py", line 47, in forward
    c1, c2, c3, c4 = self.base_forward(x)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/sseg/base.py", line 96, in base_forward
    c3 = self.pretrained.layer3(c2)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch_encoding-1.2.2b20210130-py3.8-linux-x86_64.egg/encoding/models/backbone/resnet.py", line 106, in forward
    out = self.bn3(out)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/modules/batchnorm.py", line 131, in forward
    return F.batch_norm(
  File "/home/mona/venv/torchenc/lib/python3.8/site-packages/torch/nn/functional.py", line 2056, in batch_norm
    return torch.batch_norm(
RuntimeError: CUDA out of memory. Tried to allocate 12.00 MiB (GPU 0; 3.82 GiB total capacity; 1.95 GiB already allocated; 17.31 MiB free; 2.05 GiB reserved in total by PyTorch)

Here is a bit of information about my system. Thanks a lot for having a look:

$ nvidia-smi
Thu Feb  4 15:27:57 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 165...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P5     7W /  N/A |    936MiB /  3911MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1182      G   /usr/lib/xorg/Xorg                133MiB |
|    0   N/A  N/A      1786      G   /usr/lib/xorg/Xorg                520MiB |
|    0   N/A  N/A      1962      G   /usr/bin/gnome-shell               88MiB |
|    0   N/A  N/A      2330      G   ...gAAAAAAAAA --shared-files      173MiB |
|    0   N/A  N/A      2593      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      5682      G   /usr/lib/firefox/firefox            1MiB |
|    0   N/A  N/A      5919      G   /usr/lib/firefox/firefox            1MiB |
+-----------------------------------------------------------------------------+

and

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0

and

(torchenc) mona@goku:~$ python
Python 3.8.5 (default, Jul 28 2020, 12:59:40) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.7.1'

and

$ lsb_release -a
LSB Version:	core-11.1.0ubuntu2-noarch:security-11.1.0ubuntu2-noarch
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.2 LTS
Release:	20.04
Codename:	focal

Please let me know if further information might be necessary.

If a single sample is raising the out of memory issue, your GPU memory capacity might just be too small for this workload.
If you have another workstation with a bigger GPU, you could try to run it there and check the memory usage for a single sample.
Alternatively, you could also try to run it on Colab, as you might also get a GPU with more than 4GB of memory.
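
To see how little headroom there is, the numbers in the second error message above can be worked through. The sketch below parses that message (the format is the one PyTorch 1.7.1 printed in this thread; `parse_oom` is a hypothetical helper written for illustration) and computes how much of the card is not accounted for by PyTorch's caching allocator, which covers other processes such as Xorg and the browser plus the CUDA context itself.

```python
import re

# Error text copied from the second traceback in this thread.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 12.00 MiB (GPU 0; 3.82 GiB total "
    "capacity; 1.95 GiB already allocated; 17.31 MiB free; 2.05 GiB reserved "
    "in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
    "total": r"([\d.]+) (MiB|GiB) total capacity",
    "allocated": r"([\d.]+) (MiB|GiB) already allocated",
    "free": r"([\d.]+) (MiB|GiB) free",
    "reserved": r"([\d.]+) (MiB|GiB) reserved",
}


def parse_oom(message):
    """Return the sizes named in the OOM message, converted to MiB."""
    sizes = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        sizes[name] = float(match.group(1)) * UNIT_TO_MIB[match.group(2)]
    return sizes


sizes = parse_oom(OOM_MESSAGE)

# Memory that is neither reserved by PyTorch's allocator nor free:
# other processes on the GPU plus CUDA context overhead.
outside_pytorch = sizes["total"] - sizes["reserved"] - sizes["free"]

for name, mib in sizes.items():
    print(f"{name:>10}: {mib:8.2f} MiB")
print(f"{'outside':>10}: {outside_pytorch:8.2f} MiB")
```

Since nvidia-smi already shows 936MiB in use by the desktop session before the script starts, only part of the 3911MiB card is ever available to the model, regardless of the batch size.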
