Unable to find a valid cuDNN algorithm to run convolution

jscriptcoder · April 27, 2020, 8:43pm

I just got this error when running a forward pass through torch.nn.Conv2d, with the following stack trace:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-26-04bd4a00565d> in <module>
      3 
      4 # call training function
----> 5 losses = train(D, G, n_epochs=n_epochs)

<ipython-input-24-b539315e0aa0> in train(D, G, n_epochs, print_every)
     46                 real_images = real_images.cuda()
     47 
---> 48             D_real = D(real_images)
     49             d_real_loss = real_loss(D_real, True) # smoothing label 1 => 0.9
     50 

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

<ipython-input-14-bf68e57c25ff> in forward(self, x)
     48         """
     49 
---> 50         x = self.leaky_relu(self.conv1(x))
     51         x = self.leaky_relu(self.conv2(x))
     52         x = self.leaky_relu(self.conv3(x))

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    347 
    348     def forward(self, input):
--> 349         return self._conv_forward(input, self.weight)
    350 
    351 class Conv3d(_ConvNd):

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    344                             _pair(0), self.dilation, self.groups)
    345         return F.conv2d(input, weight, self.bias, self.stride,
--> 346                         self.padding, self.dilation, self.groups)
    347 
    348     def forward(self, input):

RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

I’m using Python 3.7 and PyTorch 1.5 on Ubuntu 18.04.2, and the GPU is an Nvidia GeForce GTX 770. Does it ring any bells?

Running nvidia-smi shows:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 770     On   | 00000000:01:00.0 N/A |                  N/A |
| 38%   50C    P8    N/A /  N/A |    624MiB /  4034MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

Thanks a lot in advance.

ptrblck · April 28, 2020, 6:45am

Depending on the device, the cudnn version, and the parameters of the convolution, you might now see this error if cudnn cannot find a valid algorithm (instead of a CUDNN_STATUS_NOT_SUPPORTED error or similar).

Are you using torch.backends.cudnn.benchmark = True?
If not, you could try to activate it to use the cudnn heuristics and potentially query more algorithms.

If that doesn’t help, you would have to fall back to the native implementations by disabling cudnn via: torch.backends.cudnn.enabled = False.

Also, could you post the creation of the convolution layer and the input shape, please?
I would like to verify that indeed no algo can be found.

jscriptcoder · April 28, 2020, 12:37pm

Thanks @ptrblck for the quick answer. So this is how I’m creating these layers:

# helper conv function
def conv(in_channels, out_channels, kernel_size, stride=2, padding=1, batch_norm=True):
    """
    Creates a convolutional layer, with optional batch normalization.
    
    Args:
        in_channels (int): 
            Number of channels in the input image
        out_channels (int): 
            Number of channels produced by the convolution
        kernel_size (int or tuple): 
            Size of the convolving kernel
        stride (int or tuple, optional): 
            Stride of the convolution. Default: 2
        padding (int or tuple, optional): 
            Zero-padding added to both sides of the input. Default: 1
        batch_norm (bool):
            Whether or not to add batch normalization. Default: True
    
    Return:
        Sequential list of layers
    """
    
    layers = []
    conv_layer = nn.Conv2d(in_channels, out_channels, 
                           kernel_size, stride, padding, 
                           bias=False)
    
    # append conv layer
    layers.append(conv_layer)

    if batch_norm:
        # append batchnorm layer
        layers.append(nn.BatchNorm2d(out_channels))
     
    # using Sequential container
    return nn.Sequential(*layers)



class Discriminator(nn.Module):
    """
    The inputs to the discriminator are 3 (channels) x image_size x image_size tensor images.
    The output should be a single value that will indicate whether a given image is real or fake.
    """

    def __init__(self, conv_dim):
        """
        Initialize the Discriminator Module
        Args:
            conv_dim (int): The depth of the first convolutional layer
        """
        super(Discriminator, self).__init__()

        self.conv_dim = conv_dim
        
        # 3x64x64
        self.conv1 = conv(in_channels=3, 
                          out_channels=conv_dim, 
                          kernel_size=4, 
                          batch_norm=False)
        # 32x32x32
        self.conv2 = conv(in_channels=conv_dim, 
                          out_channels=conv_dim*2, 
                          kernel_size=4)
        # (32x2)x16x16
        self.conv3 = conv(in_channels=conv_dim*2, 
                          out_channels=conv_dim*4, 
                          kernel_size=4)
        # (32x4)x8x8
        self.conv4 = conv(in_channels=conv_dim*4, 
                          out_channels=conv_dim*8, 
                          kernel_size=4)
        # (32x8)x4x4
        self.fc = nn.Linear(conv_dim*8*4*4, 1) # -> output: 1
        
        self.leaky_relu = nn.LeakyReLU(0.2)
        
    def forward(self, x):
        """
        Forward propagation of the neural network
        
        Args:
            x (Tensor): The input to the neural network     
        
        Return:
            Discriminator logits; the output of the neural network
        """
        
        x = self.leaky_relu(self.conv1(x))
        x = self.leaky_relu(self.conv2(x))
        x = self.leaky_relu(self.conv3(x))
        x = self.leaky_relu(self.conv4(x))
        
        # flatten
        x = x.view(-1, self.conv_dim*8*4*4)
        
        x = self.fc(x)
        
        return x
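
A quick sanity check of the shape comments in the class above, as a minimal sketch. It uses the standard Conv2d output-size formula with the helper's defaults (kernel_size=4, stride=2, padding=1). The 64x64 input and conv_dim=32 are assumptions read off the comments, and the function names are made up for illustration:

```python
def conv_out_size(size, kernel_size=4, stride=2, padding=1, dilation=1):
    """Spatial output size of a Conv2d layer along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def discriminator_shapes(image_size=64, conv_dim=32, n_layers=4):
    """Return (channels, height, width) after each conv layer.

    Assumes the channel count doubles per layer, as in the Discriminator above.
    """
    shapes = []
    size = image_size
    for i in range(n_layers):
        size = conv_out_size(size)
        shapes.append((conv_dim * 2 ** i, size, size))
    return shapes


shapes = discriminator_shapes()
for i, shape in enumerate(shapes, start=1):
    print(f"after conv{i}: {shape}")

# The flattened size must match the in_features of self.fc (conv_dim*8*4*4).
channels, height, width = shapes[-1]
fc_in_features = channels * height * width
print("fc input features:", fc_in_features)
```

Under those assumptions the layer configuration is consistent, so the failure is unlikely to come from the layer shapes.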

When I try torch.backends.cudnn.benchmark = True I get the error: RuntimeError: no valid convolution algorithms available in CuDNN. I also forgot to mention that I also get a warning:

/home/francisco/anaconda3/lib/python3.7/site-packages/torch/cuda/__init__.py:87: UserWarning: 
    Found GPU0 GeForce GTX 770 which is of cuda capability 3.0.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability that we support is 3.5.
    
  warnings.warn(old_gpu_warn % (d, name, major, capability[1]))

I also tried torch.backends.cudnn.benchmark = False and got: RuntimeError: Unable to find a valid cuDNN algorithm to run convolution… the same as initially.

So, as the warning says, there is no support for this old GPU, huh? Is there anything I can do without having to buy a new GPU?

Thanks a lot.

ptrblck · April 28, 2020, 7:30pm

For now you could disable cudnn for this setup via torch.backends.cudnn.enabled = False while I’m trying to reproduce this issue.
Since your GPU is quite old, I would have to look for a system with a similar device.

jscriptcoder · April 30, 2020, 7:38am

Hi @ptrblck…

Thanks a lot for your efforts. Even torch.backends.cudnn.enabled = False did not work. I got the following error:

RuntimeError: cuda runtime error (209) : no kernel image is available for execution on the device at /opt/conda/conda-bld/pytorch_1587428266983/work/aten/src/THC/generic/THCTensorMath.cu:19

I read somewhere that I could try compiling PyTorch from source on my machine. Would that work?

Thanks again.

ptrblck · April 30, 2020, 7:44am

Oh, you are right. I hadn’t checked the compute capability of your device.
The PyTorch binaries ship for GPUs with compute capability >= 3.7, while your GTX 770 has 3.0, so you would need to build PyTorch from source.
You can find the instructions here.
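
The comparison behind that statement can be sketched in a few lines. This is only an illustration with hypothetical helper names; the numbers are the ones quoted in this thread (3.0 for the GTX 770, 3.5 from the warning, 3.7 for the binaries). On a machine with a working install, torch.cuda.get_device_capability() returns the (major, minor) tuple for the device instead of a hard-coded string:

```python
def parse_capability(text):
    """Turn a capability string such as '3.0' into a (major, minor) tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def is_supported(capability, minimum):
    """True if the device capability is at least the minimum a build targets."""
    return parse_capability(capability) >= parse_capability(minimum)


# Values quoted in this thread, not queried from a real device.
device = "3.0"
for label, minimum in [("source minimum", "3.5"), ("binary minimum", "3.7")]:
    print(f"{label} {minimum}: supported = {is_supported(device, minimum)}")
```

Comparing tuples rather than floats or strings keeps the check correct for two-digit capabilities such as 10.0.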

jscriptcoder · April 30, 2020, 7:53am

Great. I’ll give it a try. Thanks sooo much again @ptrblck. Super helpful

jscriptcoder · May 1, 2020, 7:22am

So I followed the steps to build from source, @ptrblck. All fine. Something I noticed after the build finished is that I now have PyTorch 1.2 installed in my anaconda env… Anyway, I’m now getting this error:

RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR.

Could the CUDA version I have installed have something to do with this? I have CUDA 10.2. I’m wondering because the steps don’t mention this version, only CUDA 10.0 or 10.1. When I installed the dependencies for the build, I ran conda install -c pytorch magma-cuda102 since my CUDA version is 10.2, but the steps only mention magma-cuda90 | magma-cuda92 | magma-cuda100 | magma-cuda101.

Also while googling around I found some extra env variables I did not set, such as:

export CUDA_NVCC_EXECUTABLE="/usr/local/cuda-x.y/bin/nvcc"
export CUDA_HOME="/usr/local/cuda-x.y"
export CUDNN_INCLUDE_PATH="/usr/local/cuda-x.y/include/"
export CUDNN_LIBRARY_PATH="/usr/local/cuda-x.y/lib64/"
export LIBRARY_PATH="/usr/local/cuda-x.y/lib64"

# and also
export USE_CUDA=1 USE_CUDNN=1 USE_MKLDNN=1

where x.y is CUDA version. They aren’t mentioned in the steps provided by the repo. Are these important? I didn’t get any error and all compiled just fine.

Thanks a lot.

ptrblck · May 2, 2020, 3:32am

The log should have given you the found CUDA and cudnn version.
You could also check it via print(torch.version.cuda) and print(torch.backends.cudnn.version()).

Could you post a (minimal) code snippet to reproduce the CUDNN_STATUS_MAPPING_ERROR, please?

jscriptcoder · May 2, 2020, 2:56pm

Running those prints I get:

print(torch.__version__) # => 1.2.0
print(torch.version.cuda) # => 10.0.130
print(torch.backends.cudnn.version()) # => 7602

which is strange. CUDA version 10.0.130? I’m pretty sure I have installed 10.2.

So the error happens again when trying to run torch.nn.Conv2d:
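
For reading the cudnn number printed above, here is a small sketch. It assumes the cuDNN 7.x encoding (major*1000 + minor*100 + patch); other major releases may encode the version differently, and the helper name is made up:

```python
def decode_cudnn_version(version):
    """Split a cuDNN 7.x-style integer into (major, minor, patch).

    Assumes version == major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


print(decode_cudnn_version(7602))  # (7, 6, 2)
```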

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-22-04bd4a00565d> in <module>
      3 
      4 # call training function
----> 5 losses = train(D, G, n_epochs=n_epochs)

<ipython-input-21-b539315e0aa0> in train(D, G, n_epochs, print_every)
     46                 real_images = real_images.cuda()
     47 
---> 48             D_real = D(real_images)
     49             d_real_loss = real_loss(D_real, True) # smoothing label 1 => 0.9
     50 

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

<ipython-input-11-bf68e57c25ff> in forward(self, x)
     48         """
     49 
---> 50         x = self.leaky_relu(self.conv1(x))
     51         x = self.leaky_relu(self.conv2(x))
     52         x = self.leaky_relu(self.conv3(x))

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

~/.local/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
     90     def forward(self, input):
     91         for module in self._modules.values():
---> 92             input = module(input)
     93         return input
     94 

~/.local/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    545             result = self._slow_forward(*input, **kwargs)
    546         else:
--> 547             result = self.forward(*input, **kwargs)
    548         for hook in self._forward_hooks.values():
    549             hook_result = hook(self, input, result)

~/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
    341 
    342     def forward(self, input):
--> 343         return self.conv2d_forward(input, self.weight)
    344 
    345 class Conv3d(_ConvNd):

~/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py in conv2d_forward(self, input, weight)
    338                             _pair(0), self.dilation, self.groups)
    339         return F.conv2d(input, weight, self.bias, self.stride,
--> 340                         self.padding, self.dilation, self.groups)
    341 
    342     def forward(self, input):

RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR

Line 50 is calling x = self.leaky_relu(self.conv1(x))… self.conv1(x) is calling the following function:

# helper conv function
def conv(in_channels, out_channels, kernel_size, stride=2, padding=1, batch_norm=True):
    """
    Creates a convolutional layer, with optional batch normalization.
    
    Args:
        in_channels (int): 
            Number of channels in the input image
        out_channels (int): 
            Number of channels produced by the convolution
        kernel_size (int or tuple): 
            Size of the convolving kernel
        stride (int or tuple, optional): 
            Stride of the convolution. Default: 2
        padding (int or tuple, optional): 
            Zero-padding added to both sides of the input. Default: 1
        batch_norm (bool):
            Whether or not to add batch normalization. Default: True
    
    Return:
        Sequential list of layers
    """
    
    layers = []
    conv_layer = nn.Conv2d(in_channels, out_channels, 
                           kernel_size, stride, padding, 
                           bias=False)
    
    # append conv layer
    layers.append(conv_layer)

    if batch_norm:
        # append batchnorm layer
        layers.append(nn.BatchNorm2d(out_channels))
     
    # using Sequential container
    return nn.Sequential(*layers)

It blows up in nn.Conv2d.

I’m still wondering why torch.version.cuda says 10.0.130. Could I try installing that version of CUDA instead?

Thanks a lot for the help.

ptrblck · May 3, 2020, 12:48am

If you’ve installed a PyTorch binary, the local CUDA version will not be used.
Uninstall all binary installations and try to rebuild PyTorch with your local libs.

Let me know if you still see the cudnn error and we’ll try to reproduce it.

jscriptcoder · May 7, 2020, 10:34am

Hi @ptrblck.

I finally got it working! I did some extra steps. I’m not sure whether they were needed, but I set a few env variables pointing to where I had the cuda and cudnn libraries:

export CUDA_HOME="/usr/local/cuda-10.2"
export CUDA_NVCC_EXECUTABLE="/usr/local/cuda-10.2/bin/nvcc"
export CUDNN_INCLUDE_PATH="/usr/local/cuda-10.2/include/"
export CUDNN_LIBRARY_PATH="/usr/local/cuda-10.2/lib64/"
export LIBRARY_PATH="/usr/local/cuda-10.2/lib64"
export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
export USE_CUDA=1 USE_CUDNN=1 USE_MKLDNN=1
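
The same variables can also be assembled from Python, which makes it easy to check that the paths exist before starting a long build. This is a minimal sketch, not part of the official build steps; the helper names are hypothetical and the CUDA root is simply the one used in the exports above:

```python
import os

# Variables from the exports above that must point at real files or directories.
PATH_KEYS = ("CUDA_HOME", "CUDA_NVCC_EXECUTABLE", "CUDNN_INCLUDE_PATH",
             "CUDNN_LIBRARY_PATH", "LIBRARY_PATH")


def cuda_build_env(cuda_root, base_env=None):
    """Return a copy of the environment with the CUDA/cuDNN build variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "CUDA_HOME": cuda_root,
        "CUDA_NVCC_EXECUTABLE": f"{cuda_root}/bin/nvcc",
        "CUDNN_INCLUDE_PATH": f"{cuda_root}/include",
        "CUDNN_LIBRARY_PATH": f"{cuda_root}/lib64",
        "LIBRARY_PATH": f"{cuda_root}/lib64",
        "USE_CUDA": "1",
        "USE_CUDNN": "1",
        "USE_MKLDNN": "1",
    })
    return env


def missing_paths(env):
    """List the path variables that point at something that does not exist."""
    return [key for key in PATH_KEYS if not os.path.exists(env[key])]


env = cuda_build_env("/usr/local/cuda-10.2")
print("missing:", missing_paths(env))
```

If nothing is reported missing, the dictionary could be passed as env=env to subprocess.run when launching the build from the PyTorch checkout.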

After building I got these prints:

print(torch.__version__) # => 1.6.0a0+76c964d
print(torch.version.cuda) # => 10.2
print(torch.backends.cudnn.version()) # => 7605

Then I ran into an issue, since I’m also using the torchvision package. Installing the prebuilt package wasn’t an option, since it was destroying my freshly compiled PyTorch. Searching through the forum I ended up in this discussion, Error when building pytorch from source, where someone had a similar issue. You actually suggested building torchvision from source as well.

After all this, I no longer have this problem and I can finally use my old GPU.

Thanks a lot for your help.

ptrblck · May 7, 2020, 7:27pm

That’s great news!

jscriptcoder · May 8, 2020, 8:19am

There was a detail I think is important to mention and leave here, in this thread, in case someone else runs into similar issues.

When I tried to build the torchvision package from source, I got the following error: ValueError: Unknown CUDA arch (3.0) or GPU not supported. The fix, which I also found by searching this forum (I can’t remember the thread), is to edit the file /site-packages/torch/utils/cpp_extension.py in the anaconda environment where the PyTorch build was installed. There you just need to find the line that contains the list:

supported_arches = ['3.5', '3.7', '5.0', '5.2', '5.3', '6.0', '6.1', '6.2',
                        '7.0', '7.2', '7.5']

… and add '3.0'. After that you can build torchvision.
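
To illustrate why that one-line edit is enough, here is a simplified model of the kind of check that produces the error. It is not PyTorch's actual code: the function name and structure are assumptions, and only the list contents and the error message are taken from this post:

```python
# The list as quoted above, before '3.0' is added.
SUPPORTED_ARCHES = ['3.5', '3.7', '5.0', '5.2', '5.3', '6.0', '6.1', '6.2',
                    '7.0', '7.2', '7.5']


def arch_flags(arch_list, supported=SUPPORTED_ARCHES):
    """Validate each requested arch, then build one -gencode flag per arch."""
    flags = []
    for arch in arch_list:
        if arch not in supported:
            raise ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
        num = arch.replace(".", "")
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
    return flags


try:
    arch_flags(['3.0'])
except ValueError as err:
    print("before the edit:", err)

# Adding '3.0' to the list lets the same request through.
print("after the edit:", arch_flags(['3.0'], SUPPORTED_ARCHES + ['3.0']))
```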

ptrblck · May 8, 2020, 5:27pm

Thanks for the follow up!
Not sure, but maybe using TORCH_CUDA_ARCH_LIST="3.0" python setup.py install could also work.
Anyway, thanks for your solution which certainly works.

russellyq · September 29, 2020, 2:36am

Hi! I have run into the same problem.
May I ask how to rebuild PyTorch with local libs?

iremonur · October 22, 2020, 9:10pm

Hi, I also have an old GPU with CUDA capability 3.0. I installed PyTorch and torchvision from source successfully. I also need OpenCV in my conda environment for my projects, so after installing torch and torchvision from source I ran “conda install -c menpo OpenCV”. OpenCV was installed, but when I then tried to import torch I got this error: “No module named torch”. I went back to the pytorch directory and compiled again, and this time importing torch gave: “Segmentation fault (core dumped)”. I want to use torch, torchvision, and OpenCV with my old GPU. Please help me, what should I do? Should I install OpenCV from source, too?

ptrblck · October 22, 2020, 11:45pm

I don’t know if OpenCV might have uninstalled PyTorch for some reason, but that would be shown during the installation.
I would try to create a new environment, rebuild PyTorch, and make sure it’s working.
Afterwards, you could check whether the conda OpenCV installation removes PyTorch again and, if so, build OpenCV from source.

I guess you are also using an older PyTorch branch, since the current one supports a minimum compute capability of 3.5.

iremonur · October 23, 2020, 6:33am

Thank you very much. As far as I understand, the OpenCV installation does not remove PyTorch, but it downgrades the Python version. I created a conda environment with Python 3.8.5 and installed torch and torchvision from source successfully. When I then installed OpenCV from source, the Python version of the environment was downgraded to 3.7.6, and when I installed OpenCV via “conda install” it was downgraded to 3.6. In both cases torch cannot be imported. I think that may be the problem, because when I created a conda environment with Python 3.7 from the start, I could not install PyTorch from source. Does that make sense? Does the Anaconda environment have to be created with Python 3.8.5 to install torch and torchvision from source?

Note:
I cloned the PyTorch master branch, checked out pull request #46535, added ‘3.0’ to supported_arches in /home/ionur1/pytorch/torch/utils/cpp_extension.py, and compiled. After installing PyTorch from source, I installed torchvision from source.

ptrblck · October 23, 2020, 8:53am

I don’t think that’s a true limitation, as PyTorch should build with Python=3.7.
Which error were you getting?