RuntimeError: CUDNN_STATUS_MAPPING_ERROR

When I use code like this:

class Classify(nn.Module):
    def __init__(self, opt):
        self.convs = [nn.Conv2d(1, opt.kernel_num, (k, opt.word_vec_size), padding=0) for k in opt.kernel_sizes]

    def forward(self, x):
        x = [F.relu(conv(x)).squeeze(3) for conv in self.convs]

I get “RuntimeError: CUDNN_STATUS_MAPPING_ERROR”, whereas using a single Conv2d works fine.

Does keeping the layers in a plain list ([conv2d, conv2d, …]) cause a problem with the cuDNN mapping?

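Try registering the layers with nn.ModuleList. A plain Python list does not register them as submodules, so model.cuda() leaves their weights on the CPU:

self.convs = nn.ModuleList(
    [nn.Conv2d(1, opt.kernel_num, (k, opt.word_vec_size), padding=0) for k in opt.kernel_sizes]
)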
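
To see the difference between a plain list and nn.ModuleList, here is a minimal sketch. The class names ListConvs and ModuleListConvs and the layer sizes are made up for illustration and are not from the original post. It only counts the parameters that each module reports, which are the ones .cuda() and the optimizer will see:

```python
import torch.nn as nn

class ListConvs(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: the convolutions are NOT registered as submodules.
        self.convs = [nn.Conv2d(1, 4, (k, 8)) for k in (2, 3)]

class ModuleListConvs(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every layer it holds.
        self.convs = nn.ModuleList([nn.Conv2d(1, 4, (k, 8)) for k in (2, 3)])

n_list = len(list(ListConvs().parameters()))
n_modlist = len(list(ModuleListConvs().parameters()))
print(n_list, n_modlist)  # 0 4 (weight and bias for each of the two convs)
```

With zero registered parameters, calling .cuda() on the first model moves nothing, so the convolution weights stay on the CPU while the input is on the GPU.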

I get the same error message. My network has the following architecture:

class CNN(nn.Module):
    def __init__(self, num_classes=10):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.classifier = nn.Sequential(
            nn.Linear(128*8*8 , 1024),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(1024, 1024),
            nn.ReLU(True),
            nn.Dropout(),
            nn.Linear(1024, num_classes)
        )

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

The model works fine on CPU. I am using pytorch 0.1.12_2

I found my mistake: I did not know that you have to call net.cuda() on the model as well, not just on the input tensors.
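
For future readers, a minimal sketch of keeping the model and its input on the same device. The single Conv2d layer and the input shape are stand-ins chosen for illustration, and the sketch falls back to the CPU when no GPU is available:

```python
import torch
import torch.nn as nn

# Pick the GPU when there is one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

net = nn.Conv2d(3, 32, 3, padding=1).to(device)  # move the model's weights
x = torch.randn(4, 3, 32, 32).to(device)         # move the input as well

out = net(x)
print(out.shape, next(net.parameters()).device, x.device)
```

Using one device object for both calls makes it harder to forget one of them than calling .cuda() separately in different places.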

I think the problem is that your GPU is too old to be supported by PyTorch. I got the same error before and noticed a UserWarning in the middle of my log. It was solved after I changed my GPU from a K420 to a GTX1050Ti.
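
One way to check what hardware PyTorch sees is to print each device's name and compute capability, which can then be compared against what the installed PyTorch build supports. This is only a sketch; describe_gpus is a hypothetical helper name, and it returns an empty list on machines without a usable GPU:

```python
import torch

def describe_gpus():
    """Return (name, compute capability) for each visible CUDA device."""
    if not torch.cuda.is_available():
        return []
    return [
        (torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
        for i in range(torch.cuda.device_count())
    ]

for name, (major, minor) in describe_gpus():
    print(f"{name}: compute capability {major}.{minor}")
```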

I faced the same problem and later figured out that the PyTorch version I had installed did not match the CUDA version on my machine. I installed a newer version of PyTorch and it worked.
CUDA version 9.2.88
pytorch 0.4.1
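
To compare versions quickly, something like the following sketch prints what the installed PyTorch build was compiled against. The dictionary name versions is just for illustration; the CUDA and cuDNN entries are None when the build or machine lacks them:

```python
import torch

versions = {
    "torch": torch.__version__,
    "cuda_build": torch.version.cuda,         # CUDA version PyTorch was built with
    "cudnn": torch.backends.cudnn.version(),  # cuDNN version PyTorch can load
}

for key, value in versions.items():
    print(f"{key}: {value}")
```

The CUDA version reported here is the one PyTorch was built with, which can differ from the toolkit installed system-wide.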

Just leaving this for the sake of future readers:
I got this error when I overclocked my GPU so much that the driver broke, which also fits the previous answers. So if you overclock and get this error, consider lowering the clock.

I have the same error, but I have a different issue than everyone else. Here’s a simple script that reproduces my issue:

import torch
import torch.nn as nn
import numpy as np

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        self.base_n_filter = 8

        self.conv1 = nn.Conv3d(1, self.base_n_filter, 3, stride=1, padding=1)
        self.conv2 = nn.Conv3d(self.base_n_filter, self.base_n_filter*2, 3, stride=1, padding=1)
        self.conv3 = nn.Conv3d(self.base_n_filter*2, 1, 3, stride=1, padding=1)

    def forward(self, img):
        output1 = self.conv1(img)
        output2 = self.conv2(output1)
        output3 = self.conv3(output2)

        return output3


if __name__ == '__main__':
    img = torch.zeros([1,1,248,248,140]).cuda()
    label = torch.rand_like(img, device=img.device)

    model = Net().cuda()
    output = model(img)

    loss_fn = nn.MSELoss()

    loss = loss_fn(output, label)
    loss.backward()

The error only occurs when I have an input image of a specific size. For example, if I change the input to shape [1,1,250,250,150], the error no longer occurs. So it doesn’t seem like an overclocking issue. Additionally, this error also only occurs when self.base_n_filter is 8 or greater. I’m not using any lists to store my convolutions, and have set my model to cuda.

I am also using an up-to-date GPU and CUDA. The error occurs on both a Tesla K80 and a GTX1080Ti, with PyTorch 1.2 cudatoolkit=10.0, CUDA/10.0.130, and cudnn/7.6.2.24-CUDA-10.0.130. Also, this error only occurs on Linux machines.

Any help would be truly appreciated!

I ran into the same thing, and it was strange because my code had run successfully once. In my case, reducing the batch size solved the error.

In the code I posted, the batch size would be 1 so I don’t think that’s the problem. But it’s probably related to how only certain input shapes cause the error.

I think you are right.
When I ran my code again, the error came back, and it disappeared when I disabled cuDNN. I suspect cuDNN does not handle every input shape well, which may be a bug in cuDNN. As I posted, a smaller batch_size sometimes avoids the error. However, in those cases GPU memory usage is very low.

My environment: cuda v10.0.130, cudnn v7.6.5, V100.
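
For anyone who wants to try the same workaround, a minimal sketch of turning cuDNN off around a forward pass with the torch.backends.cudnn.enabled flag. The layer and the small input shape are placeholders chosen for illustration, not the shapes that triggered the error, and the code also runs on the CPU, where the flag has no effect:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
conv = nn.Conv3d(1, 8, 3, stride=1, padding=1).to(device)
x = torch.zeros(1, 1, 16, 16, 8, device=device)

previous = torch.backends.cudnn.enabled
torch.backends.cudnn.enabled = False  # fall back to PyTorch's non-cuDNN kernels
try:
    out = conv(x)
finally:
    torch.backends.cudnn.enabled = previous  # restore the original setting

print(out.shape)
```

Expect this to be slower than the cuDNN path, so it is probably better as a diagnostic than as a permanent fix.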

I got this error while accumulating gradients (I want an effective batch size larger than what fits on my GPU at one time); the message appeared when backpropagating the loss. It seems to be just a memory issue in some cases.
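
For reference, a sketch of the usual gradient accumulation pattern, where backward() is called on every micro-batch so that each graph can be freed before the next one is built. This is an assumption about what a memory-friendly version looks like, not the poster's actual code; the tiny linear model, the random data, and accum_steps are all made up for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

accum_steps = 4  # micro-batches per optimizer step
batches = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(8)]

optimizer_steps = 0
optimizer.zero_grad()
for step, (x, y) in enumerate(batches, start=1):
    # Scale so the accumulated gradient matches one large averaged batch.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()  # gradients add up in .grad; this micro-batch's graph is freed
    if step % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
        optimizer_steps += 1

print(optimizer_steps)  # 2
```

Summing the loss tensors first and calling backward() once would keep every micro-batch's graph alive, which is one way accumulation can run out of memory.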

In my case, the CUDA version was too new for the old torch version I had installed, which led to this problem. Upgrading to a newer torch version should fix it.

I also noticed that some data shapes raise this cuDNN error. I tried another data shape and it worked. For me, the error actually arose in a batchnorm layer during the forward pass.

Could you post the input shapes and the batchnorm config as well as the cudnn version and used GPU, which raised this error?

I found out that the issue was something else. Without CUDA_LAUNCH_BLOCKING=1, the script reported a misleading CUDNN_STATUS_MAPPING_ERROR (in my case, it said batchnorm failed). With the flag set, I got “RuntimeError: CUDA error: device-side assert triggered” at line X. I checked manually and found an index-out-of-bounds problem at line X. Correcting it solved the problem.
The real issue is that the error never says the problem is “index out of bounds”; it says either CUDNN_STATUS_MAPPING_ERROR or “RuntimeError: CUDA error: device-side assert triggered”, depending on whether CUDA_LAUNCH_BLOCKING=1 is set.
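
To illustrate how much clearer the same kind of bug looks on the CPU, here is a small sketch using an embedding lookup with one out-of-range index. The embedding layer and the indices are my own example, not the code from line X, and the exact exception type and message may vary between PyTorch versions:

```python
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10, embedding_dim=4)  # valid indices: 0..9
bad_ids = torch.tensor([1, 5, 10])                      # 10 is out of range

try:
    emb(bad_ids)
    message = None
except (IndexError, RuntimeError) as err:
    # On the CPU the failure is raised right at the offending call.
    message = str(err)

print(message)
```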

Yes, that’s unfortunately caused by the asynchronous CUDA kernel execution.
Generally you should get a better error message by running the code on the CPU.
If that’s not feasible (e.g. since it takes too long) or doesn’t reproduce the error, you would have to rerun the code with CUDA_LAUNCH_BLOCKING=1 to find the line of code that actually raises the error.
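
The variable is usually set on the command line, e.g. CUDA_LAUNCH_BLOCKING=1 python script.py (script.py being a placeholder name). As a sketch, it can also be set from Python, assuming this happens before anything initializes CUDA:

```python
import os

# Must be set before the first CUDA call; setting it before importing torch is the safe order.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after the variable is set

print(os.environ["CUDA_LAUNCH_BLOCKING"], torch.cuda.is_available())
```

Synchronous launches slow training down, so this is best used only while debugging.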

In my case, I had a try/except block inside the forward function. The logic was: try to index the input; if the indexing is not possible, just use the array as it is. While this worked for my use case, whenever the code hit the exception, a few lines later it threw CUDNN_STATUS_MAPPING_ERROR. I removed the try/except block and the error went away. Just leaving it here in the hope that it helps somebody.
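
A likely explanation, which is my reading and not confirmed in this thread: on the GPU an out-of-range index triggers a device-side assert, and later CUDA calls keep failing even if the Python exception was caught. One alternative is to check the indices explicitly instead of relying on an exception. The helper name select_rows and the fallback behaviour are hypothetical, modelled on the description above:

```python
import torch

def select_rows(x, idx):
    """Index x along dim 0 only if every index is in range; otherwise return x unchanged."""
    # Conservative check: negative indices are treated as invalid here.
    in_range = (
        idx.numel() > 0
        and int(idx.min()) >= 0
        and int(idx.max()) < x.size(0)
    )
    return x[idx] if in_range else x

x = torch.arange(12).view(4, 3)
picked = select_rows(x, torch.tensor([0, 2]))    # valid indices: two rows selected
fallback = select_rows(x, torch.tensor([0, 7]))  # 7 is out of range: x is returned as is
print(picked.shape, fallback.shape)
```

The int(...) conversions force a synchronization on the GPU, which costs a little time but keeps the check on the Python side.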