CUDNN error using pretrained vgg16 model

mordith · September 24, 2017, 12:59pm

Hi all,

I am trying to extract the activations of the last layer in a VGG16 model.
To that end, I wrapped the model in a module, as shown below.

When I pass a CUDA tensor to the model, I get a CUDNN_STATUS_INTERNAL_ERROR with the following traceback.

Does anyone know where I went wrong?

Traceback:

  File "/media/data1/iftachg/frame_glimpses/parse_files_to_vgg.py", line 80, in get_activation
    return model(image)
  File "/media/data1/iftachg/miniconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/data1/iftachg/frame_glimpses/partial_vgg.py", line 24, in forward
    x = self.vgg16.features(x)
  File "/media/data1/iftachg/miniconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/data1/iftachg/miniconda2/lib/python2.7/site-packages/torch/nn/modules/container.py", line 64, in forward
    input = module(input)
  File "/media/data1/iftachg/miniconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/data1/iftachg/miniconda2/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 237, in forward
    self.padding, self.dilation, self.groups)
  File "/media/data1/iftachg/miniconda2/lib/python2.7/site-packages/torch/nn/functional.py", line 39, in conv2d
    return f(input, weight, bias)
RuntimeError: CUDNN_STATUS_INTERNAL_ERROR

Class:

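import torch.nn as nn
from torchvision import models


class partial_vgg(nn.Module):

    def __init__(self):
        super(partial_vgg, self).__init__()
        self.vgg16 = models.vgg16(pretrained=True).cuda()
        # Freeze the weights; the network is only used to extract activations.
        for param in self.vgg16.parameters():
            param.requires_grad = False

    def forward(self, x):
        x = self.vgg16.features(x)
        # Flatten the conv feature maps to (batch, features) for the classifier.
        x = x.view(x.size(0), -1)
        # Apply all classifier modules except the last three.
        for l in list(self.vgg16.classifier.children())[:-3]:
            x = l(x)
        return x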

chenyuntc · September 25, 2017, 10:27am

I guess the shape of x doesn’t match what VGG expects.
Try running the features on the CPU:

x = self.vgg16.features.cpu()(x.cpu())

You’ll get more information about the error.
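
The shape guess can also be checked by hand before touching the GPU. The sketch below is only an illustration, not code from this thread: it assumes the standard VGG16 layout (3x3 convolutions with padding 1, which keep the spatial size, and five 2x2 max-pools, each halving it) and a first classifier layer that expects 512 * 7 * 7 inputs. The function name vgg16_flat_features is hypothetical.

```python
def vgg16_flat_features(height, width, channels=512, num_pools=5):
    """Size of the flattened output of VGG16's `features` for one image.

    Assumes the padded 3x3 convolutions preserve height and width and
    each of the five 2x2 max-pools halves them (rounding down).
    """
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return channels * height * width


# Assumed input size of the first Linear layer in the VGG16 classifier.
EXPECTED = 512 * 7 * 7

if __name__ == "__main__":
    for size in (224, 256, 200):
        flat = vgg16_flat_features(size, size)
        status = "ok" if flat == EXPECTED else "mismatch"
        print(size, flat, status)
```

Because the forward in the question flattens the output of features directly, an input that is not roughly 224x224 would fail at the first classifier layer, and on the CPU that failure is reported as a size mismatch rather than a cuDNN status code.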

mordith · September 28, 2017, 10:19am

Thanks for the advice.

Apparently cuDNN errors are extremely unhelpful, and there was no problem with the code itself: the GPUs I was trying to access were simply already in use.
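
For anyone hitting the same thing, one way to spot a busy GPU before starting is to ask nvidia-smi for per-device memory use and pick a device that looks idle. This is a rough sketch under stated assumptions, not what was actually run in this thread: memory use is only a proxy for "in use", the 5% threshold is arbitrary, and the helper names (parse_gpu_memory, pick_idle_gpu, query_gpus) are made up for illustration.

```python
import os
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' CSV rows into (index, used_mib, total_mib)."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        rows.append((index, used, total))
    return rows


def pick_idle_gpu(rows, max_used_fraction=0.05):
    """Return the index of the least-used GPU under the threshold, or None."""
    idle = [(used / total, index) for index, used, total in rows
            if used / total <= max_used_fraction]
    return min(idle)[1] if idle else None


def query_gpus():
    """Ask nvidia-smi for memory use of every visible GPU."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_memory(out.decode())


if __name__ == "__main__":
    try:
        gpu = pick_idle_gpu(query_gpus())
    except (OSError, ValueError, subprocess.CalledProcessError):
        gpu = None  # nvidia-smi missing, failed, or printed something unexpected
    if gpu is None:
        print("No idle GPU found")
    else:
        # Has to be set before the process initialises CUDA.
        os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
        print("Using GPU", gpu)
```

Setting CUDA_VISIBLE_DEVICES this way only takes effect if it happens before the first CUDA call in the process, so it belongs at the very top of the script or in the shell that launches it.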