MMDetection demo returning: RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

  1. I’ve attempted to install this repo: sandipan211/ZSD-SC-Resolver on GitHub, “Resolving semantic confusions for improved zero-shot detection” (BMVC 2022).
  2. This work used a Linux environment, which I made every effort to reproduce under Windows:

PyTorch version: 1.1.0
Is debug build: False
CUDA used to build PyTorch: 9.0
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Enterprise
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.7.3 (default, Apr 24 2019, 15:29:51) [MSC v.1915 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19041-SP0
Is CUDA available: True
CUDA runtime version: 10.2.89
GPU models and configuration: GPU 0: Quadro T2000
Nvidia driver version: 527.41
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: No

Versions of relevant libraries:
[pip3] numpy==1.21.5
[pip3] torch==1.1.0
[pip3] torchvision==0.3.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 9.0 1
[conda] mkl 2021.4.0 haa95532_640
[conda] mkl-service 2.4.0 py37h2bbff1b_0
[conda] mkl_fft 1.3.1 py37h277e83a_0
[conda] mkl_random 1.2.2 py37hf11a4ad_0
[conda] numpy 1.21.5 py37h7a0a035_3
[conda] numpy-base 1.21.5 py37hca35cd5_3
[conda] pytorch 1.1.0 py3.7_cuda90_cudnn7_1 pytorch
[conda] torchvision 0.3.0 pypi_0 pypi

  3. When I run an example Jupyter notebook from the mmdetection/demo folder, which does a basic image detection exercise, I keep hitting the dreaded “RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED” error:

~\Miniconda3\envs\zsd1\lib\site-packages\torch\nn\modules\conv.py in forward(self, input)
336 _pair(0), self.dilation, self.groups)
337 return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338 self.padding, self.dilation, self.groups)
339
340

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

  4. What might be some root causes of this issue? An obsolete cuDNN version? An unsupported GPU? Something else? I would really appreciate any ideas.

PyTorch 1.1.0 with CUDA 9.0 and cuDNN 7.x is quite old by now. Could you update to the latest release (1.13.1) and check whether you still see the error?
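
On the "unsupported GPU" question, one plausible factor is that the Quadro T2000 is a Turing GPU (compute capability 7.5), and native Turing support only arrived with CUDA 10.0, while the environment above reports "CUDA used to build PyTorch: 9.0". The sketch below illustrates that comparison. The table and the helper names (`MIN_CUDA_FOR_CAPABILITY`, `parse_version`, `build_supports_gpu`, `check_current_install`) are my own for illustration and are not part of PyTorch; only `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.get_device_capability()` are real PyTorch calls.

```python
# Minimum CUDA toolkit release with native support for each compute capability.
# Illustrative subset only; extend it for other GPUs as needed.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing, e.g. the Quadro T2000 from the post
    (8, 0): (11, 0),  # Ampere
    (8, 6): (11, 1),  # Ampere (consumer/workstation parts)
}


def parse_version(text):
    """Turn '9.0' or '9.0.176' into a (major, minor) tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:2])


def build_supports_gpu(build_cuda, capability):
    """Return True if a build made with CUDA `build_cuda` natively targets the GPU."""
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        raise ValueError("compute capability %r is not in the table" % (capability,))
    return parse_version(build_cuda) >= required


def check_current_install():
    """Same check for the local install; needs torch and a visible CUDA GPU."""
    import torch  # imported lazily so the helpers above work without torch

    if not torch.cuda.is_available() or torch.version.cuda is None:
        return None
    capability = torch.cuda.get_device_capability(0)
    return build_supports_gpu(torch.version.cuda, capability)


if __name__ == "__main__":
    # Values taken from the environment dump in the question.
    print(build_supports_gpu("9.0", (7, 5)))
```

With the values from the question this check comes out negative, which would be consistent with the update helping. It is only a heuristic, though: a mismatch like this does not prove that it caused the cuDNN failure.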

Yes, thank you; updating let me make some progress. I then ran into AT_CHECK errors, but was able to fix them by adding the following to all cpp files:

#ifndef AT_CHECK
#define AT_CHECK TORCH_CHECK
#endif
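
For anyone who would rather not edit each file by hand, here is a minimal Python sketch that applies the same guard automatically. The function names (`add_at_check_shim`, `patch_tree`) are hypothetical. Placing the guard after the last `#include` line is my own assumption, made so that the torch headers are seen first. The sketch only touches `.cpp` files that mention `AT_CHECK` and do not already define it.

```python
from pathlib import Path

# The guard from the post above, as a single block of text.
SHIM = "#ifndef AT_CHECK\n#define AT_CHECK TORCH_CHECK\n#endif\n"


def add_at_check_shim(source):
    """Return `source` with the guard inserted after its last #include line."""
    if "AT_CHECK" not in source or "#define AT_CHECK" in source:
        return source  # file does not use AT_CHECK, or is already patched
    lines = source.splitlines(keepends=True)
    last_include = -1
    for index, line in enumerate(lines):
        if line.lstrip().startswith("#include"):
            last_include = index
    insert_at = last_include + 1  # 0 when the file has no #include at all
    if insert_at > 0 and not lines[insert_at - 1].endswith("\n"):
        lines[insert_at - 1] += "\n"  # keep the guard on its own lines
    lines.insert(insert_at, SHIM)
    return "".join(lines)


def patch_tree(root, suffixes=(".cpp",)):
    """Patch every matching file under `root`; return the paths that changed."""
    patched = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            old = path.read_text(encoding="utf-8")
            new = add_at_check_shim(old)
            if new != old:
                path.write_text(new, encoding="utf-8")
                patched.append(path)
    return patched
```

Calling `patch_tree` on the folder that holds the extension sources would rewrite files in place, so it is safer to run it on a clean git checkout where the changes can be reviewed or reverted. Running it a second time leaves already-patched files untouched.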