cuDNN error with LSTMs and PackedSequences in PyTorch 1.10

I’ve come across a problem when using PackedSequences and LSTMs:

Traceback (most recent call last):
  File "thing.py", line 23, in <module>
    out, hidden = lstm(packed)
  File "/users/mmazuecos/miniconda3/envs/confirm_it/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/users/mmazuecos/miniconda3/envs/confirm_it/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 562, in forward
    return self.forward_packed(input, hx)
  File "/users/mmazuecos/miniconda3/envs/confirm_it/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 554, in forward_packed
    output, hidden = self.forward_impl(input, hx, batch_sizes, max_batch_size, sorted_indices)
  File "/users/mmazuecos/miniconda3/envs/confirm_it/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 529, in forward_impl
    self.num_layers, self.dropout, self.training, self.bidirectional)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

I’ve looked for this error in a couple of places, mainly this discussion and this one. Most of the suggested solutions were to reinstall PyTorch, but that did not help.

The problem does not happen when I disable the cuDNN backend or run on the CPU, but it happens with every compatible combination of PyTorch, CUDA and cuDNN versions I was able to try while cuDNN was enabled.

This is the minimal code that reproduces my problem:

import torch
import torch.nn as nn

from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(1)

lstm = nn.LSTM(3, 3).cuda()
inputs = torch.randn((3,3,3)).to(device='cuda')

packed = pack_padded_sequence(inputs, [3,3,3])
out, hidden = lstm(packed)

print(out)
print(hidden)

My current setup is:

  • PyTorch 1.10.2
  • CUDA 10.2
  • CuDNN 7.6.5
  • Nvidia A30

I ran into the same problem using

  • PyTorch 1.2.0
  • CUDA 10.0
  • CuDNN 7.6.5
  • Nvidia A10

Am I missing something? According to the authors of the code I was originally trying to use, PackedSequences used to run without problems.

Thanks in advance for your help!

Your Ampere GPUs need CUDA 11, but it seems you have installed the CUDA 10 binaries with cuDNN 7.6.5, which is expected to fail.
Install the latest pip wheels or conda binaries with CUDA 11.3 (or 11.5) and check whether you still hit the issue.
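
To confirm which CUDA version an installed binary was built with, and whether it can target the GPU, a small check like the one below may help. It is only a sketch: the helper names (binary_supports_gpu, describe_environment) and the lookup table are made up for illustration and are not part of PyTorch. The table assumes that compute capability 8.0 (e.g. A30) needs at least CUDA 11.0 and 8.6 (e.g. A10) needs at least CUDA 11.1; any capability not listed is simply assumed to be supported.

```python
# Hypothetical helper table: minimum CUDA toolkit release that can target a
# given compute capability. Only the Ampere entries relevant to this thread
# are listed; this is an illustrative assumption, not a PyTorch API.
MIN_CUDA_FOR_CAPABILITY = {
    (8, 0): (11, 0),  # e.g. A100, A30
    (8, 6): (11, 1),  # e.g. A10
}


def parse_version(text):
    """Turn a version string such as '10.2' into the tuple (10, 2)."""
    parts = text.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def binary_supports_gpu(cuda_version, capability):
    """Return True if a build against cuda_version can target the GPU.

    Capabilities missing from the table are assumed to be supported.
    """
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        return True
    return parse_version(cuda_version) >= required


def describe_environment():
    """Print what the installed PyTorch binary was built with, if any."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    print("PyTorch:", torch.__version__)
    print("CUDA used for the build:", torch.version.cuda)
    print("cuDNN:", torch.backends.cudnn.version())
    if torch.version.cuda is not None and torch.cuda.is_available():
        capability = torch.cuda.get_device_capability(0)
        print("GPU:", torch.cuda.get_device_name(0), "capability", capability)
        print("Supported by this build:",
              binary_supports_gpu(torch.version.cuda, capability))


if __name__ == "__main__":
    describe_environment()
```

With the setups from the question, binary_supports_gpu("10.2", (8, 0)) and binary_supports_gpu("10.0", (8, 6)) both come out False, which matches the failures you are seeing.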

Installing PyTorch 1.10.2 with cudatoolkit 11.1 (which was the only 11.x version available on the server I’m using) did the trick!

Thanks a lot for the reply!