Selectively disable LSTM dropout

I have a 2-layer LSTM (torch.nn.LSTM) model with dropout enabled.
For a task I also need the intermediate gradients, which I collect with backward hooks.
But when I call .backward(), I get the following error:

cudnn RNN backward can only be called in training mode

I can bypass this by setting the LSTM layer to .train(), but that also enables dropout.
Can this be fixed?

You can call .train() on the nn.LSTM module alone (leaving the rest of the model in eval mode), or disable cudnn for this layer.
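A minimal sketch of both options, assuming an illustrative 2-layer LSTM (the sizes and variable names are made up for the example). `torch.backends.cudnn.flags` is a context manager that temporarily overrides cudnn settings:

```python
import torch
import torch.nn as nn

# Illustrative 2-layer LSTM with inter-layer dropout.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, dropout=0.5)

# Option 1: put only the LSTM submodule into train mode so that
# cudnn's RNN backward is allowed; the rest of the model stays in eval().
lstm.train()

# Option 2: disable cudnn just for this forward/backward pass, so the
# non-cudnn RNN implementation (which supports backward in eval mode) is used.
x = torch.randn(5, 3, 10, requires_grad=True)
with torch.backends.cudnn.flags(enabled=False):
    out, (h, c) = lstm(x)
    out.sum().backward()

print(x.grad.shape)  # gradients now flow back to the input
```

Note that option 1 re-enables dropout inside the LSTM, which is the follow-up issue discussed below; option 2 avoids that but gives up the cudnn kernels for this pass.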

I would like to ask: how can I make the output of an LSTM network in PyTorch become the input at the next time step?

I need to feed the value output by the LSTM back in as the next input, and then make predictions step by step. How would I write that code?
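One common pattern is an autoregressive loop: run the LSTM one step at a time, carry the hidden state forward, and project the output back to the input size so it can be fed in again. A minimal sketch, with illustrative sizes and a hypothetical `proj` layer (your model would use its own output head):

```python
import torch
import torch.nn as nn

input_size, hidden_size, steps = 8, 16, 5
lstm = nn.LSTM(input_size, hidden_size, num_layers=2, batch_first=True)
proj = nn.Linear(hidden_size, input_size)  # maps hidden state back to input space

x = torch.zeros(1, 1, input_size)  # seed input (e.g. a start token or last observed value)
state = None                       # (h_0, c_0); None means zero-initialized
outputs = []
for _ in range(steps):
    out, state = lstm(x, state)    # out: (1, 1, hidden_size); state is carried forward
    x = proj(out)                  # next input is the projected output
    outputs.append(x)

preds = torch.cat(outputs, dim=1)  # (1, steps, input_size)
```

The key points are passing `state` back into the next call so the LSTM remembers the sequence, and detaching or using `torch.no_grad()` at inference time if you do not need gradients through the loop.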

For the LSTM dropout I use the dropout parameter of nn.LSTM, so when I call .train() on the nn.LSTM, its dropout is enabled as well.

In that case you could still disable cudnn for this layer only, or set the dropout to zero after calling .train() on it.
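A sketch of the second suggestion, assuming the same illustrative module as above. `nn.LSTM` stores the dropout probability as a plain attribute that its forward pass reads, so overwriting it after construction takes effect immediately (this mutates the module, so restore it afterwards if you need dropout again):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, dropout=0.5)

lstm.train()        # keeps cudnn's RNN backward happy
lstm.dropout = 0.0  # but no dropout is actually applied between layers

x = torch.randn(5, 3, 10, requires_grad=True)
out, _ = lstm(x)
out.sum().backward()  # works, and the forward pass was deterministic
```

With dropout at zero, two forward passes on the same input in train mode produce identical outputs, which is an easy way to verify the override worked.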