Selectively disable LSTM dropout

I have a 2-layer LSTM (torch.nn.LSTM) model with dropout enabled.
For a task I also need the intermediate gradients, which I collect with backward hooks.
But when I call .backward(), I get the following error:

cudnn RNN backward can only be called in training mode

I can bypass this by setting the LSTM layer to .train(), but that also enables dropout.
Can this be fixed?

You can call .train() on the nn.LSTM module alone (leaving the rest of the model in eval mode), or disable cudnn for this layer.
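A minimal sketch of both options, assuming an illustrative 2-layer LSTM (the sizes and variable names are made up for the example). `torch.backends.cudnn.flags` is a context manager that temporarily overrides cudnn settings:

```python
import torch
import torch.nn as nn

# Illustrative 2-layer LSTM with inter-layer dropout.
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, dropout=0.5)

# Option 1: put only the LSTM submodule into train mode so that
# cudnn's RNN backward is allowed; the rest of the model stays in eval().
lstm.train()

# Option 2: disable cudnn just for this forward/backward pass, so the
# non-cudnn RNN implementation (which supports backward in eval mode) is used.
x = torch.randn(5, 3, 10, requires_grad=True)
with torch.backends.cudnn.flags(enabled=False):
    out, (h, c) = lstm(x)
    out.sum().backward()

print(x.grad.shape)  # gradients now flow back to the input
```

Note that option 1 re-enables dropout inside the LSTM, which is the follow-up issue discussed below; option 2 avoids that but gives up the cudnn kernels for this pass.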

I would like to ask: how can I make the output of an LSTM network in PyTorch become the input at the next time step?

I need to feed the value output by the LSTM back in as the next input, and then make predictions step by step. How would I write that code?
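One common pattern is an autoregressive loop: run the LSTM one step at a time, carry the hidden state forward, and project the output back to the input size so it can be fed in again. A minimal sketch, with illustrative sizes and a hypothetical `proj` layer (your model would use its own output head):

```python
import torch
import torch.nn as nn

input_size, hidden_size, steps = 8, 16, 5
lstm = nn.LSTM(input_size, hidden_size, num_layers=2, batch_first=True)
proj = nn.Linear(hidden_size, input_size)  # maps hidden state back to input space

x = torch.zeros(1, 1, input_size)  # seed input (e.g. a start token or last observed value)
state = None                       # (h_0, c_0); None means zero-initialized
outputs = []
for _ in range(steps):
    out, state = lstm(x, state)    # out: (1, 1, hidden_size); state is carried forward
    x = proj(out)                  # next input is the projected output
    outputs.append(x)

preds = torch.cat(outputs, dim=1)  # (1, steps, input_size)
```

The key points are passing `state` back into the next call so the LSTM remembers the sequence, and detaching or using `torch.no_grad()` at inference time if you do not need gradients through the loop.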

For the LSTM dropout I use the dropout parameter of nn.LSTM, so when I call .train() on the nn.LSTM, its dropout is enabled as well.

In that case you could still disable cudnn for this layer only, or set the dropout to zero after calling .train() on it.
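A sketch of the second suggestion, assuming the same illustrative module as above. `nn.LSTM` stores the dropout probability as a plain attribute that its forward pass reads, so overwriting it after construction takes effect immediately (this mutates the module, so restore it afterwards if you need dropout again):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, dropout=0.5)

lstm.train()        # keeps cudnn's RNN backward happy
lstm.dropout = 0.0  # but no dropout is actually applied between layers

x = torch.randn(5, 3, 10, requires_grad=True)
out, _ = lstm(x)
out.sum().backward()  # works, and the forward pass was deterministic
```

With dropout at zero, two forward passes on the same input in train mode produce identical outputs, which is an easy way to verify the override worked.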