How does PyTorch LSTM dropout work?

I’m trying to port a model from Keras to PyTorch and I’m having trouble with the LSTM layers.

In Keras, the dropout parameter specifies the "Fraction of the units to drop for the linear transformation of the inputs." Here I believe the linear transformations refer to the input, forget, cell, and output gates.
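For reference, a minimal Keras sketch of what I mean (the layer sizes are arbitrary examples, not from my actual model):

```python
from tensorflow.keras.layers import LSTM

# dropout=0.2 drops 20% of the input units feeding the four gate
# transformations (input, forget, cell, output) at each time step.
layer = LSTM(128, dropout=0.2)
```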

In PyTorch, the dropout parameter appears to specify the dropout applied between LSTM layers. Hence the warning when you add dropout to a one-layer LSTM: there is nothing in between to apply the dropout to.
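A minimal PyTorch sketch of the same point, again with arbitrary sizes:

```python
import torch.nn as nn

# dropout=0.2 is applied to the outputs of each LSTM layer except the
# last, so it only has an effect when num_layers > 1.
stacked = nn.LSTM(input_size=64, hidden_size=128, num_layers=2, dropout=0.2)

# With num_layers=1 there is nothing between layers to drop, and
# PyTorch emits a UserWarning that the dropout value will be ignored.
single = nn.LSTM(input_size=64, hidden_size=128, num_layers=1, dropout=0.2)
```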

What parameters should I give to the LSTM to get the same effect as the dropout parameter in Keras?


Hey @Bjorn_Lindqvist, were you able to figure this out? I am facing exactly the same problem. Thanks!