Does nn.RNN use the same dropout mask for every timestep? If not, how can I make it work that way while still matching the performance of nn.RNN?
This type of dropout performs better according to the following paper, and it is also the form of dropout used in Keras.
No, it doesn’t: PyTorch’s RNNs are thin wrappers around cuDNN, which doesn’t support time-locked dropout masks. You can implement it yourself, though at the cost of reduced speed, since a custom dropout can’t use the optimized cuDNN kernel.
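A minimal sketch of what such a "locked" dropout module could look like (the class name `LockedDropout` and the assumed input layout `(seq_len, batch, features)`, matching nn.RNN's default, are my own choices, not part of PyTorch):

```python
import torch

class LockedDropout(torch.nn.Module):
    """Time-locked (variational) dropout: sample one Bernoulli mask per
    sequence and reuse it at every timestep, instead of resampling per step."""

    def __init__(self, p=0.5):
        super().__init__()
        self.p = p

    def forward(self, x):
        # x is assumed to have shape (seq_len, batch, features)
        if not self.training or self.p == 0.0:
            return x
        # Sample a mask of shape (1, batch, features); broadcasting over
        # the first dimension applies the same mask at every timestep.
        mask = x.new_empty(1, x.size(1), x.size(2)).bernoulli_(1 - self.p)
        # Inverted-dropout scaling so expected activations are unchanged.
        return x * mask / (1 - self.p)
```

You would apply this to the RNN's input (and/or between stacked RNN layers constructed as separate single-layer nn.RNN modules), rather than passing `dropout=` to nn.RNN itself.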