Dropout in bidirectional nn.LSTM

Hi,

I'd like to know how dropout is placed in a bidirectional nn.LSTM. Is dropout applied to each direction separately, or to the output of each (bidirectional) RNN layer as a whole?

For example, if I set 3 layers:

bi_lstm = nn.LSTM(1280, 256, num_layers=3, dropout=0.5, bidirectional=True)

will it end up with 5 dropouts, like this:

LSTM_L0_Forward
Dropout_L0_Forward
LSTM_L0_Reverse
Dropout_L0_Reverse
Cat(L0_Forward, L0_Reverse)
LSTM_L1_Forward
Dropout_L1_Forward
LSTM_L1_Reverse
Dropout_L1_Reverse
Cat(L1_Forward, L1_Reverse)
LSTM_L2_Forward
Dropout_L2_Forward
LSTM_L2_Reverse
Cat(L2_Forward, L2_Reverse)

or 2 dropouts, like this:

LSTM_L0_Forward
LSTM_L0_Reverse
Cat(L0_Forward, L0_Reverse)
Dropout_Cat_L0
LSTM_L1_Forward
LSTM_L1_Reverse
Cat(L1_Forward, L1_Reverse)
Dropout_Cat_L1
LSTM_L2_Forward
LSTM_L2_Reverse
Cat(L2_Forward, L2_Reverse)

or like this:

LSTM_L0_Forward
LSTM_L0_Reverse
Dropout_L0_Reverse
Cat(L0_Forward, L0_Reverse)
LSTM_L1_Forward
LSTM_L1_Reverse
Dropout_L1_Reverse
Cat(L1_Forward, L1_Reverse)
LSTM_L2_Forward
LSTM_L2_Reverse
Cat(L2_Forward, L2_Reverse)
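
To make the second layout concrete (dropout on the concatenated output of every layer except the last), here is how it could be written out by hand. This is only a sketch of that hypothesis, not nn.LSTM's actual internals: the class name `ManualBiLSTM` and the small input in the usage lines are made up for illustration. The nn.LSTM documentation itself only states that dropout is applied to the outputs of each LSTM layer except the last.

```python
import torch
import torch.nn as nn


class ManualBiLSTM(nn.Module):
    """Hypothetical hand-built stack mirroring the second layout."""

    def __init__(self, input_size, hidden_size, num_layers, dropout):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            # Layers after the first consume Cat(forward, reverse),
            # so their input size is 2 * hidden_size.
            in_size = input_size if i == 0 else 2 * hidden_size
            self.layers.append(nn.LSTM(in_size, hidden_size, bidirectional=True))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            # A bidirectional nn.LSTM already returns the concatenated
            # forward/reverse outputs along the feature dimension.
            x, _ = layer(x)
            # Dropout on the concatenated output, skipped after the last layer.
            if i < len(self.layers) - 1:
                x = self.dropout(x)
        return x


model = ManualBiLSTM(1280, 256, num_layers=3, dropout=0.5)
out = model(torch.randn(7, 2, 1280))  # (seq_len, batch, features)
print(out.shape)  # torch.Size([7, 2, 512])
```

With 3 layers this applies dropout exactly twice per forward pass, once after layer 0 and once after layer 1.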

Can anyone answer this?
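
In case it helps, below is a small probe of `bi_lstm` that can be run on CPU. It does not distinguish the three layouts on its own. It only checks two things: that the input weights of layer 1 expect 2 * 256 = 512 features (so the forward and reverse outputs are concatenated before the next layer), and that dropout is active in training mode but not in eval mode. The input tensor size (7 time steps, batch of 2) is an arbitrary choice for the test.

```python
import torch
import torch.nn as nn

bi_lstm = nn.LSTM(1280, 256, num_layers=3, dropout=0.5, bidirectional=True)

# Input-to-hidden weights of layer 1, both directions: 4 gates * 256 rows,
# 512 columns, i.e. the concatenated forward/reverse output of layer 0.
print(bi_lstm.weight_ih_l1.shape)          # torch.Size([1024, 512])
print(bi_lstm.weight_ih_l1_reverse.shape)  # torch.Size([1024, 512])

x = torch.randn(7, 2, 1280)  # (seq_len, batch, features), arbitrary sizes

# Eval mode: dropout is disabled, so two passes over the same input agree.
bi_lstm.eval()
with torch.no_grad():
    eval_same = torch.allclose(bi_lstm(x)[0], bi_lstm(x)[0])

# Train mode: a fresh dropout mask is sampled on each pass, so the
# outputs are expected to differ.
bi_lstm.train()
with torch.no_grad():
    train_same = torch.allclose(bi_lstm(x)[0], bi_lstm(x)[0])

print(eval_same, train_same)
```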