Dropout in RNN for each layer

I'm currently reading the C++ implementation of RNN in ATen (https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/RNN.cpp). Based on the following implementation, is my understanding correct that dropout is applied to the output of every layer (except the last), with the same mask for every time step?

template<typename io_type, typename hidden_type, typename weight_type>
LayerOutput<io_type, std::vector<hidden_type>>
apply_layer_stack(const Layer<io_type, hidden_type, weight_type>& layer, const io_type& input,
                  const std::vector<hidden_type>& hiddens, const std::vector<weight_type>& weights,
                  int64_t num_layers, double dropout_p, bool train) {
  AT_CHECK(num_layers == hiddens.size(), "Expected more hidden states in stacked_rnn");
  AT_CHECK(num_layers == weights.size(), "Expected more weights in stacked_rnn");

  auto layer_input = input;
  auto hidden_it = hiddens.begin();
  auto weight_it = weights.begin();
  std::vector<hidden_type> final_hiddens;
  for (int64_t l = 0; l < num_layers; ++l) {
    auto layer_output = layer(layer_input, *(hidden_it++), *(weight_it++));
    final_hiddens.push_back(layer_output.final_hidden);
    layer_input = layer_output.outputs;

    if (dropout_p != 0 && train && l < num_layers - 1) {
      layer_input = dropout(layer_input, dropout_p);
    }
  }

  return {layer_input, final_hiddens};
}

I had the same question until I saw this post: Dropout in nn.RNN
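
For what it's worth, here is a minimal standalone sketch of what the loop in apply_layer_stack appears to do. It does not use ATen: `Sequence`, `dropout_sequence` and `toy_layer_stack` are hypothetical names, each "layer" is just the identity, and the sketch assumes that `dropout` draws an independent Bernoulli sample per element of whatever tensor it receives. Under that assumption, because each layer hands its whole output sequence to `dropout` at once, the mask would differ from one time step to the next rather than being shared.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the [time][feature] output of one RNN layer.
using Sequence = std::vector<std::vector<float>>;

// Element-wise inverted dropout over the whole sequence: one Bernoulli draw
// per element, so no mask is shared between time steps (assumed behavior).
Sequence dropout_sequence(const Sequence& seq, double p, std::mt19937& rng) {
  assert(p >= 0.0 && p < 1.0);
  std::bernoulli_distribution keep(1.0 - p);
  const float scale = static_cast<float>(1.0 / (1.0 - p));
  Sequence out = seq;
  for (auto& step : out) {
    for (auto& x : step) {
      x = keep(rng) ? x * scale : 0.0f;
    }
  }
  return out;
}

// Toy version of the stacking loop. The "layer" is the identity, which is an
// assumption made only to keep the effect of dropout visible.
Sequence toy_layer_stack(const Sequence& input, int num_layers,
                         double dropout_p, bool train, std::mt19937& rng) {
  Sequence layer_input = input;
  for (int l = 0; l < num_layers; ++l) {
    Sequence layer_output = layer_input;  // hypothetical identity layer
    layer_input = layer_output;
    // Same condition as in the quoted source: no dropout after the last layer.
    if (dropout_p != 0 && train && l < num_layers - 1) {
      layer_input = dropout_sequence(layer_input, dropout_p, rng);
    }
  }
  return layer_input;
}
```

Calling `toy_layer_stack` with two layers, `dropout_p = 0.5` and an all-ones input should give a sequence whose elements are each either 0 or 2, with the zeros landing in different positions at different time steps. With a single layer, in eval mode, or with `dropout_p = 0`, the input comes back unchanged.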