How to keep the hidden state of the last layer of a packed GRU

Hi everybody,

I am playing with seq2seq for NMT, and I am trying to add several layers to my working GRU model. Unfortunately, I see that the dimensions of the hidden state tensor are impacted by the number of layers.

If I want to keep only the last GRU layer's hidden state, I need to truncate my h_n tensor… but I am lost as to how to do it… :face_with_hand_over_mouth:

Or, alternatively, how do I concatenate the hidden states of the multiple layers to get a final hidden state vector with the same shape as when I was using a single-layer GRU?
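
To illustrate what I mean about the shapes, here is a standalone toy example (not my actual model, just nn.GRU on a random tensor):

import torch
import torch.nn as nn

# toy bidirectional GRU with 3 layers
gru = nn.GRU(input_size=8, hidden_size=16, num_layers=3,
             bidirectional=True, batch_first=True)

x = torch.randn(4, 10, 8)  # (batch, seq_len, input_size)
out, h_n = gru(x)

print(out.shape)  # torch.Size([4, 10, 32]) -> (batch, seq_len, num_directions * hidden_size)
print(h_n.shape)  # torch.Size([6, 4, 16])  -> (num_layers * num_directions, batch, hidden_size)

With a single layer, h_n would have shape (2, 4, 16), which is what the rest of my model expects.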

Thanks in advance

Best regards

Jerome

def forward(self, x_source, x_lengths):
    """
    Forward pass.
    :param x_source: input batch of sequences
    :param x_lengths: length of each sequence
    :return: x_unpacked, x_birnn_h
    """

    # apply the embedding on the input sequences
    x_embedded = self._source_embedding(x_source)

    # create the packed sequences structure
    x_packed = pack_padded_sequence(
        x_embedded, 
        x_lengths.detach().cpu().numpy(),
        batch_first=True
    )

    # apply the rnn
    x_birnn_out, x_birnn_h = self._birnn(x_packed)
    if self._num_layers > 1:
        # TODO: keep only the last layer's hidden state (see my attempt after the code)
        pass

    # permute the hidden state to (batch, num_layers * num_directions, hidden_size)
    x_birnn_h = x_birnn_h.permute(1, 0, 2)
    # flatten it to (batch, num_layers * num_directions * hidden_size)
    x_birnn_h = x_birnn_h.contiguous().view(x_birnn_h.size(0), -1)

    # unpack the sequences before returning them
    x_unpacked, _ = pad_packed_sequence(x_birnn_out, batch_first=True)

    return x_unpacked, x_birnn_h
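
For the TODO above, this is the kind of slicing I had in mind, but I am not sure it is correct (it assumes the GRU is bidirectional, so the last two rows of h_n are the forward and backward hidden states of the top layer):

# h_n is (num_layers * num_directions, batch, hidden_size);
# the last num_directions rows belong to the last (top) layer
num_directions = 2 if self._birnn.bidirectional else 1
x_birnn_h = x_birnn_h[-num_directions:]

After the existing permute + view, that would give x_birnn_h a shape of (batch, num_directions * hidden_size), i.e. the same as with a single-layer GRU. Is that the right way to do it, or should the layers be concatenated instead?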