Adding additional features to an embedded padded sequence

bash94 · May 16, 2020, 7:15pm

Hello everyone,

I would like to add additional features to an embedded padded sequence before I pass the data to an RNN as a packed sequence.

For example:

Let’s say I have the following embedding layer and a simple bidirectional RNN:

import torch
import torch.nn as nn

embedding = nn.Embedding(5, 5, padding_idx=0)
rnn = nn.RNN(5, 5, batch_first=True, bidirectional=True)

and let’s say I have the following input:

input = torch.tensor([[1,2,3,4,0], [1,2,3,0,0]], dtype=torch.long)
lengths = torch.tensor([4, 3], dtype=torch.long)

To pass a packed sequence to the RNN, I would do the following:

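from torch.nn.utils.rnn import pack_padded_sequence

# padding_idx is set on the nn.Embedding constructor; the forward call takes only the indices
embedded_seqs = embedding(input)
packed_seqs = pack_padded_sequence(embedded_seqs, lengths, batch_first=True)
packed_output, hidd = rnn(packed_seqs)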
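
For reference, here is a minimal, self-contained sketch of that baseline pipeline, with the shapes checked at each stage. It assumes a recent PyTorch version; the `pad_packed_sequence` call at the end is not part of the snippet above and is included only to inspect the padded output.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

embedding = nn.Embedding(5, 5, padding_idx=0)
rnn = nn.RNN(5, 5, batch_first=True, bidirectional=True)

input = torch.tensor([[1, 2, 3, 4, 0], [1, 2, 3, 0, 0]], dtype=torch.long)
lengths = torch.tensor([4, 3], dtype=torch.long)  # must live on the CPU

embedded_seqs = embedding(input)  # shape (2, 5, 5)
packed_seqs = pack_padded_sequence(embedded_seqs, lengths, batch_first=True)
packed_output, hidd = rnn(packed_seqs)

# The packed data keeps only the real time steps: 4 + 3 = 7 rows.
print(packed_seqs.data.shape)

# Unpack to a padded tensor for inspection (added here for illustration).
# Without total_length, the result is padded to the longest sequence (4).
output, out_lengths = pad_packed_sequence(packed_output, batch_first=True)
print(output.shape)  # (batch, max_len, 2 * hidden_size)
print(hidd.shape)    # (num_directions, batch, hidden_size)
```

The padded position of the shorter sequence comes back as zeros in `output`, since the RNN never processed it.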

Now let’s say I have additional features that I would like to add to embedded_seqs before I create packed_seqs:

additional_features = torch.randn(2, 5, 5)

What is the best way to do this? Should I do something like this:

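# the concatenated input has 5 + 5 = 10 features, so the RNN needs input_size=10
rnn = nn.RNN(10, 5, batch_first=True, bidirectional=True)

embedded_seqs = embedding(input)
all_features = torch.cat((embedded_seqs, additional_features), dim=2)
packed_seqs = pack_padded_sequence(all_features, lengths, batch_first=True)
packed_output, hidd = rnn(packed_seqs)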

I think that if I do the above, any additional features at the padded time steps will be ignored once the sequence is packed. Is there a better way to do this? Thank you so much!
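
One way to check that assumption is the sketch below. It runs the concatenate-then-pack approach twice, once with the original additional features and once with the features at the padded positions overwritten, and compares the results. The helper `run`, the `mask` tensor, and the value 100.0 are my own illustrative choices, and the RNN is assumed to take 10 input features (5 embedding dimensions plus 5 additional ones).

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

embedding = nn.Embedding(5, 5, padding_idx=0)
rnn = nn.RNN(10, 5, batch_first=True, bidirectional=True)  # 5 + 5 input features

input = torch.tensor([[1, 2, 3, 4, 0], [1, 2, 3, 0, 0]], dtype=torch.long)
lengths = torch.tensor([4, 3], dtype=torch.long)
additional_features = torch.randn(2, 5, 5)


def run(features):
    """Hypothetical helper: concatenate, pack, run the RNN, unpack."""
    embedded_seqs = embedding(input)
    all_features = torch.cat((embedded_seqs, features), dim=2)  # (2, 5, 10)
    packed = pack_padded_sequence(all_features, lengths, batch_first=True)
    packed_output, hidd = rnn(packed)
    output, _ = pad_packed_sequence(packed_output, batch_first=True)
    return output, hidd


# True at real time steps, False at padded ones; shape (2, 5).
mask = torch.arange(input.size(1))[None, :] < lengths[:, None]

# Overwrite the additional features at the padded steps only.
perturbed = additional_features.clone()
perturbed[~mask] = 100.0

with torch.no_grad():
    out_a, hid_a = run(additional_features)
    out_b, hid_b = run(perturbed)

# If packing drops the padded steps, both runs should agree.
same = torch.allclose(out_a, out_b) and torch.allclose(hid_a, hid_b)
print(same)
```

If `same` comes out as `True`, the features at the padded steps never reach the RNN, which suggests that concatenating along the feature dimension before packing is a workable approach as long as the additional features are aligned with the real time steps.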