Like a few other posts on this board, I’m trying to understand pack_padded_sequence. Here’s a simple example:
>>> import torch
>>> from torch.nn.utils.rnn import pad_packed_sequence, pack_padded_sequence
>>> x = torch.LongTensor([[1,2,3], [1,0,0]]).view(2, 3, 1)
>>> print(x)
(0 ,.,.) =
  1
  2
  3

(1 ,.,.) =
  1
  0
  0
[torch.LongTensor of size 2x3x1]
>>> lens = [3, 1]
>>> y = pack_padded_sequence(x, lens, batch_first=True)
>>> print(y)
PackedSequence(data=
  1
  1
  2
  3
[torch.LongTensor of size 4x1]
, batch_sizes=[2, 1, 1])
Why is batch_sizes [2, 1, 1] rather than [3, 1]? Am I wrong in thinking that batch_sizes should hold the lengths of my sequences? I haven’t been able to find a five-line example that clearly demonstrates and explains the proper usage of pack_padded_sequence. Any short example with a quick explanation would be extremely helpful.
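For context, here is the round trip I’m ultimately trying to do, continuing the same session. This is just a sketch of my current understanding, assuming pad_packed_sequence is meant to invert pack_padded_sequence:

>>> # unpack y back into a padded tensor plus per-sequence lengths
>>> unpacked, unpacked_lens = pad_packed_sequence(y, batch_first=True)
>>> # I expected to get x back here, along with lengths [3, 1], which is
>>> # why batch_sizes == [2, 1, 1] in the packed form surprised me.

If someone could explain what batch_sizes actually counts in that round trip, that would clear things up.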