Packed sequence strange behaviour

Hello everyone!

I can’t quite understand how packed sequences work. If we look at the example from the docs:

import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([1, 2])
c = torch.tensor([1])

torch.nn.utils.rnn.pack_sequence([a, b, c])

it gives us something like this:

PackedSequence(data=tensor([ 1, 1, 1, 2, 2, 3]), batch_sizes=tensor([ 3, 2, 1]))

But when I change just one tensor, the result is completely different:

a = torch.zeros(5)
b = torch.tensor([1, 2])
c = torch.tensor([1])

torch.nn.utils.rnn.pack_sequence([a, b, c])

PackedSequence(data=tensor([ 0., 1., 1., 0., 2., 0., 0., 0.]), batch_sizes=tensor([ 3, 2, 1, 1, 1]))

What’s going on? Why do we get a batch_sizes with 5 elements when we gave only 3 tensors?

The batch_sizes tensor is telling you that (see the sketch after this list):

  • Your longest sequence has length 5 (tensor a has five entries), which is why batch_sizes itself has five entries.
  • Three of your sequences contain data at the 0th position.
  • Two of your sequences contain data at the 1st position.
  • One of your sequences contains data at the 2nd position.
  • One of your sequences contains data at the 3rd position.
  • One of your sequences contains data at the 4th position.

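In other words, batch_sizes[t] counts how many of your sequences still have an element at time step t. Here is a minimal sketch that recomputes batch_sizes from the lengths by hand and checks it against pack_sequence (using float tensors throughout so the dtypes match):

import torch
from torch.nn.utils.rnn import pack_sequence

seqs = [torch.zeros(5), torch.tensor([1., 2.]), torch.tensor([1.])]
lengths = torch.tensor([len(s) for s in seqs])  # tensor([5, 2, 1])

# batch_sizes[t] = number of sequences that still have an element at step t
batch_sizes = torch.tensor([(lengths > t).sum().item() for t in range(int(lengths.max()))])

print(batch_sizes)                      # tensor([3, 2, 1, 1, 1])
print(pack_sequence(seqs).batch_sizes)  # tensor([3, 2, 1, 1, 1])
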
And indeed, all of those statements check out against your inputs:

  • Your longest tensor is a, consisting of five 0s. (Note the presence and location of those five 0s in the data tensor.)
  • Your tensor b has only 2 elements.
  • Your tensor c has only 1 element.
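
The data tensor, for its part, is just the sequences read one time step at a time (all the step-0 elements, then all the step-1 elements, and so on), skipping sequences that have already ended. A sketch of the same packing done by hand, again with float tensors:

import torch
from torch.nn.utils.rnn import pack_sequence

a, b, c = torch.zeros(5), torch.tensor([1., 2.]), torch.tensor([1.])

# t=0:       a[0], b[0], c[0]  ->  0., 1., 1.
# t=1:       a[1], b[1]        ->  0., 2.
# t=2, 3, 4: a[2], a[3], a[4]  ->  0., 0., 0.
by_hand = torch.cat([a[:1], b[:1], c[:1], a[1:2], b[1:2], a[2:]])

print(by_hand)                        # tensor([0., 1., 1., 0., 2., 0., 0., 0.])
print(pack_sequence([a, b, c]).data)  # tensor([0., 1., 1., 0., 2., 0., 0., 0.])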

Most importantly (and most intuitively): batch_sizes are not lengths.
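
If what you actually want are the lengths, you can recover them by unpacking: pad_packed_sequence undoes the packing and returns the original lengths alongside the padded batch. Equivalently, the length of sequence i is the number of time steps whose batch size is still greater than i. A sketch:

import torch
from torch.nn.utils.rnn import pack_sequence, pad_packed_sequence

packed = pack_sequence([torch.zeros(5), torch.tensor([1., 2.]), torch.tensor([1.])])

# pad_packed_sequence reports the original lengths alongside the padded batch
padded, lengths = pad_packed_sequence(packed)
print(lengths)  # tensor([5, 2, 1])

# equivalently: length of sequence i = number of steps with batch_sizes > i
# (3 below is the number of sequences in the batch)
print((packed.batch_sizes.unsqueeze(1) > torch.arange(3)).sum(dim=0))  # tensor([5, 2, 1])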