Implementation of the torch `pad_sequence` function seems like it will generate errors?

I was looking at the implementation of the `torch.nn.utils.rnn.pad_sequence` function, which pads each sequence with a specified padding value up to the length of the longest sequence in the batch.

Perhaps I am not understanding something, but won’t this implementation create problems because different batches may contain different-length sequences? If I have a bunch of text sequences, batch 1 might have a longest sequence of length 10, while batch 2 might have a longest sequence of length 30. Won’t the neural network expect fixed-size inputs? Hence I was wondering why the function was written this way, instead of letting the user specify a fixed sequence length and then padding any sequence shorter than that length.

Is there a different version of this padding function in PyTorch that allows the user to specify the sequence length, and hence avoid these errors with differently sized batches? Thanks.
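For example, something along the lines of the following sketch is what I have in mind. (`pad_to_fixed_length` is just a hypothetical helper I’m imagining, not an existing PyTorch function; it combines `pad_sequence` with extra padding along the time dimension.)

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_to_fixed_length(sequences, max_len, padding_value=0.0):
    # Hypothetical helper: pad every sequence in the batch to the same
    # fixed length, instead of only the batch-local maximum.
    # First pad to the longest sequence in the batch ...
    padded = pad_sequence(sequences, batch_first=True, padding_value=padding_value)
    # ... then extend the time dimension out to max_len if needed.
    if padded.size(1) < max_len:
        extra = padded.new_full(
            (padded.size(0), max_len - padded.size(1), *padded.shape[2:]),
            padding_value,
        )
        padded = torch.cat([padded, extra], dim=1)
    return padded

# Three variable-length sequences padded to a fixed length of 10.
seqs = [torch.randn(4, 8), torch.randn(7, 8), torch.randn(2, 8)]
batch = pad_to_fixed_length(seqs, max_len=10)
print(batch.shape)  # torch.Size([3, 10, 8])
```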


I know this question is about 4 years old by now, but I’ve been searching around and finding very few answers on this! Curious to know as well!

Basically all sequence models, mainly RNNs and Transformers, require only that the sequences within the same batch are of equal length, so that the resulting tensors are “full”.

However, different batches may have different maximum lengths.
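A quick sketch to illustrate (sizes are arbitrary): the same recurrent layer happily consumes batches that were padded to different lengths, because only the time dimension changes between batches.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

# A small GRU just for illustration; other RNNs and Transformer layers
# behave the same way with respect to a varying time dimension.
rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

# Batch 1: longest sequence has length 10, so the batch is padded to 10.
batch1 = pad_sequence([torch.randn(10, 8), torch.randn(4, 8)], batch_first=True)
# Batch 2: longest sequence has length 30, so the batch is padded to 30.
batch2 = pad_sequence([torch.randn(30, 8), torch.randn(12, 8)], batch_first=True)

out1, _ = rnn(batch1)  # shape: (2, 10, 16)
out2, _ = rnn(batch2)  # shape: (2, 30, 16)
print(out1.shape, out2.shape)
```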