You could pack input tensors of different lengths using e.g. torch.nn.utils.rnn.pack_sequence, feed the resulting PackedSequence to your RNN, and later pad the output back to the longest sequence via torch.nn.utils.rnn.pad_packed_sequence.
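A minimal sketch of that workflow (the sequence lengths and GRU sizes here are arbitrary, just for illustration):

```python
import torch
from torch.nn.utils.rnn import pack_sequence, pad_packed_sequence

# Three sequences of different lengths, shape (seq_len, features)
seqs = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]

# pack_sequence expects sequences sorted by decreasing length
# unless enforce_sorted=False is passed
packed = pack_sequence(seqs, enforce_sorted=False)

# A PackedSequence can be fed directly to an RNN
rnn = torch.nn.GRU(input_size=4, hidden_size=8)
packed_out, h = rnn(packed)

# Pad the output back to the longest sequence;
# the original lengths are recovered as well
padded, lengths = pad_packed_sequence(packed_out)
print(padded.shape)  # torch.Size([5, 3, 8]) -> (max_seq_len, batch, hidden)
print(lengths)       # tensor([5, 3, 2])
```

Note that `padded` is zero-padded past each sequence's true length, so you usually mask those positions before computing a loss.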
@vdw has also posted an approach that avoids padding entirely by sorting the input sequences and batching sequences of equal length together.