I was trying to use the built-in padding function, but for some reason it wasn't padding my sequences. Here is my reproducible code:
import torch
def padding_batched_embedding_seq():
    ## 3 sequences with embedding of size 5
    a = torch.ones(1, 4, 5)  # seq len 4 (so 4 tokens)
    b = torch.ones(1, 3, 5)  # seq len 3 (so 3 tokens)
    c = torch.ones(1, 2, 5)  # seq len 2 (so 2 tokens)
    ##
    sequences = [a, b, c]
    batch = torch.nn.utils.rnn.pad_sequence(sequences)

if __name__ == '__main__':
    padding_batched_embedding_seq()
The error message:
Traceback (most recent call last):
  File "padding.py", line 51, in <module>
    padding_batched_embedding_seq()
  File "padding.py", line 40, in padding_batched_embedding_seq
    batch = torch.nn.utils.rnn.pad_sequence(sequences)
  File "/Users/rene/miniconda3/envs/automl/lib/python3.7/site-packages/torch/nn/utils/rnn.py", line 376, in pad_sequence
    out_tensor[:length, i, ...] = tensor
RuntimeError: The expanded size of the tensor (4) must match the existing size (3) at non-singleton dimension 1. Target sizes: [1, 4, 5]. Tensor sizes: [3, 5]
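For reference, this minimal sketch (my assumption of what `pad_sequence` expects, based on the error: each tensor shaped `(seq_len, *)` with matching trailing dims, i.e. without the leading singleton batch dim) does pad without error:

```python
import torch

# pad_sequence treats dim 0 of each tensor as the sequence length and
# requires all trailing dims to match, so pass (seq_len, embedding_dim)
# tensors instead of (1, seq_len, embedding_dim).
a = torch.ones(4, 5)  # seq len 4, embedding dim 5
b = torch.ones(3, 5)  # seq len 3
c = torch.ones(2, 5)  # seq len 2

batch = torch.nn.utils.rnn.pad_sequence([a, b, c])
print(batch.shape)  # torch.Size([4, 3, 5]) -> (max_seq_len, batch, embedding)
```

But I don't understand why the leading dimension of 1 breaks it in my original code.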
Any idea what I'm doing wrong?