# How to perform repeat padding for variable length data?

I have variable-length data and want to pack it into batches, padding each batch to the length of its longest sample by repeating the shorter samples.
For example like this:
[[0, 1, 2, 3, 4], [0, 1, 2]] => [[0, 1, 2, 3, 4], [0, 1, 2, 0, 1]]

You could use some rnn util functions:

``````
import torch

x = [torch.tensor([0, 1, 2, 3, 4]), torch.tensor([0, 1, 2])]
packed = torch.nn.utils.rnn.pack_sequence(x)
out = torch.nn.utils.rnn.pad_packed_sequence(packed, batch_first=True)
print(out)
> (tensor([[0, 1, 2, 3, 4],
           [0, 1, 2, 0, 0]]), tensor([5, 3]))
``````

Where the first return value will be the padded tensor, while the second will give you the lengths before padding.

Oh sorry, I apparently missed the most important part of the question.
I’m not sure if there is a built-in function for this, but this code snippet should work:

``````
import torch

x = [torch.tensor([0, 1, 2, 3, 4]), torch.tensor([0, 1, 2])]
max_len = max([t.size(0) for t in x])
res = [torch.cat((t, t[:max_len - t.size(0)])) for t in x]
``````
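
For the example input above this gives:

``````
print(res)
> [tensor([0, 1, 2, 3, 4]), tensor([0, 1, 2, 0, 1])]
``````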

With repeat padding my attention is even worse, since there is now non-zero data in the padding. I tried to build a mask, but got NaN in the softmax.

Could you explain your use case a bit more regarding the NaN output in softmax?

Well, I wanted to mask the attention along the query axis as well.
But with the default attn_mask setup this causes NaNs.
Google says it’s because a row contains only -inf values.
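
A minimal standalone reproduction of that behaviour (not the actual model): if every entry of a softmax row is -inf, the denominator becomes zero and the result is NaN:

``````
import torch

# a row that is entirely -inf has exp() summing to zero, so softmax computes 0 / 0
scores = torch.tensor([[0.0, float('-inf')],
                       [float('-inf'), float('-inf')]])
print(torch.softmax(scores, dim=-1))
> tensor([[1., 0.],
          [nan, nan]])
``````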

Now I edited the source code of the multi-head attention forward like this:

``````
if attn_mask is not None:
    attn_output_weights = attn_output_weights.view(bsz, num_heads, tgt_len, src_len)
    # ... attn_mask is applied to the reshaped weights here (the line is not shown in this snippet) ...
    attn_output_weights = attn_output_weights.view(bsz * num_heads, tgt_len, src_len)

attn_output_weights = softmax(
    attn_output_weights, dim=-1)
``````
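
For reference, a sketch of what that masking step could look like, assuming a boolean mask of shape `(bsz, tgt_len, src_len)` where `True` marks positions to ignore (the shapes and the mask below are assumptions, not the actual edit):

``````
import torch
import torch.nn.functional as F

bsz, num_heads, tgt_len, src_len = 2, 4, 3, 5
attn_output_weights = torch.randn(bsz * num_heads, tgt_len, src_len)

# hypothetical boolean mask: True = padded key position to ignore
attn_mask = torch.zeros(bsz, tgt_len, src_len, dtype=torch.bool)
attn_mask[:, :, -1] = True  # e.g. the last key position is padding

w = attn_output_weights.view(bsz, num_heads, tgt_len, src_len)
w = w.masked_fill(attn_mask.unsqueeze(1), float('-inf'))  # broadcast over heads
attn_output_weights = w.view(bsz * num_heads, tgt_len, src_len)

attn_output_weights = F.softmax(attn_output_weights, dim=-1)
``````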

``````
def get_mask_from_lengths_3d(batch_size, lengths_query, lengths_key, nheads):
    # assumed reconstruction (the start of this line was truncated):
    # a zero mask with one entry per (batch, key step, query step)
    mask = torch.zeros(batch_size, lengths_key.max(), lengths_query.max()).cuda()

    # mark padded key positions
    max_len = torch.max(lengths_key).item()
    ids = torch.arange(0, max_len, out=torch.cuda.LongTensor(max_len))
    mask[ids > lengths_key.unsqueeze(1) - 1] = 1

    # mark padded query positions
    max_len = torch.max(lengths_query).item()
    ids = torch.arange(0, max_len, out=torch.cuda.LongTensor(max_len))
    mask[ids > lengths_query.unsqueeze(1) - 1] = 1

    # assumed reconstruction (this line was also truncated):
    # a square query-by-query mask repeated over the batch
    sz = lengths_query.max().item()
    mask = torch.zeros(sz, sz).unsqueeze(0).repeat(batch_size, 1, 1)

    ids = torch.arange(0, sz, out=torch.cuda.LongTensor(sz))
    mask[ids > lengths_query.unsqueeze(1) - 1] = 1
``````

The alignment for one layer seems to be right.

With the mask value float('-inf') it becomes NaN immediately.
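
A common workaround (just a sketch, not necessarily what was tried here) is to use a large negative finite value instead of -inf, so that a fully masked row produces finite, roughly uniform weights instead of NaN:

``````
import torch

scores = torch.zeros(2, 3)
mask = torch.tensor([[False, True, True],
                     [True,  True, True]])  # second row fully masked

# -1e9 underflows to zero probability where possible, but never gives 0 / 0
weights = torch.softmax(scores.masked_fill(mask, -1e9), dim=-1)
print(weights)  # no NaNs; the fully masked row becomes uniform
``````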

How should I zero-pad 2D data?

``````
import torch

out = {}
x = [torch.randn(10, 10), torch.randn(5, 5)]
x = torch.nn.utils.rnn.pack_sequence(x, enforce_sorted=False)
``````

RuntimeError: The expanded size of the tensor (10) must match the existing size (5) at non-singleton dimension 1. Target sizes: [5, 10]. Tensor sizes: [5, 5]

In your example `dim1` should be equal, so you could pad the second tensor with `F.pad`:

``````
import torch
import torch.nn.functional as F

F.pad(torch.randn(5, 5), (2, 3, 0, 0))
``````

Note that I’ve used a padding of 2 and 3 for the “left” and “right” side of `dim1`, but you could of course also pad only on one side with 5 values or choose any other valid configuration.
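
For instance, to batch the two tensors from the failing example, one option (a sketch; one of many valid configurations) is to pad only on the “right” side of each dimension and then stack:

``````
import torch
import torch.nn.functional as F

a = torch.randn(10, 10)
b = torch.randn(5, 5)

# the pad tuple is (left, right, top, bottom): grow dim1 and dim0 from 5 to 10
b_padded = F.pad(b, (0, 5, 0, 5))
batch = torch.stack([a, b_padded])
print(batch.shape)
> torch.Size([2, 10, 10])
``````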


I finally tried your snippet, but it does not work if one sample is more than twice as long as another.
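
A sketch that would handle that case, assuming 1D tensors (`repeat_pad` is a hypothetical helper, not an existing util): indexing each sequence with a modulo index cycles through it as many times as needed, regardless of the length ratio:

``````
import torch

def repeat_pad(seqs):
    # repeat-pad every 1D tensor to the length of the longest one
    # by cycling through its own values
    max_len = max(t.size(0) for t in seqs)
    idx = torch.arange(max_len)
    return torch.stack([t[idx % t.size(0)] for t in seqs])

x = [torch.tensor([0, 1, 2, 3, 4, 5, 6]), torch.tensor([0, 1, 2])]
print(repeat_pad(x))
> tensor([[0, 1, 2, 3, 4, 5, 6],
          [0, 1, 2, 0, 1, 2, 0]])
``````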