How can I pad packed data based on sequence lengths?

I have packed data and the length of each sequence.
Example:

data = torch.tensor([4, 1, 3, 5, 2, 6])
lengths = torch.tensor([2,1,3])

I want to create a padded 2-D matrix of shape (batch_size, max_length), like:

output = torch.tensor([[4,1,0],  # length=2
                       [3,0,0],  # length=1
                       [5,2,6]]) # length=3

Since I need this for training, the operation must also propagate gradients back to data when data.requires_grad=True.

torch.nn.utils.rnn.pad_sequence should work once the packed data is split into one chunk per sequence:

import torch

data = torch.tensor([4, 1, 3, 5, 2, 6], dtype=torch.float32, requires_grad=True)
lengths = torch.tensor([2, 1, 3])

# Split the flat tensor into one chunk per sequence; split() accepts a list of sizes
x = data.split(lengths.tolist())
out = torch.nn.utils.rnn.pad_sequence(x, batch_first=True)
print(out)
# tensor([[4., 1., 0.],
#         [3., 0., 0.],
#         [5., 2., 6.]], grad_fn=<CopySlices>)

out.mean().backward()
print(data.grad)
# tensor([0.1111, 0.1111, 0.1111, 0.1111, 0.1111, 0.1111])
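
If you would rather avoid creating a Python tuple of chunks, a boolean mask is another option. This is only a sketch, not part of the answer above: `pad_packed` and `pad_value` are names made up for illustration, and it assumes `data` is 1-D, `lengths.sum()` equals `data.numel()`, and both tensors live on the same device.

```python
import torch

def pad_packed(data: torch.Tensor, lengths: torch.Tensor, pad_value: float = 0.0) -> torch.Tensor:
    """Hypothetical helper: scatter flat 1-D `data` into a (batch_size, max_length) matrix."""
    max_len = int(lengths.max())
    # mask[i, j] is True when position j lies inside sequence i
    positions = torch.arange(max_len, device=lengths.device)
    mask = positions.unsqueeze(0) < lengths.unsqueeze(1)
    # Start from a matrix filled with the padding value (does not require grad yet)
    out = data.new_full((lengths.numel(), max_len), pad_value)
    # Masked assignment fills the True slots in row-major order,
    # which matches the order of the packed data
    out[mask] = data
    return out

data = torch.tensor([4, 1, 3, 5, 2, 6], dtype=torch.float32, requires_grad=True)
lengths = torch.tensor([2, 1, 3])

out = pad_packed(data, lengths)
print(out)

# Gradients reach `data` through the masked assignment
out.sum().backward()
print(data.grad)
```

With `out.sum()` each real element gets a gradient of 1, while the padded slots contribute nothing to `data.grad`.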