Nested list of variable length to a tensor

nikhilweee · August 20, 2019, 8:50am

I might be a bit late to the party, but after realizing that PyTorch won’t spoonfeed me anymore, I ended up writing my own function to pad a list of tensors.

The following function takes a nested list of integers and converts it into a padded tensor.

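import torch

def ints_to_tensor(ints):
    """
    Converts a nested list of integers to a padded tensor.
    """
    if isinstance(ints, torch.Tensor):
        return ints
    if isinstance(ints, list) and ints:
        if isinstance(ints[0], int):
            # Innermost level: a flat list of integers.
            return torch.LongTensor(ints)
        if isinstance(ints[0], torch.Tensor):
            return pad_tensors(ints)
        if isinstance(ints[0], list):
            # Convert each sublist first, then pad the resulting tensors.
            return ints_to_tensor([ints_to_tensor(inti) for inti in ints])
    # Fail loudly instead of silently returning None.
    raise TypeError('Expected a tensor or a non-empty nested list of integers.')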

This relies on another function pad_tensors described below:

def pad_tensors(tensors):
    """
    Takes a list of `N` M-dimensional tensors (M<4) and returns a padded tensor.

    The padded tensor is `M+1` dimensional with size `N, S1, S2, ..., SM`
    where `Si` is the maximum size of dimension `i` amongst all tensors.
    """
    rep = tensors[0]
    # Largest size along each dimension across all tensors.
    padded_dim = [max(tensor.size(dim) for tensor in tensors)
                  for dim in range(rep.dim())]
    padded_dim = [len(tensors)] + padded_dim
    # Zero-filled tensor with the same dtype and device as the first tensor.
    padded_tensor = rep.new_zeros(padded_dim)
    for i, tensor in enumerate(tensors):
        size = list(tensor.size())
        if len(size) == 1:
            padded_tensor[i, :size[0]] = tensor
        elif len(size) == 2:
            padded_tensor[i, :size[0], :size[1]] = tensor
        elif len(size) == 3:
            padded_tensor[i, :size[0], :size[1], :size[2]] = tensor
        else:
            raise ValueError('Padding is supported only for 1D, 2D and 3D tensors.')
    return padded_tensor

The pad_tensors function only supports tensors of up to 3 dimensions, but that can easily be extended. Using these functions should solve the issue.
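
One way that extension could look is sketched below. The names pad_tensors_nd and pad_value are made up for this example and are not part of the functions above. Instead of spelling out one case per number of dimensions, it builds the index as a tuple of slices, so it should work for any number of dimensions as long as all tensors have the same one.

```python
import torch


def pad_tensors_nd(tensors, pad_value=0):
    """Pad a list of same-rank tensors to a common shape and stack them.

    Hypothetical generalization of pad_tensors; pad_value is an assumed extra.
    """
    rep = tensors[0]
    if any(t.dim() != rep.dim() for t in tensors):
        raise ValueError('All tensors must have the same number of dimensions.')
    # Largest size along each dimension across all tensors.
    max_sizes = [max(t.size(d) for t in tensors) for d in range(rep.dim())]
    # new_full keeps the dtype and device of the first tensor.
    padded = rep.new_full([len(tensors)] + max_sizes, pad_value)
    for i, t in enumerate(tensors):
        # Equivalent to padded[i, :s1, :s2, ..., :sM] = t for any M.
        index = (i,) + tuple(slice(0, s) for s in t.shape)
        padded[index] = t
    return padded
```

Passing a pad_value that never occurs in the data (say -1) also makes the padding distinguishable from real zeros.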

As an example, here’s @rustytnt’s input:

In [4]: target = [[[1,2,3], [2,4,5,6]], [[1,2,3], [2,4,5,6], [2,4,6,7,8]]]

In [5]: ints_to_tensor(target)
Out[5]: 
tensor([[[1, 2, 3, 0, 0],
         [2, 4, 5, 6, 0],
         [0, 0, 0, 0, 0]],

        [[1, 2, 3, 0, 0],
         [2, 4, 5, 6, 0],
         [2, 4, 6, 7, 8]]])
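
The shape of this result, (2, 3, 5), is just the largest length found at each nesting depth, which is why the first group gets an extra row of zeros. A rough pure-Python sketch of that rule, assuming every branch is nested to the same depth; padded_shape is a hypothetical helper for illustration, not part of the functions above:

```python
def padded_shape(ints):
    """Return the shape a uniformly nested list of ints has after padding."""
    if not isinstance(ints, list):
        return ()  # a bare integer contributes no dimension
    child_shapes = [padded_shape(item) for item in ints]
    # Take the maximum size at each depth across all children.
    inner = [max(sizes) for sizes in zip(*child_shapes)]
    return (len(ints), *inner)


target = [[[1, 2, 3], [2, 4, 5, 6]], [[1, 2, 3], [2, 4, 5, 6], [2, 4, 6, 7, 8]]]
print(padded_shape(target))  # (2, 3, 5)
```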

And here’s @wangyanda’s input:

In [6]: target = [[[3,5,4], [8,5], [3]], [[6], [6,4,3,5], [7,5,3]], [[6,5],[2],[2]], [[2],[0],[0]]]

In [7]: ints_to_tensor(target)
Out[7]: 
tensor([[[3, 5, 4, 0],
         [8, 5, 0, 0],
         [3, 0, 0, 0]],

        [[6, 0, 0, 0],
         [6, 4, 3, 5],
         [7, 5, 3, 0]],

        [[6, 5, 0, 0],
         [2, 0, 0, 0],
         [2, 0, 0, 0]],

        [[2, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]])
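
For a single level of nesting (a flat list of variable-length rows), torch.nn.utils.rnn.pad_sequence already does the zero padding, so the custom functions are mainly needed for deeper nesting. A small sketch, using the first group of @rustytnt's input as the rows:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

rows = [[1, 2, 3], [2, 4, 5, 6]]
# pad_sequence expects a list of tensors, not lists of ints.
tensors = [torch.tensor(row) for row in rows]
# batch_first=True gives shape (num_rows, max_len) instead of (max_len, num_rows).
padded = pad_sequence(tensors, batch_first=True, padding_value=0)
print(padded)
```

It only pads along the first dimension of each tensor and needs the trailing dimensions to match already, so it does not handle the nested inputs above directly.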