Hahaha, yeah, I can see I didn’t form a very solid memory/understanding when I wrote that last year (or it’s been too long since?). Perhaps I can finally sort this out in my head, though in my defence there does seem to be a lot of discussion surrounding this topic (which doesn’t help in digesting it):
- How to convert array to tensor? - #25 by mehran2020
- How to turn a list of tensor to tensor? - #9 by Nithin_Vasisth
- How to turn a list of tensor to tensor? - #10 by Brando_Miranda
- Converting list to tensor - #8 by Brando_Miranda
- python - What's the difference between torch.stack() and torch.cat() functions? - Stack Overflow
- How to make really empty tensor? - #3 by 1414b35e42c77e0a57dd
It seems the most PyTorch-idiomatic solution comes from knowing either torch.cat or torch.stack. In my use case I generate tensors, conceptually need to nest them in lists, and eventually convert that to a final tensor (e.g. of size [d1, d2, d3]). I think the easiest solution to my problem is to append things to a list, give it to torch.stack to form a new tensor, append that to a new list, and then convert that to a tensor by applying torch.stack again, recursively.
For a non-recursive example I think this works… will update with a better example in a bit:
# %%
import torch
# stack vs cat
# cat "extends" a list in the given dimension e.g. adds more rows or columns
x = torch.randn(2, 3)
print(f'{x.size()}')
# add more rows (the number of rows grows from 2 -> 6; columns stay at 3)
xnew_from_cat = torch.cat((x, x, x), 0)
print(f'{xnew_from_cat.size()}')
# add more columns (the number of columns grows from 3 -> 9; rows stay at 2)
xnew_from_cat = torch.cat((x, x, x), 1)
print(f'{xnew_from_cat.size()}')
print()
# stack serves a similar role to append for lists, i.e. it doesn't change the original
# tensors but instead adds a new dimension to the result, so you retain the ability
# to get each original tensor back by indexing along the new dimension
xnew_from_stack = torch.stack((x, x, x, x), 0)
print(f'{xnew_from_stack.size()}')
xnew_from_stack = torch.stack((x, x, x, x), 1)
print(f'{xnew_from_stack.size()}')
xnew_from_stack = torch.stack((x, x, x, x), 2)
print(f'{xnew_from_stack.size()}')
# the default is dim=0, i.e. it stacks at the front
xnew_from_stack = torch.stack((x, x, x, x))
print(f'{xnew_from_stack.size()}')
print('I like to think of xnew_from_stack as a "tensor list" that you can pop from the front')
print()
lst = []
print(f'{x.size()}')
for i in range(10):
    # say we do something with x at iteration i; note this must be out of place,
    # since with in-place x += i every list entry would alias the same tensor
    x = x + i
    lst.append(x)
lstt = torch.stack(lst)
print(lstt.size())
print()
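For what it’s worth, here is a small self-check of the cat vs stack shapes above (just my own assertions, nothing beyond what the prints already show):

```python
import torch

x = torch.randn(2, 3)

# cat grows an existing dimension; no new dimension is created
assert torch.cat((x, x, x), 0).size() == (2 * 3, 3)  # rows: 2 -> 6
assert torch.cat((x, x, x), 1).size() == (2, 3 * 3)  # cols: 3 -> 9

# stack adds a brand new dimension of size len(tensors)
stacked = torch.stack((x, x, x, x))  # default dim=0, i.e. the front
assert stacked.size() == (4, 2, 3)
# each original tensor is recoverable by indexing the new dimension
assert torch.equal(stacked[0], x)
```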
Update: with nested lists of the same size
def tensorify(lst):
    """
    Turn a nested list of tensors (with no varying lengths within a dimension)
    into a single tensor.
    Nested list of lengths [D1, D2, ..., DN] -> tensor of size [D1, D2, ..., DN, ...]
    :return: tensor with the stacked contents
    """
    # if we were already given a tensor, there is nothing to do
    if type(lst) == torch.Tensor:
        return lst
    # base case: the current list is not nested anymore, make it into a tensor
    if type(lst[0]) != list:
        if type(lst[0]) == torch.Tensor:
            return torch.stack(lst, dim=0)
        else:  # the elements of lst are floats, ints or something like that
            return torch.tensor(lst)
    # recursive case: tensorify each sublist, then stack them along a new dim 0
    current_dimension_i = len(lst)
    for d_i in range(current_dimension_i):
        tensor = tensorify(lst[d_i])
        lst[d_i] = tensor
    # at the end of the loop every lst[d_i] is a tensor of size [D_{i+1}, ..., DN, ...]
    tensor_lst = torch.stack(lst, dim=0)
    return tensor_lst
Here are a few unit tests (I didn’t write more, but it worked with my real code so I trust it’s fine. Feel free to help me by adding more tests if you want):
def test_tensorify():
    t = [1, 2, 3]
    print(tensorify(t).size())
    tt = [t, t, t]
    print(tensorify(tt))
    ttt = [tt, tt, tt]
    print(tensorify(ttt))

if __name__ == '__main__':
    test_tensorify()
    print('Done\a')
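As a sanity baseline (my own addition, not from the thread): for non-ragged nested lists of plain numbers torch.tensor already does the whole job in one call, and for a nested list of tensors two explicit stack calls reproduce what the recursion does:

```python
import torch

# nested list of plain numbers: torch.tensor handles the whole nesting at once
ttt = [[[1, 2, 3] for _ in range(3)] for _ in range(3)]
assert torch.tensor(ttt).size() == (3, 3, 3)

# nested list of tensors: stack the inner lists first, then the outer list,
# which is exactly what the recursive tensorify does level by level
inner = [torch.randn(3) for _ in range(2)]  # list of [3]-tensors
nested = [inner, inner, inner, inner]       # nested list with lengths [4, 2]
manual = torch.stack([torch.stack(row, dim=0) for row in nested], dim=0)
assert manual.size() == (4, 2, 3)
```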
When nested lists are variable length
related for variable length:
- python - converting list of tensors to tensors pytorch - Stack Overflow
- Nested list of variable length to a tensor
I didn’t read those very carefully, but I assume they must be padding somehow (my guess is you need to find the largest length per dimension to pad to, and then do some sort of recursion like I did above).
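A minimal sketch of that padding idea for the simplest case, variable-length 1D tensors (pad_sequence is a real torch utility; using it here is my assumption about how those threads approach it, not something from them):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# variable-length 1D tensors
seqs = [torch.tensor([1, 2, 3]), torch.tensor([4, 5]), torch.tensor([6])]

# pad_sequence pads every sequence (with zeros here) up to the longest length
padded = pad_sequence(seqs, batch_first=True, padding_value=0)
assert padded.size() == (3, 3)
# padding goes at the end of each short sequence
assert torch.equal(padded, torch.tensor([[1, 2, 3], [4, 5, 0], [6, 0, 0]]))
```

For deeper nesting one would presumably combine this with a tensorify-style recursion, padding at each level.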