What is happening under the hood when calling torch.stack?

Hey all,

What is happening under the hood when calling torch.stack?

Is it equivalent to preallocating a tensor of the correct size and then copying the data in one by one?

I ran a small experiment that suggests yes (though I am not looking at memory metrics):

import torch
N = 1000

data_list = [torch.zeros(3, 224, 224) for _ in range(N)]
%%timeit
data_stacked = torch.stack(data_list)
>>> 79.9 ms ± 3.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

data_list = [torch.zeros(3, 224, 224) for _ in range(N)]
%%timeit
data_stacked = torch.empty((N, *data_list[0].shape))
for i, data in enumerate(data_list):
    data_stacked[i] = data

>>> 90.5 ms ± 12.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
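
For reference, here is a minimal sketch of the equivalence I have in mind. It is not the actual implementation (that lives in C++), just a Python-level picture: torch.stack should give the same result as unsqueezing each tensor along a new dim and concatenating, and it also accepts an out= argument if you want to supply a preallocated destination yourself.

import torch

N = 1000
data_list = [torch.zeros(3, 224, 224) for _ in range(N)]

# 1) stack vs. unsqueeze + cat: same result (conceptually what stack does,
#    though the real implementation may fuse these steps)
stacked = torch.stack(data_list)
catted = torch.cat([t.unsqueeze(0) for t in data_list], dim=0)
assert torch.equal(stacked, catted)

# 2) stack into a preallocated buffer via the out= argument
out = torch.empty((N, *data_list[0].shape))
torch.stack(data_list, out=out)
assert torch.equal(stacked, out)

# 3) the manual preallocate-and-copy loop from the timing above
manual = torch.empty((N, *data_list[0].shape))
for i, t in enumerate(data_list):
    manual[i] = t
assert torch.equal(stacked, manual)

All three give identical results; my question is whether the first one internally does anything smarter than the third.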