What is the most efficient data structure for performing operations on a 2D list of tensors?

I have a 2D list of tensors of shape (H, W). Each item is a tensor of shape (S_ij, D), where S_ij varies with the row index i and the column index j in the 2D list and spans a large range. I would like to apply a neural network N_i to each row i and then distribute each column j to another device D_j. Since S_ij varies a lot, I chose not to pack the data into a large, dense tensor of shape (H, W, max(S_ij), D), as the padding would waste memory. For now I store the data in a 2D list of tensors and concatenate/split them whenever I apply row-wise or column-wise operations, which adds significant overhead. Is there a torch-native data structure that can handle such a case?
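To make the question concrete, here is a minimal sketch of my current setup; the names (`data`, `net`) and the `Linear` layer standing in for N_i are placeholders:

```python
import torch

H, W, D = 4, 3, 8

# 2D list: entry (i, j) is a tensor of shape (S_ij, D), with variable S_ij
data = [
    [torch.randn(torch.randint(1, 10, (1,)).item(), D) for _ in range(W)]
    for _ in range(H)
]

# Stand-in for the per-row network N_i
net = torch.nn.Linear(D, D)

# Row-wise operation: concatenate row i, apply the network, split back.
# This concat/split round-trip is the overhead I would like to avoid.
for i in range(H):
    sizes = [t.shape[0] for t in data[i]]          # remember each S_ij
    row = torch.cat(data[i], dim=0)                # (sum_j S_ij, D)
    row = net(row)
    data[i] = list(torch.split(row, sizes, dim=0)) # restore the ragged row
```

The column-wise device distribution requires the analogous gather over `data[0][j] ... data[H-1][j]`, so every pass over the structure pays for Python-level loops plus the concatenation copies.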