# Most efficient way to perform multiplication of list of tensors?

Hello everyone,

I am wondering what is the most efficient way to perform multiplication of tensors contained in a sublist within two lists.

Let us say that I have a list `A` containing sublists `a1,...ak` containing tensors of different sizes within each list, e.g. `a1 = [a11, a12, a13, a14]` where

```
a11.size() == torch.Size([128, 274])
a12.size() == torch.Size([1, 128])
a13.size() == torch.Size([256, 128])
a14.size() == torch.Size([1, 256])
```

but note that each list within `A` contains tensors that have the same sizes as those in `a1`.

Now consider a quite similar list, `B = [b1,..., bn]`, where each sublist `bi` contains the same number of tensors as those in the sublist of `A`. Moreover, the inner tensors have the same shapes as those in each sublist `aj`.

I am looking to “multiply” each sublist in `A` with each sublist in `B`, so that I get a list `C` containing `k` lists, each holding the products of a sublist `aj` with every sublist in `B`. Formally,

```
C = [ [a1 * b1, a1 * b2, ..., a1 * bn], [a2 * b1, a2 * b2, ..., a2 * bn], ..., ]
```

where `a1 * b1` denotes `[a11 * b11, a12 * b12, a13 * b13, a14 * b14]`.

However, I cannot multiply lists of tensors directly; for instance, trying `a1 * b1` raises the following error:

```
TypeError: can't multiply sequence by non-int of type 'list'
```
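The error itself is just Python list semantics: `*` on a list means sequence repetition, which only accepts an `int`. A minimal reproduction (toy shapes instead of the ones above):

```python
import torch

# `a1` and `b1` are plain Python lists, so `*` attempts sequence
# repetition, which only accepts an int — hence the TypeError.
a1 = [torch.randn(2, 3), torch.randn(1, 2)]
b1 = [torch.randn(2, 3), torch.randn(1, 2)]
try:
    a1 * b1
except TypeError as e:
    print(e)  # can't multiply sequence by non-int of type 'list'
```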

Thus my guess is that the format in which I am storing my tensors is probably not optimal.

Is the best way to perform such a computation the following (note the nested loops over `A` and `B`, since every `aj` must be paired with every `bi`)

```
c = [[[torch.mul(ajk, bik) for ajk, bik in zip(aj, bi)] for bi in B] for aj in A]
```

or is there a more efficient way to do so (e.g. storing the tensors in something other than a list)?

it is best to have the multiplicands in a single memory block (buffer), but this is often impossible to arrange, as copying is not efficient and the `out` argument breaks gradients. if you mean matrix multiplication, you can’t “pack” tensors with varying shapes either.

apart from that, you can only reduce invocation overhead a bit (e.g. with the JIT), but this will likely be insignificant unless you’re dealing with hundreds of small tensors.
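To illustrate the JIT suggestion, a hedged sketch using `torch.jit.script` to trim Python call overhead on the inner loop (`pairwise_mul` is a hypothetical name, not from the thread):

```python
import torch
from typing import List

# Scripting the loop removes some per-call Python overhead; the gain
# is usually small unless there are very many tiny tensors.
@torch.jit.script
def pairwise_mul(xs: List[torch.Tensor],
                 ys: List[torch.Tensor]) -> List[torch.Tensor]:
    out: List[torch.Tensor] = []
    for i in range(len(xs)):
        out.append(xs[i] * ys[i])
    return out

a1 = [torch.randn(128, 274), torch.randn(1, 128)]
b1 = [torch.randn(128, 274), torch.randn(1, 128)]
c1 = pairwise_mul(a1, b1)  # same values as the list-comprehension version
```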

I believe I have found the answer to my question. Since tensors can only be stacked when they have identical sizes, I can simply stack the tensors of identical size across the sublists of `A` and `B` to make 3D tensors.

For instance,

```
A = [torch.stack(x) for x in zip(*A)]
B = [torch.stack(x) for x in zip(*B)]
```

This outputs two lists of `4` 3D tensors whose first dimensions are `k` and `n` respectively. To stay consistent with the post, the first element of `A` now has the following size:

```
A[0].size() == torch.Size([k, 128, 274])
```
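A toy check of the stacking step, with small hypothetical shapes in place of the ones above:

```python
import torch

k = 3  # number of sublists in A
A = [[torch.randn(2, 4), torch.randn(1, 2)] for _ in range(k)]

# zip(*A) groups same-position tensors across sublists; stacking each
# group adds a leading dimension of size k.
A_stacked = [torch.stack(x) for x in zip(*A)]
print(A_stacked[0].size())  # torch.Size([3, 2, 4])
print(A_stacked[1].size())  # torch.Size([3, 1, 2])
```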

Then flattening each tensor over dimensions 1 and 2 and transposing allows a standard, faster matrix multiplication:

```
C = [torch.mm(torch.flatten(a, start_dim=1),
              torch.flatten(b, start_dim=1).transpose(0, 1))
     for a, b in zip(A, B)]
```

This allowed my code to run 6x faster.
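One caveat worth noting: `mm` over flattened tensors gives, for each pair `(i, j)`, the *sum* of the elementwise products (a dot product), not the elementwise product itself. A toy check of the batched form against a per-pair loop:

```python
import torch

torch.manual_seed(0)
k, n = 3, 5
A0 = torch.randn(k, 2, 4)  # stands in for one stacked entry of A
B0 = torch.randn(n, 2, 4)  # stands in for the matching entry of B

batched = torch.mm(torch.flatten(A0, start_dim=1),
                   torch.flatten(B0, start_dim=1).transpose(0, 1))  # [k, n]

# Each entry (i, j) is the sum of the elementwise product A0[i] * B0[j].
looped = torch.stack([torch.stack([(a * b).sum() for b in B0]) for a in A0])
print(torch.allclose(batched, looped))  # True
```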
