# How to express tensor contraction efficiently? #einsum

I have 2 tensors of the following dimensions:

``````
A: n x i x o
B: n x b x i
``````

and I would like to compute the tensor `C` of dimension `n x b x o`. Here, `n` denotes the number of feature maps, `o` is the output dimension, `i` is the input dimension, and `b` is the batch size.

Think of `A, B, C` as stacks of matrices. The operation I’m looking for is essentially a map-wise matrix multiply: one matrix product per feature map.
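For concreteness, here is a small reference sketch of what I mean (the sizes are just placeholders):

``````
import torch

n, b, i, o = 3, 5, 7, 11             # placeholder sizes
A = torch.randn(n, i, o)             # n x i x o
B = torch.randn(n, b, i)             # n x b x i

# Reference semantics: one matrix multiply per feature map.
# C[k] = B[k] @ A[k], i.e. (b x i) times (i x o) -> (b x o).
C = torch.stack([B[k] @ A[k] for k in range(n)])   # -> n x b x o
``````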

What would be the most GPU-efficient way to express this computation?

Would

``````
C = torch.einsum('nio,nbi->nbo', [A, B])
``````

do the trick?

Would that be correct and reasonably efficient? If not, what’s a better alternative?

Note that I can change the order of the dimensions to make the computation more efficient if necessary.

This particular form is a batch matrix multiplication if you swap `A` and `B`, so you could use `torch.bmm` or `torch.matmul` directly.
Einsum will (currently) reduce to `bmm` internally, so performance should be similar, apart from a few extra permutations of the inputs.
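Roughly, the equivalence looks like this (a quick sketch with made-up sizes):

``````
import torch

n, b, i, o = 4, 8, 16, 32            # made-up sizes for illustration
A = torch.randn(n, i, o)             # n x i x o
B = torch.randn(n, b, i)             # n x b x i

C_einsum = torch.einsum('nio,nbi->nbo', [A, B])

# The same contraction as a batch matmul: for each of the n maps,
# multiply the (b x i) slice of B with the (i x o) slice of A.
C_bmm = torch.bmm(B, A)              # -> n x b x o
# torch.matmul(B, A) does the same thing for 3-d inputs.

print(torch.allclose(C_einsum, C_bmm, atol=1e-6))
``````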

Best regards

Thomas