I have a tensor of size ([10, 5]), which represents 10 5-D vectors, and a tensor of size ([10, 5, 5]), which represents 10 5x5 matrices. For each vector and matrix pair, I would like to compute

V_i @ M_i @ V_i.T

so that the result is a ([10, 1]) tensor. I can do this with a for loop, but it is very slow. Is this calculation possible with einsum or some other torch function?

Hi @armal,

Here is some example code that solves your problem.

```
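import torch

_ = torch.manual_seed(0)
V = torch.randn(10, 5)     # batch of 10 vectors, shape [B, 5]
M = torch.randn(10, 5, 5)  # batch of 10 matrices, shape [B, 5, 5]

def einsum_op(V, M):
    # out[b] = sum over i, j of V[b, i] * M[b, i, j] * V[b, j]
    out = torch.einsum("bi,bij,bj->b", V, M, V)
    return out

def unsqueeze_op(V, M):
    # [B, 1, 5] @ [B, 5, 5] @ [B, 5, 1] -> [B, 1, 1]
    out = V.unsqueeze(-2) @ M @ V.unsqueeze(-1)
    # Squeeze only the two trailing dims so a batch of size 1 keeps shape [1]
    return out.squeeze(-1).squeeze(-1)  # reshape to size [B,]

out = einsum_op(V, M)
print(out)
# returns
# tensor([ 4.7036, 6.2151, 1.1999, -2.0412, -6.3517, -5.3721, -2.5428,
# -0.4131, -24.2325, -2.3242])

out = unsqueeze_op(V, M)
print(out)
# returns
# tensor([ 4.7036, 6.2151, 1.1999, -2.0412, -6.3517, -5.3721, -2.5428,
# -0.4131, -24.2325, -2.3242])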
```
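If you want to convince yourself that the vectorized versions match the original for loop, a quick check along these lines should work. This is only a sketch: the names `loop_out`, `einsum_out`, and `matmul_out` are made up for illustration, and the last line shows one way to get the ([10, 1]) shape asked for in the question.

```python
import torch

_ = torch.manual_seed(0)
V = torch.randn(10, 5)
M = torch.randn(10, 5, 5)

# Reference: the slow per-sample loop, one scalar per (vector, matrix) pair.
loop_out = torch.stack([V[i] @ M[i] @ V[i] for i in range(V.shape[0])])

# Vectorized versions of the same quadratic form.
einsum_out = torch.einsum("bi,bij,bj->b", V, M, V)
matmul_out = (V.unsqueeze(-2) @ M @ V.unsqueeze(-1)).squeeze(-1).squeeze(-1)

# Both should agree with the loop up to floating-point error.
print(torch.allclose(loop_out, einsum_out, atol=1e-5))
print(torch.allclose(loop_out, matmul_out, atol=1e-5))

# Add a trailing dimension if a [10, 1] result is needed instead of [10].
out_col = einsum_out.unsqueeze(-1)
print(out_col.shape)
```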

If you want to run a function over batches of inputs, you might want to have a look at the functorch library, which is built specifically for this (in PyTorch 2.0 and later, its vmap transform is also available as torch.vmap). Documentation here.
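As a rough idea of what that looks like for this problem, here is a minimal sketch using the vmap transform. It assumes a PyTorch version where `torch.vmap` is available (2.0 or later), and `quad_form` is a hypothetical helper name, not something from the library.

```python
import torch

_ = torch.manual_seed(0)
V = torch.randn(10, 5)
M = torch.randn(10, 5, 5)

def quad_form(v, m):
    # Written for a single sample: v has shape [5], m has shape [5, 5].
    return v @ m @ v

# vmap maps quad_form over dim 0 of both inputs (assumes torch >= 2.0).
batched_quad_form = torch.vmap(quad_form)
out = batched_quad_form(V, M)
print(out.shape)
```

The appeal is that you write the per-sample function once and let vmap handle the batch dimension, instead of working out the einsum indices or unsqueeze calls by hand.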

Thank you so much for this! I will definitely take a look at functorch as well, looks like exactly what I need.