Batched tensordot?

I would like to do a “batched” tensordot. For instance, say I have ten 3d vectors packed into a tensor x with x.size() = (10, 3),
and ten 2x3 matrices packed into a tensor A with A.size() = (10, 2, 3).
I would like to multiply the ith matrix with the ith vector and get the results packed into a tensor y with y.size() = (10, 2).
How can I do this?

One way to do this is using torch.einsum:

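import torch

x = torch.randn(10, 3)     # ten 3d vectors
A = torch.randn(10, 2, 3)  # ten 2x3 matrices

# a indexes the batch and c is summed over, so res[a] = A[a] @ x[a]
res = torch.einsum('abc,ac->ab', A, x)  # res.size() = (10, 2)

# check every batch entry against a plain matrix-vector product
for i in range(10):
    assert torch.allclose(torch.matmul(A[i], x[i]), res[i], atol=1e-6)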
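
If you would rather not use einsum, the same product can probably be written as a batched matrix multiply. This is only a sketch using the shapes from the question. The names y_matmul, y_bmm and y_einsum are made up for illustration. The idea is to turn each vector into a 3x1 column with unsqueeze, multiply, and squeeze the extra axis away again:

```python
import torch

torch.manual_seed(0)       # only so the example is reproducible
x = torch.randn(10, 3)     # ten 3d vectors
A = torch.randn(10, 2, 3)  # ten 2x3 matrices

# (10, 2, 3) @ (10, 3, 1) -> (10, 2, 1), then drop the trailing axis -> (10, 2)
y_matmul = torch.matmul(A, x.unsqueeze(-1)).squeeze(-1)

# torch.bmm does the same thing but insists on exactly 3-d inputs
y_bmm = torch.bmm(A, x.unsqueeze(-1)).squeeze(-1)

# einsum version, kept here for comparison
y_einsum = torch.einsum('abc,ac->ab', A, x)

print(y_matmul.size())
```

All three should agree up to floating point rounding. torch.matmul broadcasts over leading batch dimensions, so it also covers inputs with more than one batch axis, while torch.bmm is limited to a single batch axis.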