Tricky matrix multiplication

I am computing attention weights and I want to vectorize the computation.
I have the following tensors:
the input tensor is [batch_size, channels, x, y] and the weight tensor is [channels, channels] (to match the dimensions).
For each x and y, and for each image, I need to multiply a vector of shape [channels] by the weight tensor.
In my case the shapes are [64, 256, 25, 2] and [256, 256], and I would like to get an output of shape [64, 256, 50].
Is it possible to do this using basic operations?

Hi @Konstantin_Suloev_Jr,

The following works, but check whether the reshape makes sense for your particular use case.


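import torch
x, w = torch.randn(64, 256, 25, 2), torch.randn(256, 256)  # stand-ins for your tensors
out = torch.einsum("bcx,cd->bdx", x.reshape(64, 256, 50), w)
print(out.shape)  # prints torch.Size([64, 256, 50])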
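
To check whether the reshape does what you want, here is a minimal sketch that compares the einsum against an explicit loop over every image and position. It assumes that "multiply a vector by the weight tensor" means vector @ w (swap in w @ vector if you meant the other order), and it uses small made-up sizes (B, C, H, W) so the loop stays fast. It also shows a plain broadcast matmul that should give the same result.

```python
import torch

torch.manual_seed(0)

# Small hypothetical sizes so the explicit loop stays fast;
# the shapes in the question are [64, 256, 25, 2] and [256, 256].
B, C, H, W = 2, 4, 3, 2
x = torch.randn(B, C, H, W)
w = torch.randn(C, C)

# Reference: for every image and every (i, j) position, multiply the
# channel vector by the weight matrix (assumed order: vector @ w).
ref = torch.empty(B, C, H * W)
for b in range(B):
    for i in range(H):
        for j in range(W):
            # reshape/flatten is row-major, so (i, j) lands at index i * W + j
            ref[b, :, i * W + j] = x[b, :, i, j] @ w

# Vectorized einsum, same pattern as in the answer.
out_einsum = torch.einsum("bcx,cd->bdx", x.reshape(B, C, H * W), w)

# Same contraction as a broadcast matmul: [C, C] @ [B, C, H*W] -> [B, C, H*W].
out_matmul = w.T @ x.flatten(2)

print(torch.allclose(ref, out_einsum, atol=1e-5))
print(torch.allclose(ref, out_matmul, atol=1e-5))
```

The reshape merges the last two dimensions in row-major order, so position (i, j) ends up at flat index i * 2 + j for your [25, 2] grid. If that ordering is fine for you, the reshape is safe.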

Thank you for your answer! I think I found another solution using the nn module:

Ws = nn.Linear(256, 256)
result = Ws(input.flatten(2).permute([0, 2, 1]))  # shape [64, 50, 256]
It seems to do the same thing as what you suggested.

If Ws is created with bias=False, then yes. If not, then they’re different operations.

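They’ll be equivalent with bias=False, up to layout: the nn.Linear version returns [64, 50, 256], so it needs a final permute(0, 2, 1) to match [64, 256, 50], and nn.Linear stores its weight as [out_features, in_features], which is the transpose of w in the einsum. I just merged the operations into a single einsum string.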
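
A quick numerical check of that equivalence, as a sketch rather than anything definitive. The random x below is just a stand-in for your input tensor, and the einsum is fed Ws.weight.T because nn.Linear computes input @ weight.T.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(64, 256, 25, 2)       # stand-in for the input tensor
Ws = nn.Linear(256, 256, bias=False)  # no bias, so it is a pure matrix product

# nn.Linear acts on the last dimension, so the channels are moved there first.
lin = Ws(x.flatten(2).permute(0, 2, 1))  # [64, 50, 256]
lin = lin.permute(0, 2, 1)               # back to [64, 256, 50]

# The einsum contracts over w's first index, hence the transposed weight.
ein = torch.einsum("bcx,cd->bdx", x.reshape(64, 256, 50), Ws.weight.T)

print(lin.shape)
print(torch.allclose(lin, ein, atol=1e-4))
```

With the default bias=True, nn.Linear adds a learned per-channel offset on top of this, which the einsum does not include.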