Scalar product of tensors with different dimensions


I have a two-stream conv network, one stream of which is a vision subnet. The output of this subnet is a tensor of dimension [b, 14, 14, 128], where b is the batch size. The output of the second subnet is a tensor of dimension [b, 128]. I need a scalar product that computes the similarity between each of the 14x14 128-dim vectors from the visual subnet and the corresponding 128-dim vector from the second subnet, so that the result is a [b, 14, 14, 1] similarity map. I tried torch.bmm(), but I am getting an error, and I am not sure how I should transform the dimensions to get the 14x14x1 similarity map. I would appreciate any guidance in this regard. Here is a portion of my code:

def forward(self, x_v, x_a):
  v_out = self.vfeatures(x_v)  # [b, 14, 14, 128] feature map
  a_out = self.afeatures(x_a)  # [b, 128]

  # The similarity map between the embeddings needs to be [b, 14, 14, 1].
  # How can I define pairwise_sim to compute the scalar product?
  sim_map = pairwise_sim(v_out, a_out)

Thank you.

Do you mean something like this?

import torch
x = torch.randn(2, 14, 14, 128)
y = torch.randn(2, 128)
torch.einsum('abcd, ad -> abc', x, y).unsqueeze(-1).shape

torch.Size([2, 14, 14, 1])
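
Wrapped up as the pairwise_sim function from the question, the same idea could look roughly like the sketch below. The index letters (b, h, w, d) are arbitrary labels, and keeping the trailing singleton dimension is an assumption based on the [b,14,14,1] shape that was asked for.

```python
import torch

def pairwise_sim(v_out, a_out):
    # v_out: [b, h, w, d] visual feature map
    # a_out: [b, d] embedding from the second subnet
    # Multiply along the shared feature axis d and sum it out,
    # once per batch element and spatial location.
    sim = torch.einsum('bhwd,bd->bhw', v_out, a_out)
    # Add the trailing singleton axis to get [b, h, w, 1].
    return sim.unsqueeze(-1)

# Stand-in tensors with the shapes from the question.
v_out = torch.randn(2, 14, 14, 128)
a_out = torch.randn(2, 128)
sim_map = pairwise_sim(v_out, a_out)
print(sim_map.shape)
```

Because the einsum string names the dimensions instead of hard-coding 14 and 128, the function should also work if the feature map size changes.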

It does produce the appropriate shape. I am not very familiar with einsum though, so I need to verify with some examples if the result matches the scalar product. Thank you very much for your help.
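
One way to run that check is to compare the einsum result against a few more explicit formulations. This is only a sketch on random data; the tolerance of 1e-4 is an assumption chosen to absorb float32 rounding differences between the methods.

```python
import torch

torch.manual_seed(0)
x = torch.randn(2, 14, 14, 128)
y = torch.randn(2, 128)

via_einsum = torch.einsum('abcd,ad->abc', x, y).unsqueeze(-1)

# Reference 1: broadcast y over the 14x14 grid, multiply elementwise,
# then sum over the feature axis.
via_broadcast = (x * y[:, None, None, :]).sum(dim=-1, keepdim=True)

# Reference 2: torch.bmm expects 3-D inputs, so flatten the grid to
# [b, 196, 128], multiply by y as [b, 128, 1], and restore the grid.
b, h, w, d = x.shape
via_bmm = torch.bmm(x.reshape(b, h * w, d), y.unsqueeze(-1)).reshape(b, h, w, 1)

# Reference 3: a plain dot product at a single batch index and location.
single = torch.dot(x[1, 3, 7], y[1])

print(torch.allclose(via_einsum, via_broadcast, atol=1e-4))
print(torch.allclose(via_einsum, via_bmm, atol=1e-4))
print(torch.allclose(via_einsum[1, 3, 7, 0], single, atol=1e-4))
```

If all three comparisons agree, the einsum expression is computing the per-location scalar product. The bmm variant also shows the reshaping that torch.bmm needs here, since it only accepts 3-D tensors.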