# Multiply and accumulate two tensors across batch size

I have two tensors with the following sizes:

x: torch.Size([10, 3, 128]) → [batch_size, no_of_IDs, no_events_per_ID]
attention_value: torch.Size([10, 3, 1]) → [batch_size, no_of_IDs, att_value]

How do I multiply and accumulate (MAC) these 3D tensors across batch_size, i.e. multiply `x` with `attention_value`?

When I tried `torch.matmul(x, attention_value)`, I got the error below:

RuntimeError: Expected batch2_sizes == bs && batch2_sizes == contraction_size to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)

The output size should be [10, 128].
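
For reproducibility, here is a minimal sketch (random tensors standing in for my data) that triggers the same error:

```python
import torch

# Random tensors with the shapes described above (stand-ins for the real data)
x = torch.randn(10, 3, 128)              # [batch_size, no_of_IDs, no_events_per_ID]
attention_value = torch.randn(10, 3, 1)  # [batch_size, no_of_IDs, att_value]

# Fails: matmul treats the last two dims as matrices, so it tries to
# contract 128 (from x) against 3 (from attention_value), which mismatch.
try:
    torch.matmul(x, attention_value)
except RuntimeError as err:
    print(err)
```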

Based on the tensor shapes you give, I assume that you want to multiply
each matrix in the batch `x` by the corresponding matrix in the batch
`attention_value` (matrix multiplication, not element-wise).

I assume that you are not somehow summing across elements of the
batches, so I’m not sure what you mean by “accumulate.”

You can use `bmm()` (“batch matrix multiply”) after using `transpose()`
to line the dimensions up correctly (and then use `squeeze()` to get rid
of the singleton dimension):

```python
>>> import torch
>>> torch.__version__
'1.10.2'
>>> x = torch.randn(10, 3, 5)   # using 5 instead of 128
>>> attention_value = torch.randn(10, 3, 1)
>>> torch.bmm(x.transpose(1, 2), attention_value).squeeze()
tensor([[ 2.3298,  0.5740,  2.4056,  1.2855, -0.7073],
        [-0.0287, -4.0949, -0.5672,  1.7701,  1.9519],
        [ 0.6117,  0.1664, -1.3646, -1.6738, -1.7130],
        [ 0.6985, -0.3951, -0.5439, -0.5091, -0.2386],
        [ 2.7237, -0.8359, -0.4583,  1.1985,  3.0083],
        [ 1.1887, -0.4801,  1.3998, -1.3990,  2.6126],
        [ 1.2119,  0.2461,  1.8558, -0.8090,  2.0118],
        [-0.2383,  1.4226, -0.2152,  0.9591,  1.5240],
        [-0.4384,  0.0184, -0.9334,  0.0796, -0.6989],
        [ 0.7386, -0.6474,  0.9869, -0.5547, -0.0543]])
```
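
As a side note, the same contraction can also be written with `einsum()`, which makes the summed-over dimension explicit and avoids the transpose and squeeze. A minimal sketch, assuming the same shapes as above:

```python
import torch

x = torch.randn(10, 3, 5)
attention_value = torch.randn(10, 3, 1)

# bmm route from above
out_bmm = torch.bmm(x.transpose(1, 2), attention_value).squeeze()

# einsum: for each batch b, sum over the no_of_IDs dimension n,
# weighting each row of x by its attention value
out_einsum = torch.einsum('bne,bn->be', x, attention_value.squeeze(-1))

print(out_bmm.shape)                        # torch.Size([10, 5])
print(torch.allclose(out_bmm, out_einsum))  # True
```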

Best.

K. Frank

Thank you, Frank. I was able to solve it with the steps below:

Performing permute on x [10, 3, 128]:

x = torch.permute(x, (0, 2, 1)) → x [10, 128, 3], with att_value [10, 3, 1] unchanged

And then:

context_vector = torch.matmul(x, att_value) → [10, 128, 1], followed by
torch.squeeze(context_vector, 2) → [10, 128]
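
Putting those steps together as runnable code (random tensors in place of my real data):

```python
import torch

x = torch.randn(10, 3, 128)        # [batch_size, no_of_IDs, no_events_per_ID]
att_value = torch.randn(10, 3, 1)  # [batch_size, no_of_IDs, att_value]

# Permute so no_of_IDs becomes the contraction dimension
x = torch.permute(x, (0, 2, 1))    # [10, 128, 3]

# Batched matrix multiply, then drop the trailing singleton dimension
context_vector = torch.matmul(x, att_value)        # [10, 128, 1]
context_vector = torch.squeeze(context_vector, 2)  # [10, 128]

print(context_vector.shape)  # torch.Size([10, 128])
```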

I will try your method as well.
Thank you for your time and details.