How to get a Manhattan distance matrix between two sets of vectors?

Hi, I want to know if there is a built-in function in PyTorch to calculate the Manhattan distance between vectors. For reference, the Manhattan distance between vector a and vector b is calculated as:

distance = sum(abs(a-b))

Now I have a large set of vectors A of shape (5000, 100) and a large set B of shape (150000, 100), and I want the distance matrix of shape (5000, 150000), where each element [i][j] is the Manhattan distance between A[i] and B[j].

I’ve tried the naive method below, but it’s unbearably slow.

for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        matrix[i][j] = torch.sum(torch.abs(A[i] - B[j]))

I want to know if there is a much quicker way to calculate this, such as an optimized built-in function. Thank you!

Hi, I think I figured out the solution. I can use the torch.cdist(x1, x2, p) function, which computes the batched p-norm distance between x1 and x2. All I need to do is pass my two sets of vectors as x1 and x2 with p=1:

output = torch.cdist(x1, x2, p=1)

Here the 1-norm distance is the Manhattan distance.
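A quick sanity check on small tensors (the shapes here are arbitrary stand-ins for the real ones) confirming that cdist with p=1 matches the sum-of-absolute-differences formula from the question:

```python
import torch

A = torch.randn(5, 3)   # stand-in for the (5000, 100) set
B = torch.randn(7, 3)   # stand-in for the (150000, 100) set

# vectorized Manhattan distance matrix, shape (5, 7)
dist = torch.cdist(A, B, p=1)

# naive reference: ref[i][j] == sum(abs(A[i] - B[j]))
ref = torch.zeros(5, 7)
for i in range(5):
    for j in range(7):
        ref[i, j] = torch.sum(torch.abs(A[i] - B[j]))

print(torch.allclose(dist, ref))  # True
```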

Hi @Audrey,

You might also consider looking at the built-in torch operation L1Loss, which can be set up to average or not; the average is taken over all elements. Not sure if this will serve your purpose.
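A minimal sketch of what I mean (the reduction argument controls the averaging; the tensor shapes here are just for illustration):

```python
import torch
import torch.nn as nn

a = torch.randn(4, 100)
b = torch.randn(4, 100)

# reduction='mean' (the default) averages |a - b| over all elements
mean_loss = nn.L1Loss(reduction='mean')(a, b)

# reduction='sum' gives the total Manhattan distance between the flattened tensors
sum_loss = nn.L1Loss(reduction='sum')(a, b)

# the two differ only by the element count
print(torch.allclose(sum_loss, mean_loss * a.numel()))  # True
```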

Sincerely, mb1996

Thanks @mb1996! Just woke up and saw your reply.

I tried your idea with L1Loss below, but its input and target are required to have the same shape, which unfortunately doesn’t work in my situation.

import torch
import torch.nn as nn

loss = nn.L1Loss()
input = torch.randn(2, 2)
target = torch.randn(3, 2)  # shape mismatch with input
output = loss(input, target)  # raises RuntimeError

The error I got:

RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0

Anyway, thanks for your reply.

Hi @Audrey,

No problem. That is true; I didn’t think of that initially, but you are right, that dimension has to match. I hope the solution you found turns out to be fast enough.

Sincerely, mb1996