Most efficient, differentiable way to get pairwise distances

Hi all!
I am looking for the most efficient, differentiable way to compute, for a 3D point cloud matrix of shape (1024, 3), the vector containing all pairwise distances (shape: (1024*1024, 1)).
Currently, I use this:

import torch
import torch.nn as nn

x = torch.ones(1024, 3, requires_grad=True)
pdist = nn.PairwiseDistance(p=2, keepdim=True)
out = torch.cat([pdist(x[n], x[i]) for n in range(len(x)) for i in range(len(x))])

which is not differentiable:

torch.autograd.grad(out.sum(), x)[0]

RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_1030593/3913672061.py in <module>
----> 1 torch.autograd.grad(out.sum(), x)[0]

~/miniconda3/envs/test_torch/lib/python3.7/site-packages/torch/autograd/__init__.py in grad(outputs, inputs, grad_outputs, retain_graph, create_graph, only_inputs, allow_unused, is_grads_batched)
    300         return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
    301             t_outputs, grad_outputs_, retain_graph, create_graph, t_inputs,
--> 302             allow_unused, accumulate_grad=False)  # Calls into the C++ engine to run the backward pass
    303 
    304 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Thank you so much!

Your code works for me:

x = torch.ones(10, 3, requires_grad=True)
pdist = nn.PairwiseDistance(p=2, keepdim=True)
out = torch.cat([pdist(x[n], x[i]) for n in range(len(x)) for i in range(len(x))])
out.mean().backward()
print(x.grad)
# tensor([[ 9.3132e-10,  9.3132e-10,  9.3132e-10],
#         [ 9.3132e-10,  9.3132e-10,  9.3132e-10],
#         [ 9.3132e-10,  9.3132e-10,  9.3132e-10],
#         [ 9.3132e-10,  9.3132e-10,  9.3132e-10],
#         [-9.3132e-10, -9.3132e-10, -9.3132e-10],
#         [-9.3132e-10, -9.3132e-10, -9.3132e-10],
#         [-9.3132e-10, -9.3132e-10, -9.3132e-10],
#         [-9.3132e-10, -9.3132e-10, -9.3132e-10],
#         [-9.3132e-10, -9.3132e-10, -9.3132e-10],
#         [-9.3132e-10, -9.3132e-10, -9.3132e-10]])

I think I forgot the requires_grad=True line in my original code … ^^
Thank you Patrick!
As a side question, do you know if there is a more efficient (built-in) way to arrive at this pairwise distance for x?

Yes, you could add dimensions and allow broadcasting to avoid the for loop:

x = torch.randn(10, 3, requires_grad=True)
pdist = nn.PairwiseDistance(p=2, keepdim=True)
ref = torch.cat([pdist(x[n], x[i]) for n in range(len(x)) for i in range(len(x))])

out = pdist(x.unsqueeze(1), x.unsqueeze(0))
out = out.view(-1)
print((out - ref).abs().max())
# tensor(0., grad_fn=<MaxBackward1>)
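
As a side note, since you asked about a built-in: torch.cdist should also give the full (N, N) Euclidean distance matrix in a single, differentiable call. A minimal sketch, assuming the p=2 case and reusing x and ref from above:

# Sketch of a built-in alternative: torch.cdist returns the (N, N) matrix of
# p-norm distances between all row pairs in one call.
alt = torch.cdist(x, x, p=2)        # shape (10, 10)
alt = alt.reshape(-1)               # flatten to (100,) to match ref's ordering
# Note: nn.PairwiseDistance adds a small eps (1e-6 by default) inside the norm,
# so the two results should only agree up to roughly that eps.
print((alt - ref).abs().max())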

Oh wow, that is amazing! Does this technique have a specific name, or can I find more information on it in the PyTorch documentation? The computation time is now seconds instead of minutes 🙂

I would refer to it as broadcasting and the ability to execute operators on batched inputs. This doc explains it a bit more and also links to the numpy docs, which have more examples.
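
For illustration only (a minimal sketch of my own, not from the linked docs), the shape mechanics behind the one-liner look like this:

import torch

x = torch.randn(10, 3)
# x.unsqueeze(1) has shape (10, 1, 3) and x.unsqueeze(0) has shape (1, 10, 3).
# Broadcasting expands both to (10, 10, 3), so the difference of every pair
# of points is computed in one vectorized operation.
diff = x.unsqueeze(1) - x.unsqueeze(0)   # shape (10, 10, 3)
dists = diff.norm(p=2, dim=-1)           # shape (10, 10): Euclidean pairwise distances
print(diff.shape, dists.shape)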
