How to optimally implement a pairwise custom kernel operation?


Given differentiable 1D vectors A=[Nx1] and B=[Mx1], I am looking to compute a pairwise kernel operation:

def kernel(a, b):

Is there a way to avoid looping over individual items? I need to apply the kernel to all pairs of entries in {A, A}, {A, B} and {B, B}, which might be computationally heavy if done iteratively.

You could try to use broadcasting as seen here:

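import torch

a = torch.arange(4).float().view(4, 1)
b = torch.arange(4).float().view(4, 1)

# element-wise: [4, 1] - [4, 1] -> [4, 1]
print(a - b)

# pair-wise: [4, 1, 1] - [4, 1] -> [4, 4, 1], entry [i, j, 0] is a[i] - b[j]
print(a.unsqueeze(1) - b)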

This materializes the full pairwise result (N x M entries for inputs of length N and M), so it has a higher memory footprint, but it might be faster than your sequential approach.
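To apply the same idea to an actual kernel and to the three pairings {A, A}, {A, B} and {B, B}, something like the sketch below might work. Since the body of your `kernel` isn't shown, I'm using a Gaussian (RBF) kernel as a stand-in; `rbf_kernel`, `gamma` and the `pairwise` helper are names I made up for illustration. It assumes your kernel is written purely with element-wise tensor ops, so that it broadcasts.

```python
import torch

def rbf_kernel(a, b, gamma=0.5):
    # Hypothetical stand-in for the unspecified kernel. It only uses
    # element-wise ops, so it broadcasts over its inputs.
    return torch.exp(-gamma * (a - b) ** 2)

def pairwise(kernel, x, y):
    # x: [N, 1], y: [M, 1] -> result: [N, M]
    # y.t() has shape [1, M], so broadcasting pairs every x[i] with every y[j].
    return kernel(x, y.t())

A = torch.linspace(0.0, 1.0, 3).view(3, 1).requires_grad_()
B = torch.linspace(0.0, 2.0, 5).view(5, 1).requires_grad_()

K_AA = pairwise(rbf_kernel, A, A)  # shape [3, 3]
K_AB = pairwise(rbf_kernel, A, B)  # shape [3, 5]
K_BB = pairwise(rbf_kernel, B, B)  # shape [5, 5]

# Autograd tracks the broadcasted ops, so gradients reach A and B.
K_AB.sum().backward()
print(A.grad.shape, B.grad.shape)
```

If your kernel contains operations that don't broadcast (e.g. Python control flow on individual values), this won't work directly and you would need to rewrite it with tensor ops first.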