Batch Jaccard similarity calculation

I have a batch of n 1D vectors (int32) and I want to calculate the Jaccard similarity between the vector at index 0 and each of the other (n-1) vectors in one operation. Any hints on how to approach the problem? Thanks in advance.

Hi Saswat!

You can use broadcasting (or expand()) to treat the first row of your
batch as a whole batch of copies of the first row. Then just perform your
desired row-by-row computation. (This will give you n results, and you can
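ignore – or not compute – the first one, if so desired.) Here’s an example
where we calculate the dot product of each row in the batch with the first row
(the rows are initialized to be orthonormal, so the result should be 1 for the
first row and approximately 0 for the others):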

>>> import torch
>>> torch.__version__
>>> _ = torch.manual_seed (2022)
>>> t = torch.nn.init.orthogonal_ (torch.empty (3, 5))
>>> (t * t[0]).sum (dim = 1)
tensor([ 1.0000e+00, -1.4901e-08,  0.0000e+00])
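
The same broadcasting pattern carries over to Jaccard similarity. Here is a minimal sketch, assuming each row is a 0/1 indicator vector (the question does not say how the int32 vectors encode sets, so that is an assumption). The helper name jaccard_vs_first is made up for illustration, and returning 0 when both rows are all zeros is just one possible convention:

```python
import torch

def jaccard_vs_first(batch: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper; assumes each row is a 0/1 indicator vector.
    b = batch.bool()
    first = b[0]                      # shape (d,), broadcasts against (n, d)
    inter = (b & first).sum(dim=1)    # per-row intersection size
    union = (b | first).sum(dim=1)    # per-row union size
    # Convention chosen here: an empty union yields 0 rather than NaN.
    return inter.float() / union.clamp(min=1).float()

batch = torch.tensor([[1, 1, 0, 0],
                      [1, 0, 0, 0],
                      [0, 0, 1, 1],
                      [1, 1, 0, 0]], dtype=torch.int32)
sims = jaccard_vs_first(batch)   # entry 0 is row 0 compared with itself
others = sims[1:]                # similarities with the other n-1 rows
```

If your vectors instead hold lists of element ids, you would need to convert them to indicator vectors first (or use a different approach) before this applies.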


K. Frank