I have a batch of n 1D vectors (int32), and I wish to calculate the Jaccard similarity of the vector at index 0 with each of the other (n-1) vectors in one operation. Any hints on how to approach the problem? Thanks in advance.
Hi Saswat!
You can use broadcasting (or expand()) to treat the first row of your batch as a whole batch of copies of the first row. Then just perform your desired row-by-row computation. (This will give you n results, and you can ignore – or not compute – the first one, if so desired.) Here’s an example where we calculate the dot product of each row in the batch with the first row:
>>> import torch
>>> torch.__version__
'1.11.0'
>>> _ = torch.manual_seed (2022)
>>> t = torch.nn.init.orthogonal_ (torch.empty (3, 5))
>>> (t * t[0]).sum (dim = 1)
tensor([ 1.0000e+00, -1.4901e-08, 0.0000e+00])
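The same broadcasting pattern could be applied to the Jaccard similarity itself. The following is only a sketch, and it assumes that each int32 vector is a 0/1 indicator vector, so that the similarity is the size of the intersection divided by the size of the union. The function name jaccard_vs_first and the convention of returning 0 when the union is empty are illustrative choices, not something from the original question:

```python
import torch

def jaccard_vs_first(batch: torch.Tensor) -> torch.Tensor:
    """Jaccard similarity of every row of `batch` with row 0.

    Assumption: each row is a 0/1 indicator vector (nonzero means "present").
    """
    b = batch.bool()
    first = b[0]  # shape (d,), broadcasts against b of shape (n, d)
    intersection = (b & first).sum(dim=1).float()
    union = (b | first).sum(dim=1).float()
    # Assumed convention: similarity is 0 when both vectors are all zeros.
    return torch.where(
        union > 0,
        intersection / union.clamp(min=1),
        torch.zeros_like(union),
    )

batch = torch.tensor(
    [[1, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 1],
     [1, 1, 0, 0]],
    dtype=torch.int32,
)
result = jaccard_vs_first(batch)
print(result)
```

As with the dot-product example, this gives n results, and the first entry (row 0 compared with itself) can be dropped with result[1:] if it is not wanted. If the vectors instead hold sets of integer IDs rather than 0/1 indicators, a different computation would be needed.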
Best.
K. Frank