# Cosine Distance in PyTorch

Hi,

I want to define a function in PyTorch that calculates the cosine distance between two normalized vectors v1 and v2, defined as 1 - dot_product(v1, v2).

```python
import torch
import torch.nn.functional as F

def cosine_distance(f_x, f_y):
    # Normalize each row vector to unit length
    f_xx_normalized = F.normalize(f_x, p=2, dim=1)
    f_yy_normalized = F.normalize(f_y, p=2, dim=1)
    f_yy_normalized_transpose = f_yy_normalized.transpose(0, 1)
    cosine_loss = 1 - torch.sum(torch.mm(f_xx_normalized, f_yy_normalized_transpose))
    return cosine_loss
```

The expected loss would be 0 < cosine_loss < 1, but I got a loss of approximately -90.78. Is it working? I am confused.

Hi,

your inputs appear to be batches of vectors (let’s say of shape b x n).
The result of `torch.mm(f_xx_normalized, f_yy_normalized_transpose)` is a b x b matrix containing the cosine of every vector in f_x with every vector in f_y, while you would likely only be interested in the diagonal. Maybe it's easiest to express the diagonal as `torch.einsum('bn,bn->b', f_xx_normalized, f_yy_normalized)`, but you could also unsqueeze a singleton dimension and use `torch.matmul` for a batched matrix multiplication.
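To make the difference concrete, here is a small sketch (with an arbitrary batch size and feature dimension) showing that the `einsum` expression picks out exactly the diagonal of the full b x b matrix:

```python
import torch
import torch.nn.functional as F

# Hypothetical batch of 4 vectors of dimension 3
f_x = torch.randn(4, 3)
f_y = torch.randn(4, 3)

f_xx_normalized = F.normalize(f_x, p=2, dim=1)
f_yy_normalized = F.normalize(f_y, p=2, dim=1)

# Full b x b matrix: cosine of every vector in f_x with every vector in f_y
full = torch.mm(f_xx_normalized, f_yy_normalized.transpose(0, 1))

# Only the pairwise cosines: vector i of f_x with vector i of f_y
diag = torch.einsum('bn,bn->b', f_xx_normalized, f_yy_normalized)

# The einsum result matches the diagonal of the full matrix
print(torch.allclose(diag, torch.diagonal(full)))
```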

You then have a vector of length b (instead of the matrix) with the cosines of the angles, in particular values between -1 and 1. If you want to keep the structure, using `torch.mean` in place of `torch.sum` keeps the result in [-1, 1], and then `cosine_loss = 1 - torch.mean(torch.einsum('bn,bn->b', f_xx_normalized, f_yy_normalized))` is between 0 and 2.
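Putting that together, a corrected version of the loss function might look like this (the batch shape here is just an example):

```python
import torch
import torch.nn.functional as F

def cosine_loss(f_x, f_y):
    f_xx_normalized = F.normalize(f_x, p=2, dim=1)
    f_yy_normalized = F.normalize(f_y, p=2, dim=1)
    # Mean cosine over the batch lies in [-1, 1], so the loss lies in [0, 2]
    return 1 - torch.mean(torch.einsum('bn,bn->b', f_xx_normalized, f_yy_normalized))

loss = cosine_loss(torch.randn(8, 16), torch.randn(8, 16))
print(loss.item())  # a scalar between 0 and 2
```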

I might add that it is inefficient to normalize the vectors just to take the scalar product; it would probably be better to divide the dot product by the norms directly.

Best regards

Thomas

Hi Thomas,