# A numerical difference between torch.cdist and manually unsqueezing tensors

I ran the following two functions, which are meant to compute the same thing, the negative L1 distance $-\|x - w\|_1 = -\sum_k |x_k - w_k|$:

    import torch

    def via_unsqueeze(W_col, X_col):
        # W_col: (M, K), X_col: (K, N) -> broadcast to (M, K, N), then sum over K
        output = -(W_col.unsqueeze(2) - X_col.unsqueeze(0)).abs().sum(1)
        return output


and

    def via_cdist(W, X):
        # W: (M, K), X: (N, K) -> pairwise L1 (p=1) distances of shape (M, N)
        output = -torch.cdist(W, X, 1)
        return output


Sometimes the two results differ by up to about 2e-6 when X and W are drawn from a normal distribution.

I then checked them with a test:

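    def a():
        W = torch.rand(1000, 300)
        X = torch.rand(1000, 300)
        # via_unsqueeze expects X with shape (K, N), so pass the transpose;
        # passing X as-is would fail to broadcast (300 vs. 1000 in dim 1)
        return (via_unsqueeze(W, X.t()) - via_cdist(W, X)).abs()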


The output looks something like this:

    tensor([[0.0000e+00, 3.8147e-05, 2.2888e-05,  ..., 2.2888e-05, 3.8147e-05,
             1.5259e-05],
            [3.0518e-05, 1.5259e-05, 0.0000e+00,  ..., 0.0000e+00, 3.0518e-05,
             2.2888e-05],
            [3.0518e-05, 6.8665e-05, 1.5259e-05,  ..., 7.6294e-06, 1.5259e-05,
             7.6294e-06],
            ...,
            [7.6294e-06, 1.5259e-05, 1.5259e-05,  ..., 3.0518e-05, 1.5259e-05,
             3.8147e-05],
            [3.0518e-05, 7.6294e-06, 2.2888e-05,  ..., 4.5776e-05, 7.6294e-06,
             7.6294e-06],
            [3.0518e-05, 7.6294e-06, 3.8147e-05,  ..., 1.5259e-05, 3.8147e-05,
             3.0518e-05]])


So I am wondering how this happens and how to fix it.

The difference is most likely caused by limited floating-point precision combined with a different order of operations. Floating-point addition is not associative, so summing the same values in a different order can give slightly different results. You could use torch.float64 as the data type if you need more precision, but note that this data type would cause a performance hit on GPUs.
Here is a simple example:

    x = torch.randn(100, 100)
    sum1 = x.sum()
    sum2 = x.sum(0).sum(0)
    print((sum1 - sum2).abs().max())
    # > tensor(2.2888e-05)

    x = torch.randn(100, 100, dtype=torch.float64)
    sum1 = x.sum()
    sum2 = x.sum(0).sum(0)
    print((sum1 - sum2).abs().max())
    # > tensor(2.1316e-14, dtype=torch.float64)
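
The same check can be applied to the two functions from the question. What follows is only a minimal sketch, not a definitive test. It assumes smaller shapes than in the question to keep memory use low, a fixed seed that is arbitrary, and a helper name (`max_abs_diff`) and `allclose` tolerances that are my own choices for illustration. It compares the results in float32 and float64 and then uses `torch.allclose` instead of expecting exact equality:

```python
import torch

def via_unsqueeze(W, X_col):
    # W: (M, K), X_col: (K, N) -> (M, N) negative L1 distances
    return -(W.unsqueeze(2) - X_col.unsqueeze(0)).abs().sum(1)

def via_cdist(W, X):
    # W: (M, K), X: (N, K) -> (M, N) negative L1 distances
    return -torch.cdist(W, X, p=1.0)

def max_abs_diff(dtype):
    # Hypothetical helper: run both versions on the same data in the given dtype
    torch.manual_seed(0)  # arbitrary seed, only for repeatability
    W = torch.rand(100, 300, dtype=dtype)
    X = torch.rand(100, 300, dtype=dtype)
    a = via_unsqueeze(W, X.t())  # transpose so X has shape (K, N)
    b = via_cdist(W, X)
    return a, b, (a - b).abs().max().item()

a32, b32, diff32 = max_abs_diff(torch.float32)
a64, b64, diff64 = max_abs_diff(torch.float64)
print("float32 max abs diff:", diff32)
print("float64 max abs diff:", diff64)

# Compare with a tolerance rather than exact equality (tolerances are assumptions)
close32 = torch.allclose(a32, b32, rtol=1e-5, atol=1e-8)
print("float32 results close:", close32)
```

If the float64 difference is many orders of magnitude smaller than the float32 one, that points to rounding rather than a bug in either function. The results here have magnitude around 100, so in float32 a difference of a few times 1e-5 is only a handful of units in the last place.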


Thank you for the answer.