Standard deviation

Hi, I want to calculate the standard deviation of this distance:
de+=distance.euclidean(output,input)

This is what I tried:
print(de.std(dim=1))

Then I get the following error: AttributeError: 'float' object has no attribute 'std'
What am I doing wrong here?

Hi,

Based on your error, I think the Euclidean distance function returns a single float for any two n-dimensional arrays, so after de+= the variable de is always a float. A standard deviation cannot be computed by calling std on a single float.
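
To illustrate the point above, here is a minimal sketch using only the standard library. It assumes that math.dist can stand in for distance.euclidean (the library used in the question is not known at this point), and the point pairs are made-up data. Summing the distances leaves one float, while keeping them in a list makes mean and standard deviation well defined.

```python
import math
import statistics

# Hypothetical data: three pairs of points standing in for (output, input).
pairs = [
    ([0.0, 0.0], [3.0, 4.0]),
    ([1.0, 1.0], [1.0, 1.0]),
    ([0.0, 0.0], [6.0, 8.0]),
]

# Summing distances yields one plain float; a float has no .std() method.
total = 0.0
for a, b in pairs:
    total += math.dist(a, b)

# Keeping each distance in a list makes the statistics meaningful.
distances = [math.dist(a, b) for a, b in pairs]
mean_distance = statistics.mean(distances)
std_distance = statistics.pstdev(distances)  # population standard deviation
```

Whether the population (pstdev) or sample (stdev) standard deviation is appropriate depends on the use case.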

Could you please tell me which library you are using to calculate the Euclidean distance?

Typically, though, you can calculate it with torch.dist(output, input, 2); see the torch.dist documentation.

Bests

Wrong question, sorry. How can I calculate the mean of de?

       ne=ne.detach().cpu().numpy()           
       naa=naa.detach().cpu().numpy()           
       de=0
       for ne, naa in zip(ne, naa):          
         de+=distance.euclidean(naa,ne)

I think torch.dist(ne, naa, 2) will do the trick. There is no need to convert the tensors to NumPy or move them to the CPU.

I need to do it the way I did because of other things in the program. All I want is to calculate the mean of the values held in the variable de.
I know it is very easy, but can you explain how I can do that?

Could you please print the shape of ne and naa?

Of course. Both have shape (784,).

With inputs of this shape, de can only be a single float, and the mean of a single number is the number itself.
In the loop you posted, you are adding 784 distances, which results in a single number. Are you looking for a list of 784 separate distances, and then the mean of that list?

This loop keeps printing the distance values and ends once it has iterated over all the data.

      ne=ne.detach().cpu().numpy()           
      naa=naa.detach().cpu().numpy()           
      de=0
      for ne, naa in zip(ne, naa):          
        de+=distance.euclidean(naa,ne)
      print(de/size_batch)

I want to calculate the mean of all the printed values. Is it clear now what I want to do?
The print statement outputs values like:
3.21412
3.124124
3.112412
3.42121
And I want to calculate the mean of all these printed values.

In that case you are doing fine. Just use different variable names inside the loop:

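import numpy as np

# Random stand-ins for the real data, with the shape reported above
ne = np.random.randn(784,)
naa = np.random.randn(784,)

de = np.zeros((1,))
for ne_, naa_ in zip(ne, naa):
    # For scalars, sqrt((a - b)**2) is the absolute difference
    de += np.sqrt((ne_ - naa_)**2)
de / ne.shape[0]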

But loops are not efficient. A more optimized way to do this with NumPy:

np.sum(np.sqrt((ne - naa)**2)) / ne.shape[0]

Sklearn:

from sklearn.metrics.pairwise import euclidean_distances
np.sum(np.diag(euclidean_distances(ne.reshape(-1, 1), naa.reshape(-1, 1))))/ne.shape[0]
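
As a quick check, here is a sketch showing that the loop and the vectorized NumPy expression give the same number. The seeded random arrays are stand-ins for the real data, using the shape (784,) reported earlier in the thread. Note that for scalars sqrt((a - b)**2) is just |a - b|, so this quantity is the mean absolute element-wise difference, which is not the same as the single Euclidean distance between the two full vectors (shown last for comparison).

```python
import numpy as np

# Seeded random stand-ins for the real data (assumption: shape (784,)).
rng = np.random.default_rng(0)
ne = rng.standard_normal(784)
naa = rng.standard_normal(784)

# Loop version: accumulate the per-element distances, then divide by the count.
de = 0.0
for ne_, naa_ in zip(ne, naa):
    de += np.sqrt((ne_ - naa_) ** 2)
loop_mean = de / ne.shape[0]

# Vectorized version: same quantity without a Python loop.
vectorized_mean = np.mean(np.abs(ne - naa))

# For comparison: one Euclidean distance between the two whole vectors.
vector_distance = np.linalg.norm(ne - naa)
```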

Sorry that I did not understand your question very well.

Bests

Thank you for the detailed explanation, you helped me a lot. Just out of curiosity, how can I calculate the mean in my previous loop?
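
One possible way to do that, as a sketch: instead of only printing each value, append it to a list and compute the statistics once the outer loop has finished. The list name batch_means is hypothetical, and the four numbers are the sample values quoted earlier in the thread, standing in for what print(de/size_batch) shows.

```python
import statistics

# Sample values from the thread, standing in for de / size_batch per batch.
printed_values = [3.21412, 3.124124, 3.112412, 3.42121]

batch_means = []  # hypothetical name: one entry per printed value
for value in printed_values:
    # In the real loop this would be: batch_means.append(de / size_batch)
    batch_means.append(value)

# After the loop, the list supports both mean and standard deviation.
overall_mean = statistics.mean(batch_means)
overall_std = statistics.stdev(batch_means)  # sample standard deviation
```

Because the values are kept in a list rather than summed, the standard deviation asked about at the start of the thread falls out of the same data.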