K-means implementation with PyTorch

alex_gilabert · June 22, 2020, 11:09am

Hello.

I am trying to implement a k-means algorithm for a CNN. The first step is to calculate the k-means centroids.

I have a tensor of dims [80, 1000] holding the features of one layer of the CNN.
Then I randomly create a tensor of the same dims to serve as the initial centroids.

I calculate the Euclidean distance between the two tensors and take the minimum along one dimension. Then I assign the new centroids by using the resulting indices to select rows of the feature tensor.
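The assignment step described above can be sketched on a small CPU example. This is only an illustration under assumed toy shapes (6 feature rows, 2 centroids) rather than the [80, 1000] tensor; the names `features`, `centroids`, and `assign_to_nearest` are hypothetical.

```python
import torch

def assign_to_nearest(features, centroids):
    """Return, for each row of `features`, the index of its closest centroid."""
    # dists[i, j] is the Euclidean distance between features[i] and centroids[j].
    dists = torch.cdist(features, centroids, p=2.0)
    # argmin over dim=1 picks the nearest centroid for every feature row.
    return torch.argmin(dists, dim=1)

# Toy data (assumed for illustration): two well-separated groups in 2-D.
features = torch.tensor([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                         [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centroids = torch.tensor([[0.0, 0.0], [5.0, 5.0]])

labels = assign_to_nearest(features, centroids)
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]
```

Note that the returned indices refer to rows of `centroids`, not rows of `features`, so they range from 0 to the number of centroids minus one.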

My problem is that I want to repeat this until the centroids stop changing position, but I am not sure how to do that or how to keep updating the centroid values on each iteration.

The code is the following:

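import torch

def k_means_torch(dictionary, model):
    # Random initial centroids, one per row of `dictionary`.
    centroids = torch.randn(len(dictionary), 1000).cuda()
    # Pairwise Euclidean distances between features and centroids.
    dist_centroids = torch.cdist(dictionary, centroids, p=2.0)
    # For each feature row, the index of its nearest centroid.
    (values, indices) = torch.min(dist_centroids, dim=1)
    # Use those indices to select rows of `dictionary` as the new centroids.
    centroids_new = dictionary[indices]
    while True:
        dist_centroids_loop = torch.cdist(dictionary, centroids_new, p=2.0)
        (values_, indices_) = torch.min(dist_centroids_loop, dim=1)

        # Stop once the centroids no longer change.
        if torch.equal(centroids_new, dictionary[indices_]):
            break
        else:
            centroids_new = (centroids_new + dictionary[indices_]) / 2
    return centroids_new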
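For comparison, here is a minimal sketch of the standard k-means (Lloyd) iteration, which is not the update used in the code above: each centroid is replaced by the mean of the points assigned to it, and the loop stops when no centroid coordinate moves by more than a tolerance. The function name `kmeans_sketch`, the parameters `k`, `tol`, `max_iter`, and `seed`, and the choice to initialise from random data rows are all assumptions made for illustration.

```python
import torch

def kmeans_sketch(features, k, tol=1e-5, max_iter=100, seed=0):
    """Minimal Lloyd-style k-means; returns (centroids, labels)."""
    gen = torch.Generator().manual_seed(seed)
    # Assumption: initialise centroids from k distinct random rows of the data.
    init_idx = torch.randperm(features.shape[0], generator=gen)[:k]
    centroids = features[init_idx.to(features.device)].clone()

    for _ in range(max_iter):
        # Assignment step: nearest centroid for every row.
        labels = torch.cdist(features, centroids, p=2.0).argmin(dim=1)

        # Update step: mean of the rows assigned to each centroid.
        new_centroids = centroids.clone()
        for j in range(k):
            members = features[labels == j]
            if members.shape[0] > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(dim=0)

        # Convergence test with a tolerance instead of exact equality.
        converged = bool(torch.all(torch.abs(new_centroids - centroids) < tol))
        centroids = new_centroids
        if converged:
            break

    return centroids, labels

# Toy data (assumed for illustration): two well-separated groups in 2-D.
features = torch.tensor([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
                         [5.0, 5.0], [5.2, 5.0], [5.0, 5.2]])
centroids, labels = kmeans_sketch(features, k=2)
print(centroids)
print(labels.tolist())
```

The per-cluster Python loop is kept for readability; with many clusters a vectorised update would likely be faster, but that is beyond this sketch.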

Any ideas are more than welcome.

alex_gilabert · June 23, 2020, 2:34pm

This code resolves the precision problems.

a = torch.all(torch.lt(torch.abs(torch.add(centroids_new, -new_centers)), 1e-5))

The tolerance absorbs tiny discrepancies such as the trailing digit in 1.0000000001, which appear because the values are floats.
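To illustrate why exact equality can keep the loop from ever stopping, the sketch below compares `torch.equal` with the tolerance check above and with `torch.allclose`, a built-in that may serve the same purpose. The tensors `old_centers` and `new_centers` hold made-up values chosen for illustration.

```python
import torch

# Made-up centroids that differ only by a tiny floating-point amount.
old_centers = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
new_centers = old_centers + 1e-6

# Exact comparison: any difference at all counts as "changed".
exact = torch.equal(old_centers, new_centers)

# Tolerance check as in the post: every element differs by less than 1e-5.
within_tol = bool(torch.all(torch.lt(torch.abs(old_centers - new_centers), 1e-5)))

# Built-in alternative: allclose with a purely absolute tolerance.
close = torch.allclose(old_centers, new_centers, rtol=0.0, atol=1e-5)

print(exact, within_tol, close)  # False True True
```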