K-Means for Tensors

shivangi · June 4, 2018, 9:35am

Is there a clean way to do K-Means clustering on tensor data without converting it to a NumPy array?

I have a list of tensors and their corresponding labels, and this is what I am doing:

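from sklearn import metrics
from sklearn.cluster import KMeans

def evaluateKMeansRaw(data, true_labels, n_clusters):
    # cluster_acc is a user-defined helper that is not shown in this post
    kmeans = KMeans(n_clusters=n_clusters, n_init=20)
    kmeans.fit(data)

    acc = cluster_acc(true_labels, kmeans.labels_)
    nmi = metrics.normalized_mutual_info_score(true_labels, kmeans.labels_)
    return acc, nmi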

But this doesn’t work on the output of a conv layer.

justusschock · June 4, 2018, 10:04am

You cannot use scikit-learn directly on tensors, as scikit-learn (and all of its methods) only works on NumPy arrays.
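For the conv-layer case in the question, one common workaround is to flatten the activations and convert them first. Here is a minimal sketch, assuming the activations have shape (N, C, H, W). The function name `conv_features_to_numpy` is made up for illustration and is not part of PyTorch or scikit-learn:

```python
import numpy as np
import torch


def conv_features_to_numpy(features: torch.Tensor) -> np.ndarray:
    """Flatten (N, C, H, W) activations into an (N, C*H*W) NumPy array."""
    # detach() drops the autograd graph; cpu() moves the data off the GPU if needed
    flat = features.detach().cpu().flatten(start_dim=1)
    return flat.numpy()


# Fake activations standing in for a conv layer's output (assumed shape)
activations = torch.randn(8, 16, 4, 4, requires_grad=True)
array = conv_features_to_numpy(activations)
print(array.shape)  # (8, 256)
```

The resulting 2-D array should be usable as the `data` argument of `evaluateKMeansRaw` above, since `KMeans.fit` expects input of shape (n_samples, n_features).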

However, if you are not afraid to use custom implementations, you could use something like this
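To give an idea of what such a custom implementation can look like, here is a rough sketch of plain Lloyd's k-means written with PyTorch ops only. It is not the code behind the link above. The name `kmeans_torch` and its parameters are assumptions for illustration, and the initialisation is simple random sampling rather than k-means++:

```python
import torch


def kmeans_torch(x, n_clusters, n_iters=100, seed=0):
    """Lloyd's k-means on an (N, D) float tensor; returns (labels, centers)."""
    gen = torch.Generator().manual_seed(seed)
    # Initialise the centers from randomly chosen, distinct data points
    idx = torch.randperm(x.shape[0], generator=gen)[:n_clusters].to(x.device)
    centers = x[idx].clone()
    labels = None
    for _ in range(n_iters):
        # Pairwise Euclidean distances, shape (N, n_clusters)
        dists = torch.cdist(x, centers)
        new_labels = dists.argmin(dim=1)
        # Stop once the assignment no longer changes
        if labels is not None and torch.equal(new_labels, labels):
            break
        labels = new_labels
        for k in range(n_clusters):
            members = x[labels == k]
            if members.shape[0] > 0:  # keep the old center for an empty cluster
                centers[k] = members.mean(dim=0)
    return labels, centers


# Two well-separated blobs as toy data
torch.manual_seed(0)
x = torch.cat([torch.randn(50, 2) * 0.1 + 5.0,
               torch.randn(50, 2) * 0.1 - 5.0])
labels, centers = kmeans_torch(x, n_clusters=2)
```

Everything stays a tensor, so the same function should also run on a GPU tensor. For real workloads you would probably want several restarts, similar to `n_init` in scikit-learn.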

shivangi · June 4, 2018, 10:18am

Do we not have these algorithms as part of the framework as of now?

subhadarshi · February 4, 2020, 11:35pm

Check out this GitHub repo. It can be installed in a breeze using pip:

pip install kmeans-pytorch

Find the documentation here.

JosueCom · July 4, 2021, 4:30pm

I implemented NN, KNN and KMeans using only PyTorch in a project I am working on. You can find the implementation, with an example, here: Nearest Neighbor, K Nearest Neighbor and K Means (NN, KNN, KMeans) only using PyTorch · GitHub

>>> import torch as th
>>> from clustering import KNN
>>> data = th.Tensor([[1, 1], [0.88, 0.90], [-1, -1], [-1, -0.88]])
>>> labels = th.LongTensor([3, 3, 5, 5])
>>> test = th.Tensor([[-0.5, -0.5], [0.88, 0.88]])
>>> knn = KNN(data, labels)
>>> knn(test)
tensor([5, 3])
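For readers who cannot open the link, a k-nearest-neighbour classifier along these lines can be sketched in a few lines of PyTorch. This is not the linked implementation: the class name `SimpleKNN` and its `k` argument are hypothetical, and ties are resolved by whatever `torch.mode` returns.

```python
import torch


class SimpleKNN:
    """Illustrative k-NN classifier (hypothetical, not the linked code)."""

    def __init__(self, data, labels, k=1):
        self.data = data      # (N, D) float tensor of reference points
        self.labels = labels  # (N,) long tensor of class labels
        self.k = k

    def __call__(self, queries):
        # Euclidean distance from every query to every reference point: (Q, N)
        dists = torch.cdist(queries, self.data)
        # Indices of the k closest reference points for each query: (Q, k)
        nearest = dists.topk(self.k, dim=1, largest=False).indices
        # Majority vote over the neighbours' labels
        return self.labels[nearest].mode(dim=1).values


data = torch.tensor([[1, 1], [0.88, 0.90], [-1, -1], [-1, -0.88]])
labels = torch.tensor([3, 3, 5, 5])
test = torch.tensor([[-0.5, -0.5], [0.88, 0.88]])
pred = SimpleKNN(data, labels, k=3)(test)
```

On the same toy data as the example above, the first query sits nearest the points labelled 5 and the second nearest the points labelled 3.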

JosueCom · October 19, 2023, 6:49pm

This is an updated link