Apply KMeans on MNIST dataset

Hello,

I'm trying to apply KMeans clustering to the MNIST dataset.

Please see my code below:


import torch
from torchvision import transforms
import torchvision.datasets as datasets
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

_tasks = transforms.Compose([transforms.ToTensor()])
mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=_tasks)
kmeans = KMeans(n_clusters=10)
KMmodel = kmeans.fit(mnist_trainset)

I receive the following error:

Can someone help please???

THANKS

mnist_trainset is a torchvision Dataset, while sklearn works on plain NumPy arrays.
You could get the underlying data as a NumPy array via:

data = mnist_trainset.data.numpy()

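and normalize it manually. Note that .data holds the raw uint8 images with shape (60000, 28, 28) and skips the ToTensor transform, so besides dividing by 255 you also need to flatten each image into a 784-element row, since KMeans.fit expects a 2D array of shape (n_samples, n_features).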
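
Here is a minimal sketch of that flatten-and-normalize step, in case it helps. The helper name flatten_and_scale is my own, and the random array is only a stand-in with the same dtype and layout as mnist_trainset.data.numpy(), so the snippet runs without downloading anything. The n_init and random_state values are arbitrary choices, not something from the original post.

```python
import numpy as np
from sklearn.cluster import KMeans


def flatten_and_scale(images):
    """Turn (N, 28, 28) uint8 images into an (N, 784) float array in [0, 1]."""
    images = np.asarray(images)
    flat = images.reshape(len(images), -1)    # one row per image
    return flat.astype(np.float32) / 255.0    # manual normalization


# Stand-in for mnist_trainset.data.numpy() (assumption: uint8, shape (N, 28, 28)).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(200, 28, 28), dtype=np.uint8)

X = flatten_and_scale(images)
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)                # one cluster index per image

print(X.shape)                        # (200, 784)
print(kmeans.cluster_centers_.shape)  # (10, 784)
```

With the real dataset you would pass mnist_trainset.data.numpy() to the helper instead of the random array. Keep in mind the cluster indices are arbitrary and will not line up with the digit labels 0-9.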