Apply KMeans on MNIST dataset


I’m trying to apply KMeans clustering to the MNIST dataset.

Please see my code below:

import torch
from torchvision import transforms
import torchvision.datasets as datasets
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

_tasks = transforms.Compose([transforms.ToTensor()])
mnist_trainset = datasets.MNIST(root='./data', train=True, download=True, transform=_tasks)
kmeans = KMeans(n_clusters=10)
KMmodel =

I receive the following error:

Can someone help, please?


mnist_trainset is a Dataset, while sklearn works on plain NumPy arrays.
You could get the underlying data as a NumPy array via:

data =

and normalize it manually.
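
For what it's worth, here is a minimal sketch of how that could look end to end. It assumes the torchvision MNIST object exposes its raw images as a uint8 tensor of shape (N, 28, 28) in its `.data` attribute, and that dividing by 255 is the normalization you want. The helper names `flatten_and_scale` and `cluster_digits` are made up for illustration, and the random array at the bottom is only a stand-in so the snippet runs without downloading anything.

```python
import numpy as np
from sklearn.cluster import KMeans


def flatten_and_scale(images):
    """Turn (N, 28, 28) uint8 images into an (N, 784) float array in [0, 1]."""
    images = np.asarray(images, dtype=np.float32)
    # KMeans expects a 2-D array: one row per sample, one column per pixel.
    return images.reshape(len(images), -1) / 255.0


def cluster_digits(dataset, n_clusters=10, seed=0):
    # Assumption: `dataset` is a torchvision MNIST object whose `.data`
    # attribute is a uint8 tensor of shape (N, 28, 28). Reading `.data`
    # bypasses the `transform`, hence the manual scaling above.
    X = flatten_and_scale(dataset.data.numpy())
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit(X)


# Synthetic stand-in images so the sketch runs without downloading MNIST.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(200, 28, 28), dtype=np.uint8)
X = flatten_and_scale(fake_images)
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X)
print(X.shape, labels.shape)
```

With the real dataset you would call something like `cluster_digits(mnist_trainset)`. Keep in mind that the cluster indices are arbitrary, so cluster 3 will not necessarily correspond to the digit 3.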