How to deal with autoencoder (or other) encodings for similarity inference and clustering

I trained an autoencoder that accepts an image and produces an encoding from the encoder. The encoding is 64x64x64, so flattening it gives a row vector of size 262144.

I have 5000 images, and given a test image (which goes through the encoder), I need to find the n most similar encodings and their corresponding images, and maybe later cluster the whole dataset.

One GitHub repo owner seems to have done it by concatenating the image encodings into a 5000 x 262144 matrix, and then running kNN on it with the test encoding.
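For reference, this is roughly what I understand that approach to be. It is only a sketch with NumPy and tiny random stand-in data (50 rows of 16 values instead of 5000 x 262144); the function name `knn_indices` is mine, not from the repo, and I'm assuming plain Euclidean distance.

```python
import numpy as np

def knn_indices(encodings, query, n):
    """Return indices of the n rows of `encodings` closest to `query` (Euclidean)."""
    diffs = encodings - query                      # broadcast the query across all rows
    dists = np.einsum("ij,ij->i", diffs, diffs)    # squared L2 distance per row
    return np.argsort(dists)[:n]                   # indices sorted nearest first

# Stand-in data (assumption): 50 "images" with 16-dim encodings.
rng = np.random.default_rng(0)
encodings = rng.normal(size=(50, 16)).astype(np.float32)
query = encodings[7] + 0.001   # a query that is almost identical to row 7

nearest = knn_indices(encodings, query, n=3)
print(nearest)  # row 7 should come first
```

The returned indices would then be used to look up the corresponding image files.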

I can’t do this on Colab because the instance crashes when RAM fills up.
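A quick back-of-the-envelope check of how big that matrix is. I'm assuming the encodings are stored as float32 (4 bytes per value); with float64 it doubles.

```python
n_images = 5000
encoding_dim = 64 * 64 * 64          # 262144 values per flattened encoding

bytes_float32 = n_images * encoding_dim * 4   # assumption: float32 storage
bytes_float64 = n_images * encoding_dim * 8   # float64 for comparison

gib = 1024 ** 3
print(f"float32: {bytes_float32 / gib:.2f} GiB")   # float32: 4.88 GiB
print(f"float64: {bytes_float64 / gib:.2f} GiB")   # float64: 9.77 GiB
```

On top of that, concatenating a list of per-image arrays allocates a new array while the originals are still alive, so the peak usage is roughly twice the final size.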

I tried using a regular for loop to encode each picture and save it as a .npy file, but that was extremely painful since Colab just freezes when the loop covers more than 500 images.
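One alternative I'm considering instead of one .npy per image is writing batches straight into a single preallocated on-disk array via `np.lib.format.open_memmap`. This is a minimal sketch, not tested on Colab at full size: `encode_batch` is a hypothetical stand-in for my real encoder (here it just flattens the input), and the toy images and batch size are made up.

```python
import os
import tempfile
import numpy as np

def encode_batch(images):
    """Hypothetical stand-in for the trained encoder: one flat float32 vector per image."""
    return images.reshape(len(images), -1).astype(np.float32)

def encode_to_memmap(images, path, batch_size):
    """Encode `images` batch by batch, writing rows into one on-disk .npy file."""
    n = len(images)
    dim = encode_batch(images[:1]).shape[1]        # probe the encoding size once
    out = np.lib.format.open_memmap(
        path, mode="w+", dtype=np.float32, shape=(n, dim)
    )
    for start in range(0, n, batch_size):
        stop = min(start + batch_size, n)
        out[start:stop] = encode_batch(images[start:stop])  # only one batch in RAM
    out.flush()                                    # make sure everything hits the disk
    del out
    return n, dim

# Toy data (assumption): 20 small "images" instead of 5000 real ones.
rng = np.random.default_rng(0)
images = rng.random((20, 4, 4, 2), dtype=np.float32)
path = os.path.join(tempfile.mkdtemp(), "encodings.npy")

shape = encode_to_memmap(images, path, batch_size=8)
loaded = np.load(path, mmap_mode="r")              # opens lazily, does not read it all
print(shape, loaded.shape)
```

In the real case the images themselves would also have to be loaded from disk one batch at a time rather than held in a single array as in this toy example.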

Even if I do have the .npy files, I’m not sure how to classify or cluster them, since I can’t fit the whole thing in RAM.
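To make the question concrete, this is the kind of out-of-core search I have in mind: open the encodings file with `mmap_mode="r"`, scan it in chunks, and keep only a running top-n. It is a sketch on toy data, assuming all encodings sit in one 2-D .npy file and that Euclidean distance is a reasonable similarity measure; `knn_out_of_core` and `chunk_rows` are names I made up.

```python
import os
import tempfile
import numpy as np

def knn_out_of_core(path, query, n, chunk_rows):
    """Brute-force kNN over an on-disk .npy matrix, reading `chunk_rows` rows at a time."""
    enc = np.load(path, mmap_mode="r")             # memory-mapped, not loaded into RAM
    best_d = np.empty(0, dtype=np.float64)         # running best squared distances
    best_i = np.empty(0, dtype=np.int64)           # and their row indices
    for start in range(0, enc.shape[0], chunk_rows):
        chunk = np.asarray(enc[start:start + chunk_rows], dtype=np.float32)
        diffs = chunk - query
        d = np.einsum("ij,ij->i", diffs, diffs)    # squared L2 distance per row
        idx = np.arange(start, start + len(chunk))
        # Merge this chunk's candidates with the current best and keep the n smallest.
        best_d = np.concatenate([best_d, d])
        best_i = np.concatenate([best_i, idx])
        keep = np.argsort(best_d)[:n]
        best_d, best_i = best_d[keep], best_i[keep]
    return best_i, np.sqrt(best_d)

# Toy data (assumption): 100 encodings of 32 values saved to a temporary file.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 32)).astype(np.float32)
path = os.path.join(tempfile.mkdtemp(), "encodings.npy")
np.save(path, data)

query = data[42] + 0.001                           # nearly identical to row 42
idx, dist = knn_out_of_core(path, query, n=5, chunk_rows=16)
print(idx, dist)
```

For clustering, I assume the same chunked reading could feed an incremental algorithm, for example scikit-learn's `MiniBatchKMeans.partial_fit`, possibly after reducing dimensionality with `IncrementalPCA`, but I haven't tried that and don't know whether it is the right approach.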