I have an assignment due and I am positively freaking out. I have spent 5 hours trying to load 2 classes from the FashionMNIST dataset, but I simply cannot figure it out.
I can then load a batch of 8 images from the trouser/sneaker classes, which works, but THEN I try applying StandardScaler() or PCA, and it will not work.
It raises 'only one element tensors can be converted to Python scalars', and if I try using the subset in any other code, it throws attribute errors, e.g. AttributeError: 'Subset' object has no attribute 'numpy'.
I am trying to apply PCA to only 2 classes (trouser/sneaker) from the FashionMNIST dataset.
How are you trying to apply the StandardScaler?
As described in the docs, you would need to fit it first and can then apply it to numpy arrays.
Passing a Dataset will most likely not work.
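As a minimal sketch of that fit-then-transform pattern (a random array stands in for your flattened images here, not your actual data):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in for a batch of 8 flattened 28x28 images.
X = np.random.rand(8, 784).astype(np.float32)

scaler = StandardScaler()
scaler.fit(X)                   # learn per-feature mean and std
X_scaled = scaler.transform(X)  # apply; fit_transform(X) does both in one call

print(X_scaled.shape)  # (8, 784)
```

The key point is that `X` is a plain 2D numpy array, not a `Dataset` or a `Subset`.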
but then it gives me the 'only one element tensors can be converted to Python scalars' error, so I just assumed it was because of the way I was loading the data?
Sorry about this - also, when I try to convert the dset_train to a numpy array, it gives me the 'Subset' object has no attribute 'numpy' error?
I have also tried converting the trainset to a numpy array before making the subset, but that did not work either; it gives a 'ValueError: too many values to unpack (expected 2)' error.
StandardScaler expects a numpy array, not a torch.utils.data.Dataset as its input.
The FashionMNIST dataset stores the samples as a tensor in its internal .data attribute (and the labels in .targets), which you can convert to numpy.
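For illustration only (random tensors below stand in for the real .data/.targets so the sketch runs without downloading anything), pulling your two classes out into numpy arrays could look like:

```python
import torch

# Stand-ins for FashionMNIST attributes: .data is a uint8 tensor of
# shape [N, 28, 28] and .targets holds the integer class labels.
data = torch.randint(0, 256, (100, 28, 28), dtype=torch.uint8)
targets = torch.randint(0, 10, (100,))

# Trouser is class 1 and sneaker is class 7 in FashionMNIST.
mask = (targets == 1) | (targets == 7)
X = data[mask].reshape(-1, 28 * 28).float().numpy()  # [M, 784] for sklearn
y = targets[mask].numpy()

print(X.shape, y.shape)
```

With the real dataset you would index `dataset.data` and `dataset.targets` the same way; `X` is then ready for StandardScaler or PCA.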
Since you want to apply scikit-learn preprocessing methods to it, I would recommend writing a custom Dataset derived from FashionMNIST, calling fit_transform with the StandardScaler in the __init__ method, and transforming the data in __getitem__.
If you want to use the same scaler for the training and validation (and test) datasets, you could create the scaler object once and pass it to each custom dataset implementation.
You could take a look at this tutorial, which shows how to write a custom Dataset. Since this task is an assignment, I’m not comfortable providing the code here.