Basically I would like to do what this Stack Overflow thread asks, but the answer there does not work for me. All I want to do is copy the FashionMNIST dataset.
Here is a minimal example to show what I mean:
from torchvision import datasets, transforms
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
dat1 = datasets.FashionMNIST('./', train=True, download=True, transform=transform)
dat2 = datasets.FashionMNIST('./', train=True, download=True, transform=transform)
print(dat2.classes)  # before touching dat1
dat1.classes[0] = 'whacky name 3000'  # change the first class name on dat1 only
print(dat2.classes)  # after changing dat1
This prints:
['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
['whacky name 3000', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
I would expect dat2.classes to still be the original list, not a changed one. I only changed dat1, not dat2.
Even using deepcopy doesn’t help. Can someone explain the inner workings and suggest a solution?
I can explain the reason: classes is attached to the Dataset class rather than to a Dataset instance, so every instance shares the same list. See: vision/mnist.py at 6ca9c76adb6daf2695d603ad623a9cf1c4f4806f · pytorch/vision · GitHub
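To illustrate this without downloading anything, here is a minimal sketch using a hypothetical stand-in class (ToyDataset is not part of torchvision). It assumes, as in the linked file, that classes is defined in the class body. Under that assumption all instances share one list object, and copy.deepcopy of an instance does not duplicate it, because deepcopy only copies the instance's own attributes.

```python
import copy


class ToyDataset:
    # Hypothetical stand-in for a torchvision dataset.
    # Class attribute: a single list object shared by every instance.
    classes = ['T-shirt/top', 'Trouser', 'Pullover']

    def __init__(self):
        # Instance attribute: each instance gets its own list.
        self.targets = [0, 1, 2]


dat1 = ToyDataset()
dat2 = ToyDataset()
dat3 = copy.deepcopy(dat1)  # copies dat1's instance attributes only

dat1.classes[0] = 'whacky name 3000'  # in-place mutation of the shared list
dat1.targets[0] = 9                   # only touches dat1's own list

print(dat2.classes[0])  # the change is visible through dat2
print(dat3.classes[0])  # and through the deep copy
print(dat2.targets[0])  # per-instance data is unaffected
```

The same reasoning would explain why constructing FashionMNIST a second time does not give a "clean" classes list: the list lives on the class, not on the freshly built object.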
As for a solution, it depends on your use case. Do you want to create multiple non-overlapping subsets of a Dataset, or do you want duplicate copies of the data in a Dataset?
Thank you for the link.
I would like to change the labels for an experiment I am doing, so in each run I change the class names and labels. Ideally, I would load the dataset, change the labels and class names, train the model, and repeat this several times.
But when I change the labels and classes and then want to reload or revert, it goes haywire, because the changes made to the old dataset show up in the new one. I cannot reload the dataset or copy a temporary one.
Another solution I tried was to make a copy of just the information I needed from the dataset:
import copy
cls_copy = copy.deepcopy(dat1.classes)
targets_copy = copy.deepcopy(dat1.targets)
and then copy that information back into the dataset in each run to ‘reset’ it. This also breaks. There are dependencies I don’t quite understand.
I would still like to know why I cannot copy a dataset without keeping the reference to the old dataset or reload a clean copy.
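One possible way to keep the objects independent, sketched here with a hypothetical stand-in class (ToyDataset is not part of torchvision), is to give the instance its own list before modifying it. Assigning to dat1.classes creates an instance attribute that shadows the class attribute, so in-place edits no longer reach other instances. Whether this is sufficient for the real FashionMNIST depends on what else reads classes, which I have not checked.

```python
class ToyDataset:
    # Hypothetical stand-in: classes is a shared class attribute.
    classes = ['T-shirt/top', 'Trouser', 'Pullover']


dat1 = ToyDataset()
dat2 = ToyDataset()

# Assignment on the instance creates an instance attribute that shadows
# the class attribute; list(...) makes it a new, independent list.
dat1.classes = list(dat1.classes)
dat1.classes[0] = 'whacky name 3000'

print(dat1.classes[0])  # the modified name
print(dat2.classes[0])  # still the original name
```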
To reset, could you please try
type(dat1).classes = cls_copy?
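A rough sketch of that reset, again with a hypothetical stand-in class rather than the real FashionMNIST: take the snapshot before any changes, then rebind the class-level attribute. Rebinding to a copy of the snapshot (list(cls_copy) instead of cls_copy itself) is my own variation, on the assumption that the snapshot will be reused across several runs and should not be mutated by a later in-place edit.

```python
import copy


class ToyDataset:
    # Hypothetical stand-in: classes is a shared class attribute.
    classes = ['T-shirt/top', 'Trouser', 'Pullover']


dat1 = ToyDataset()
cls_copy = copy.deepcopy(dat1.classes)  # snapshot taken before any changes

dat1.classes[0] = 'whacky name 3000'    # mutates the shared class-level list

# Rebind the class attribute to a fresh copy of the snapshot, so the
# snapshot itself stays untouched for the next run.
type(dat1).classes = list(cls_copy)

dat2 = ToyDataset()
print(dat2.classes)  # new instances see the restored names
```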