Swapping labels

rjsdebug · June 13, 2022, 3:04pm

Hello,
I am trying to load the MNIST data with a slight twist: for some experiments, I want to change the labels of a small percentage of the training data to a different class. Is there an easy way to do this with the PyTorch DataLoader?

ptrblck · June 14, 2022, 6:01am

Depending on your use case you could use several different approaches:

- manipulate the internal .targets attribute and change some labels
- derive a custom Dataset from MNIST and change some labels in the __getitem__ on the fly
- manipulate some labels of the current batch in the training loop
- use a custom collate_fn to manipulate some targets (this seems unnecessarily complicated)
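
As a rough illustration of the first two options, here is a minimal sketch. It uses a random tensor as a stand-in for the .targets of torchvision.datasets.MNIST so that it runs without a download. The names flip_labels and LabelOverride are made up for this example and are not part of PyTorch, and the wrapper class is a variation on option two (it wraps a dataset instead of subclassing MNIST).

```python
import torch
from torch.utils.data import Dataset, TensorDataset


def flip_labels(targets, fraction=0.05, num_classes=10, seed=0):
    """Return a copy of `targets` with `fraction` of the entries moved to a
    different class, plus the indices that were changed (hypothetical helper)."""
    g = torch.Generator().manual_seed(seed)
    n = targets.numel()
    n_flip = int(fraction * n)
    flip_idx = torch.randperm(n, generator=g)[:n_flip]
    # An offset in [1, num_classes - 1] taken modulo num_classes always
    # lands on a class different from the original one.
    offsets = torch.randint(1, num_classes, (n_flip,), generator=g)
    new_targets = targets.clone()
    new_targets[flip_idx] = (targets[flip_idx] + offsets) % num_classes
    return new_targets, flip_idx


class LabelOverride(Dataset):
    """Hypothetical wrapper: serves samples from `base`, but replaces the
    label on the fly for the indices listed in `overrides`."""

    def __init__(self, base, overrides):
        self.base = base
        self.overrides = dict(overrides)  # dataset index -> new label

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        x, y = self.base[idx]
        return x, int(self.overrides.get(idx, y))


# Stand-in for `train_set.targets` (a 1-D LongTensor of class ids).
torch.manual_seed(0)
targets = torch.randint(0, 10, (1000,))

# Option 1: overwrite the stored targets once, before training.
noisy_targets, flip_idx = flip_labels(targets, fraction=0.05)
# With a real dataset this would be: train_set.targets = noisy_targets

# Option 2: leave the stored data untouched and override in __getitem__.
base = TensorDataset(torch.zeros(1000, 1, 28, 28), targets)
overrides = {int(i): int(noisy_targets[i]) for i in flip_idx}
noisy_set = LabelOverride(base, overrides)
```

Either way, keeping flip_idx around records exactly which samples were altered.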

rjsdebug · June 14, 2022, 2:10pm

Thank you, changing the targets is likely the solution for me. Does PyTorch offer some sort of image hashing, so that I can remember which images’ labels I changed even after the dataset is shuffled?
EDIT: I can just hash the image tensor from the dataset, correct? And for lookup, would I then need to use the same tensor object from the dataset, unless there is a way to hash based on the tensor's values?
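
A possible sketch for this question, under a few assumptions: shuffle=True in a DataLoader only changes the order in which indices are sampled, so the dataset index of each image stays the same and a plain set of changed indices may be enough. Calling hash() on a tensor is based on object identity rather than on its values, so a value-based lookup needs a digest computed from the contents. IndexedDataset and tensor_digest below are hypothetical names used only for illustration, and the random data stands in for MNIST.

```python
import hashlib

import torch
from torch.utils.data import DataLoader, Dataset


class IndexedDataset(Dataset):
    """Hypothetical wrapper that also returns each sample's dataset index."""

    def __init__(self, data, targets):
        self.data = data
        self.targets = targets

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, idx):
        return self.data[idx], self.targets[idx], idx


def tensor_digest(t):
    """Digest computed from shape, dtype and values, not from object identity."""
    payload = repr((tuple(t.shape), str(t.dtype), t.flatten().tolist()))
    return hashlib.sha256(payload.encode()).hexdigest()


torch.manual_seed(0)
data = torch.randint(0, 256, (100, 28, 28), dtype=torch.uint8)  # stand-in images
targets = torch.randint(0, 10, (100,))
changed = {3, 17, 42}  # example indices whose labels were altered

# Index-based tracking: the indices survive shuffling.
loader = DataLoader(IndexedDataset(data, targets), batch_size=16, shuffle=True)
seen = set()
for images, labels, idx in loader:
    # `idx` is collated into a tensor of dataset indices for this batch.
    seen.update(int(i) for i in idx if int(i) in changed)

# Value-based tracking: a copy with equal values yields the same digest.
changed_digests = {tensor_digest(data[i]) for i in changed}
same_values = data[3].clone()
is_changed = tensor_digest(same_values) in changed_digests
```

The digest only matches the stored, untransformed tensor, so an image that has already passed through a transform such as normalization or augmentation would not be found this way; the index-based lookup does not have that limitation.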