I am trying to load the MNIST data with a slight twist: for some experiments, I want to change the labels of a small percentage of the training data to a different class. Is there an easy way to do this with the PyTorch DataLoader?

Depending on your use case you could use several different approaches:

  • manipulate the internal .targets attribute and change some labels
  • derive a custom Dataset from MNIST and change some labels in the __getitem__ on the fly
  • manipulate some labels of the current batch in the training loop
  • use a custom collate_fn to manipulate some targets (this seems unnecessarily complicated)
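
A minimal sketch of the first approach, in case it helps. It uses a random tensor as a stand-in for `dataset.targets` so it runs without downloading anything; `flip_labels`, `fraction`, and `seed` are names made up for this example, not part of any library.

```python
import torch

def flip_labels(targets, fraction, num_classes, seed=0):
    """Return a copy of `targets` in which `fraction` of the entries are moved
    to a different class, together with the indices that were changed.
    (Hypothetical helper, written for illustration only.)"""
    g = torch.Generator().manual_seed(seed)
    n = targets.numel()
    n_flip = int(round(fraction * n))
    # Unique positions to corrupt.
    idx = torch.randperm(n, generator=g)[:n_flip]
    # An offset in [1, num_classes - 1] taken modulo num_classes always
    # lands on a class different from the original one.
    offsets = torch.randint(1, num_classes, (n_flip,), generator=g)
    noisy = targets.clone()
    noisy[idx] = (targets[idx] + offsets) % num_classes
    return noisy, idx

# Stand-in for the label tensor of a real dataset (assumption: 10 classes).
targets = torch.randint(0, 10, (1000,), generator=torch.Generator().manual_seed(1))
noisy_targets, flipped_idx = flip_labels(targets, fraction=0.05, num_classes=10)

# With a torchvision MNIST training set, the result would presumably be
# written back via: train_set.targets = noisy_targets
```

Keeping `flipped_idx` around is the simplest record of which samples were modified, since dataset indices do not change when a DataLoader shuffles.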

Thank you, changing targets is likely the solution for me. Does PyTorch offer some sort of image hashing, so that I can remember which images' labels I changed even after the dataset is shuffled?
EDIT: I can just hash the image tensor from the dataset, correct? And then for lookup I would need to use the same tensor object from the dataset, unless there is a way to hash based on the tensor's values?
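
On the hashing question, a rough sketch of two options, under some assumptions. Python's built-in `hash()` on a tensor is based on object identity, not content, so it only matches the very same tensor object. A value-based hash can be built with `hashlib` over the raw bytes. Alternatively, since shuffling in a DataLoader only changes the sampling order and not the dataset indices, a small wrapper can return the index with each sample. `tensor_digest` and `WithIndex` are hypothetical names, and the data below is synthetic.

```python
import hashlib
import torch
from torch.utils.data import Dataset, DataLoader, TensorDataset

def tensor_digest(t):
    """Hash a tensor by value; dtype and shape are included in the digest.
    (Hypothetical helper, not a PyTorch API.)"""
    arr = t.detach().cpu().contiguous().numpy()
    h = hashlib.sha256()
    h.update(str(arr.dtype).encode())
    h.update(str(arr.shape).encode())
    h.update(arr.tobytes())
    return h.hexdigest()

class WithIndex(Dataset):
    """Wrap a dataset so each sample also carries its dataset index."""
    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        x, y = self.base[i]
        return x, y, i

# Synthetic stand-in for MNIST-like uint8 images.
g = torch.Generator().manual_seed(0)
images = torch.randint(0, 256, (100, 28, 28), dtype=torch.uint8, generator=g)
labels = torch.zeros(100, dtype=torch.long)

changed = {3, 17, 42}  # indices whose labels were (hypothetically) altered
changed_digests = {tensor_digest(images[i]) for i in changed}

loader = DataLoader(WithIndex(TensorDataset(images, labels)),
                    batch_size=10, shuffle=True)

seen_by_index, seen_by_hash = set(), set()
for x, y, idx in loader:
    for img, i in zip(x, idx.tolist()):
        if i in changed:                           # lookup by index
            seen_by_index.add(i)
        if tensor_digest(img) in changed_digests:  # lookup by content
            seen_by_hash.add(i)
```

One caveat with the content hash: it only matches if the bytes are identical, so it should be computed on the same representation each time (for example on the raw stored image rather than on the output of a transform). Tracking indices avoids that issue.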