Change labels in Data Loader

I have a dataset of images and labels. I took a subset of it and want to change the labels of the whole subset to a single label.

E.g., MNIST has the labels 0,1,2,3,4,5,6,7,8,9; let’s say I want the labels 5,6,7,8,9 to all become 5, so the final labels would be 0,1,2,3,4,5.

How can I do this?

You could set the new value using a condition on your targets:

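from torchvision import datasets, transforms

dataset = datasets.MNIST(
    root='PATH',
    transform=transforms.ToTensor()
)

# Collapse every label greater than 5 into the single label 5
dataset.targets[dataset.targets > 5] = 5
print(dataset.targets.unique())
> tensor([0, 1, 2, 3, 4, 5])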

AttributeError: ‘MNIST’ object has no attribute ‘targets’

In older torchvision versions, you had to use train_labels or test_labels, depending on whether the train argument was set to True or False, respectively.
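If the same code has to run against several torchvision versions, one option is to check which label attribute the dataset object actually has. The following is only a sketch: it assumes the three attribute names mentioned in this thread (targets, train_labels, test_labels), and both the helper get_label_attr and the stand-in class are hypothetical, not part of torchvision.

```python
def get_label_attr(dataset):
    """Return the name of the first known label attribute on `dataset`."""
    for name in ("targets", "train_labels", "test_labels"):
        if hasattr(dataset, name):
            return name
    raise AttributeError("dataset exposes no known label attribute")


class OldStyleDataset:
    """Hypothetical stand-in that mimics an older dataset object."""

    def __init__(self):
        self.train_labels = [0, 7, 9]


label_attr = get_label_attr(OldStyleDataset())
print(label_attr)
```

The labels could then be read with getattr(dataset, label_attr) before applying the condition.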

Hi, I am wondering whether there is a way to access the targets attribute of a dataset that is loaded with ImageFolder. I have a training set with 6 classes: building, forest, sea, street, glacier, and mountain. I only want to keep the forest class label and mark all the others as "not forest". I tried this:

dataset.targets[dataset.targets != 1] = 0

which didn’t work, because it said the dataset doesn’t have the attribute targets.

Maybe you are using an older version.
Could you update torchvision and check that attribute again?
Also note that dataset.targets is a Python list in ImageFolder, so this indexing won’t work directly and you should convert it to a tensor first:

import torch
from torchvision import datasets

dataset = datasets.ImageFolder(root='PATH')
dataset.targets = torch.tensor(dataset.targets)  # list -> tensor
dataset.targets[dataset.targets == 0] = 1
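Rather than hard-coding the class index, it can be looked up by name. ImageFolder assigns indices to the class folders in sorted order and exposes the mapping as dataset.class_to_idx. Here is a small sketch that uses a plain dict as a stand-in for that attribute, with the six class names from the question; the helper forest_vs_rest is hypothetical.

```python
class_names = ["building", "forest", "sea", "street", "glacier", "mountain"]

# Stand-in for dataset.class_to_idx: folder names sorted, then enumerated.
class_to_idx = {name: idx for idx, name in enumerate(sorted(class_names))}

forest_idx = class_to_idx["forest"]


def forest_vs_rest(target):
    """Map the forest class to 1 and every other class to 0."""
    return 1 if target == forest_idx else 0


targets = [0, 1, 2, 3, 4, 5, 1]
binary_targets = [forest_vs_rest(t) for t in targets]
print(forest_idx, binary_targets)
```

With a real dataset, the same function could probably be passed as target_transform=forest_vs_rest when constructing the ImageFolder, so that each label is remapped as the sample is loaded.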

Thanks for replying to me. I used conda update torchvision and my torchvision version is 0.2.1 now. It’s still not working. Is that the latest?

Could be. I usually just install torchvision from source, as it is really easy and gives you all the new features.
You would have to clone the repo and run python setup.py install as described here.

Thanks! I installed from source and it’s working now! Any idea why the pip and conda distributions don’t have 0.2.3 right now?

Hi, I have a follow-up question on that. I successfully converted the targets attribute, but when I load the data with a DataLoader it still returns the original targets.

train_set_two = datasets.ImageFolder(train_dir, transform=transform)
train_set_two.targets = torch.tensor(train_set_two.targets)
test_set_two.targets = torch.tensor(test_set_two.targets)
train_set_two.targets[train_set_two.targets > 1] = 1
test_set_two.targets[test_set_two.targets > 1] = 1

# prepare data loaders (combine dataset and sampler)
train_loader = torch.utils.data.DataLoader(train_set_two, batch_size=batch_size,
    sampler=train_sampler, num_workers=num_workers)

After this chunk of code, I visualized the batches from my train_loader, and they still have 6 classes instead of 2. Any idea?

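Oh, right. Internally, ImageFolder reads each (path, label) pair from dataset.samples when an item is requested, so changing dataset.targets alone does not affect what the loader returns. You could try the following code: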

train_set_two.samples = [(d, 1) if s > 1 else (d, s) for d, s in train_set_two.samples]
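Since the labels returned by the loader come from samples while targets is a separate list, it may be worth updating both so they stay consistent. Here is a minimal sketch on plain Python lists standing in for an ImageFolder's samples and targets; the file names and the helper merge_labels_above are made up for illustration.

```python
def merge_labels_above(samples, threshold, new_label):
    """Return (path, label) pairs with every label > threshold replaced."""
    return [
        (path, new_label if label > threshold else label)
        for path, label in samples
    ]


# Hypothetical (path, label) pairs in the format ImageFolder stores them.
samples = [("a.jpg", 0), ("b.jpg", 1), ("c.jpg", 3), ("d.jpg", 5)]

new_samples = merge_labels_above(samples, threshold=1, new_label=1)
# Rebuild the label list from the remapped pairs so both stay in sync.
new_targets = [label for _, label in new_samples]
print(new_targets)  # [0, 1, 1, 1]
```

On a real dataset this would correspond to assigning train_set_two.samples = new_samples and train_set_two.targets = new_targets before creating the DataLoader.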

Hi. I have a custom dataset stored in a file with the .pth extension. How do I select the labels from the dataset?

How did you store the dataset and what does the pth file contain?

Hi, this was really useful to me for changing the labels in EMNIST from 1-26 to 0-25.

However, do you have a link to .targets in the PyTorch documentation? The word ‘target’ appears so often there that I can’t search for it successfully.

I assume you are referring to the first code snippet using the MNIST dataset?
If so, you can find the targets definition here.

Let me know if I misunderstood the question.
