Hi guys,
I’m loading my image dataset (3 classes) with datasets.ImageFolder(). Unfortunately, the labels are then integers: 0 for the first class, 1 for the second and 2 for the last. I’d like the labels to be one-hot encoded instead. Is there a way I can do this with the target_transform argument when loading the data?
I’ve tried nn.functional.one_hot(), for example, but that function needs an input and I don’t know what to pass it.
I also tried torch.eye(3)[i], where i is the label (0, 1 or 2), but again, I need the target as an input.
Is there a way to do this?
Thank you
My provisional fix:
Example training loop:
for epoch in range(EPOCHS):
    for i, (images, labels) in tqdm(enumerate(train_dl)):
        images = images.to(device)
        labels = torch.eye(3)[labels].to(device)
        preds = model(images)
        loss = loss_func(preds, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
During every iteration, the integer labels are converted to one-hot vectors just before they are pushed to the device (‘cuda’, ‘cpu’, etc.):
labels = torch.eye(3)[labels].to(device)
torch.eye(3)
'''output:
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
'''
torch.eye(3)[0]
'''output:
tensor([1., 0., 0.])
'''
So when the label is 0 the one hot vector is [1, 0, 0], for 1 it is [0, 1, 0] and so on.
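The same indexing works on a whole batch of labels at once. Here is a minimal sketch, assuming the labels arrive as a 1-D integer tensor (the values below are made up). It also compares the result with torch.nn.functional.one_hot, which returns integers and so needs a .float() cast to match:

```python
import torch
import torch.nn.functional as F

NUM_CLASSES = 3  # assumed, as in the question

# Hypothetical batch of integer labels, like the ones a DataLoader would yield
labels = torch.tensor([0, 2, 1, 0])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with a batch of labels picks one row per label.
one_hot_eye = torch.eye(NUM_CLASSES)[labels]

# F.one_hot produces the same encoding as int64; cast to float to compare.
one_hot_fn = F.one_hot(labels, num_classes=NUM_CLASSES).float()

print(one_hot_eye.shape)                     # torch.Size([4, 3])
print(torch.equal(one_hot_eye, one_hot_fn))  # True
```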
Still, I’d like a solution that uses the target_transform argument though!
According to the docs target_transform should be callable. Assuming you know the number of classes in advance, something like this should work:
from typing import List

from torchvision.datasets import ImageFolder

def target_to_oh(target: int) -> List[int]:
    NUM_CLASS = 5  # hard-coded here; functools.partial could supply it instead
    one_hot = [0] * NUM_CLASS
    one_hot[target] = 1
    return one_hot

# Use here
ds = ImageFolder('some/path', target_transform=target_to_oh)
I haven’t tried it, so it may contain errors, but conceptually it should work.
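The "can do partial" remark could look roughly like this. It is only a sketch: the num_classes parameter and the to_oh_3 name are made up for illustration, and the number of classes is assumed to be known up front.

```python
from functools import partial
from typing import List


def target_to_oh(target: int, num_classes: int) -> List[int]:
    """Return a one-hot list with a 1 at position `target`."""
    one_hot = [0] * num_classes
    one_hot[target] = 1
    return one_hot


# Bind the class count once; the result takes only the target,
# which is the call shape target_transform expects.
to_oh_3 = partial(target_to_oh, num_classes=3)

print(to_oh_3(1))  # [0, 1, 0]
```

to_oh_3 can then be passed as target_transform. Note that this version returns a plain Python list; returning a tensor instead tends to batch more predictably in a DataLoader.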
Whether you encode them one-hot style (pun intended) or not, it’ll still give you the same or similar results.
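A quick way to check that claim, assuming the loss is cross-entropy (the thread doesn’t say what loss_func is) and a PyTorch version whose cross_entropy accepts probability-style targets (1.10 or later, as far as I know). The logits and labels are made up:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)           # made-up model outputs: 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 0])  # integer class labels
one_hot = torch.eye(3)[labels]       # the same labels, one-hot encoded as floats

loss_int = F.cross_entropy(logits, labels)   # class-index targets
loss_oh = F.cross_entropy(logits, one_hot)   # class-probability targets

print(torch.allclose(loss_int, loss_oh))  # True
```

Other losses (MSE on the outputs, for example) do need the one-hot form, so this only covers the cross-entropy case.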
Still, let me tell you what I usually do:
I wrote a small block of code that iterates through my dataset and assigns each image a one-hot label, np.eye(3)[idx], where idx is given by the directory the image belongs to.
Then I save the entire dataset as a .npy file. After loading it, I convert the NumPy arrays to tensors (you can also make them tensors from the start though).
I would have posted the sample code here, but I’m currently out and not at my PC, so you’ll have to wait till evening (about 5 hrs from now).
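Until that code is posted, here is a rough sketch of the idea for the labels only. It is not the poster’s code: the class indices, the labels.npy file name and the temporary directory are all made up for illustration.

```python
import os
import tempfile

import numpy as np
import torch

NUM_CLASSES = 3

# Hypothetical: index of the class directory each image came from
class_indices = np.array([0, 2, 1, 0])

# Row idx of the identity matrix is the one-hot label for class idx
one_hot_labels = np.eye(NUM_CLASSES, dtype=np.float32)[class_indices]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "labels.npy")
    np.save(path, one_hot_labels)  # store the labels once
    loaded = np.load(path)         # later: read them back into memory

# Convert the NumPy array to a tensor for training
label_tensor = torch.from_numpy(loaded)
print(label_tensor.shape)  # torch.Size([4, 3])
```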
Thank you, @Alexey_Demyanchuk
I got an error when trying your code, but after changing it a bit it worked like this:
def target_to_oh(target):
    NUM_CLASS = 3  # hard code here, can do partial
    one_hot = torch.eye(NUM_CLASS)[target]
    return one_hot
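To see what this gives per batch without needing an image folder on disk, here is a small sketch. ToyDataset is a made-up stand-in that applies target_transform in its __getitem__ the way a torchvision-style dataset is expected to; the images are just zeros.

```python
import torch
from torch.utils.data import DataLoader, Dataset

NUM_CLASS = 3


def target_to_oh(target):
    # Same transform as above: pick row `target` of the identity matrix
    return torch.eye(NUM_CLASS)[target]


class ToyDataset(Dataset):
    """Hypothetical stand-in for ImageFolder: dummy images, integer labels."""

    def __init__(self, targets, target_transform=None):
        self.targets = targets
        self.target_transform = target_transform

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, index):
        image = torch.zeros(3, 8, 8)  # placeholder image tensor
        target = self.targets[index]
        if self.target_transform is not None:
            target = self.target_transform(target)
        return image, target


ds = ToyDataset([0, 1, 2, 1], target_transform=target_to_oh)
images, labels = next(iter(DataLoader(ds, batch_size=4)))

print(labels.shape)  # torch.Size([4, 3])
```

Because the transform returns a tensor, the default collate function stacks the per-sample vectors into one (batch, 3) tensor, so the training loop no longer needs the torch.eye(3)[labels] line.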