Convert int into one-hot format

Hi Ptrblck,

my labels are between 0 and 100. I use this code to map them onto 100 dimensions. Real_volume contains 64 labels which are floats. Do you think it is right?

bb = np.arange(0, 100)
onehot = torch.zeros(100, 100)
onehot = onehot.scatter_(1, torch.LongTensor(bb.tolist()).view(100, 1), 1).view(100, 100, 1, 1)
Fakelabel = onehot[Real_volume].to(device)

If I want to map them onto 10 dimensions, how is that possible? I made some changes, but got an error like this:
bb = np.arange(0, 100)
onehot = torch.zeros(100, 10)
onehot = onehot.scatter_(1, torch.LongTensor(bb.tolist()).view(100, 1), 1).view(100, 10, 1, 1)
Fakelabel = onehot[Real_volume].to(device)

The first code looks good and will map the bb labels to dim1 in onehot.
You won’t be able to map 100 labels to a dimension of size 10, since by definition, 100 different labels are mapped to a one-hot encoded dimension of the same size.
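As a side note (my own sketch, not part of the original answer), the scatter-based lookup table can also be replaced by calling F.one_hot directly on the integer labels; the shapes below are made up to match the snippet above:

```python
import torch
import torch.nn.functional as F

# hypothetical batch of 64 integer labels in [0, 99]
real_volume = torch.randint(0, 100, (64,))

# one-hot encode directly instead of building a lookup table with scatter_
onehot = F.one_hot(real_volume, num_classes=100).float()  # shape: [64, 100]

# add trailing singleton dims to mimic the [*, 100, 1, 1] layout above
fake_label = onehot.view(64, 100, 1, 1)
print(fake_label.shape)  # torch.Size([64, 100, 1, 1])
```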

Sorry, in bb, if we define more labels, does it cause a problem? For example, in each iteration I have 64 labels which can be any number between 2 and 89, but to be safe I define bb between 0 and 99. Or should bb be the exact number of labels?

Try to map the labels to the range [0, nb_classes-1] instead of arbitrary ranges such as 64 values between 2 and 89. The latter approach would assume that you are dealing with 90 classes instead of 64, and might cause problems.
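As a minimal sketch of such a remapping (my own example, with made-up label values), torch.unique with return_inverse=True maps arbitrary label values to a contiguous [0, nb_classes-1] range:

```python
import torch

# hypothetical labels scattered across [2, 89] instead of a contiguous range
labels = torch.tensor([2, 17, 89, 17, 42])

# remap to [0, nb_classes-1] based on the sorted unique values
unique_vals, remapped = labels.unique(return_inverse=True)
print(unique_vals)  # tensor([ 2, 17, 42, 89])
print(remapped)     # tensor([0, 1, 3, 1, 2])
```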

But this one doesn’t seem to let you specify the dimension. Say a tensor is [128, 20, 1] and the total number of classes is 10; I want to get [128, 20, 10]. It would be nice if we could specify the number of classes and the dimension.

Can someone please provide a clear example of how torch.nn.functional.one_hot can be used in instances when we have a batch of tensors? I.e. our tensor looks like this (N, W, H), where N is the batch size.

Try

import torch.nn.functional as F
batch = F.one_hot(batch.long(), num_classes=n_classes)
batch = batch.permute(0, 3, 1, 2).float()

It gives you a one-hot tensor of shape (batch_size, num_classes, W, H).
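For concreteness, here is a minimal runnable sketch of that approach (the shapes and class count below are made up for illustration):

```python
import torch
import torch.nn.functional as F

n_classes = 4
# hypothetical batch of integer label maps, shape (N, W, H)
batch = torch.randint(0, n_classes, (2, 8, 8))

onehot = F.one_hot(batch.long(), num_classes=n_classes)  # (2, 8, 8, 4)
onehot = onehot.permute(0, 3, 1, 2).float()              # (2, 4, 8, 8)
print(onehot.shape)  # torch.Size([2, 4, 8, 8])
```

Each spatial position then holds exactly one 1 along the class dimension.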


Hmm, thanks for the suggestion, but that doesn’t produce the desired results. I get an error instead: RuntimeError: Class values must be smaller than num_classes.

Any thoughts on what could be wrong? My n_classes is 1.

Thanks!

This error is raised if the num_classes argument in F.one_hot is smaller than the number of different classes in the input tensor.
Here is a small example:

x = torch.arange(5)
print(x)
> tensor([0, 1, 2, 3, 4])

y = F.one_hot(x, num_classes=5)
print(y)
> tensor([[1, 0, 0, 0, 0],
          [0, 1, 0, 0, 0],
          [0, 0, 1, 0, 0],
          [0, 0, 0, 1, 0],
          [0, 0, 0, 0, 1]])

F.one_hot(x, num_classes=4) # error
> RuntimeError: Class values must be smaller than num_classes.

Thanks @ptrblck. I understand the problem but what should we be passing to F.one_hot when our number of classes in the input tensor is 1? F.one_hot(x, num_classes=0) doesn’t work :frowning:

If you are only dealing with a single class, you could use this:

x = torch.zeros(5).long()
print(x)
> tensor([0, 0, 0, 0, 0])

y = F.one_hot(x, num_classes=1)
print(y)
> tensor([[1],
          [1],
          [1],
          [1],
          [1]])

Note that I don’t think your use case is well defined. If you are dealing with only a single class (class0), your model won’t be able to learn anything, as the only possible and correct answer would be to predict a high probability for class0.
A simple method such as:

def classify(input):
    return 0

would yield 100% accuracy.

Very nice answer. It helps. This is the most elegant answer.

@ptrblck I think your statement above is incorrect (“This error is raised if the num_classes argument in F.one_hot is smaller than the number of different classes in the input tensor”). The error seems to be raised if a coefficient value is bigger than the expected class count, which IMO doesn’t make sense.
E.g. in segmentation, coefficients are pixel values (1-255) and models often have fewer than 255 classes.

For example, this fails with the error “RuntimeError: Class values must be smaller than num_classes.”:

x = torch.tensor([1, 50, 50, 2, 1])

y = F.one_hot(x, num_classes=3)

?

In PyTorch, how to 1-hot an arbitrary tensor that has N distinct coefficients, some coefficients possibly bigger than the class count? (my use-case is to 1-hot a 1-255 grayscale tensor with 55 shades of grey into a 1-hot tensor [55, h, w] to feed into a DeepLab model)

Your use case fits exactly my statement, i.e. the num_classes argument is set to a smaller value than the class indices in the input tensor. Given that x contains a max class index of 50, you would be dealing with at least 51 classes ([0, 50]) and would thus need to use:

x = torch.tensor([1, 50, 50, 2, 1])
y = F.one_hot(x, num_classes=51)

If you want to use 3 classes only (the unique values in x), map them to [0, 2].

I see thanks!
What is the most PyTorchic way to do that conversion/normalization from a tensor with N arbitrary unique values to a tensor with N values in [0, N-1]?

I would probably use the inverse indices of torch.unique:

x = torch.tensor([1, 50, 50, 2, 1])
u, idx = x.unique(return_inverse=True)
print(idx)
> tensor([0, 2, 2, 1, 0])

If you have a particular order in mind, you could use a manual mapping instead.
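Such a manual mapping could be sketched like this (the ordering below is made up purely for illustration):

```python
import torch

x = torch.tensor([1, 50, 50, 2, 1])

# hypothetical custom ordering: 50 -> 0, 1 -> 1, 2 -> 2
mapping = {50: 0, 1: 1, 2: 2}
idx = torch.tensor([mapping[v.item()] for v in x])
print(idx)  # tensor([1, 0, 0, 2, 1])
```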


Lovely, thanks a lot, I will test that!