Convert int into one-hot format


Hi all.

I’m trying to convert the y labels in mnist data into one-hot format.

Since I’m not quite familiar with PyTorch yet, in each iteration I just convert y to a NumPy array, build the one-hot array, and then convert it back to a PyTorch tensor, like this:

for batch_idx, (x, y) in enumerate(train_loader):
    y_onehot = y.numpy()
    y_onehot = (np.arange(num_labels) == y_onehot[:,None]).astype(np.float32)
    y_onehot = torch.from_numpy(y_onehot)

However, I notice that it gets slower with each iteration, and I suspect the cause is this code, which may allocate new memory on every iteration.

So my question is: is there a more idiomatic PyTorch way that would let me avoid this conversion?


(Moskomule) #2

Hi, it depends on your loss function, but some of PyTorch’s loss functions take class labels as their targets (e.g. NLLLoss). So if you use them, you don’t need to convert targets into one-hot vectors.
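
As a minimal sketch of that point, the snippet below passes integer class labels directly to two loss functions. The tensor sizes and the random `logits` are illustrative assumptions standing in for real model outputs, not code from the original poster.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, num_classes = 4, 10  # illustrative sizes (assumption)

logits = torch.randn(batch_size, num_classes)  # stand-in for model outputs
targets = torch.tensor([3, 0, 7, 1])           # integer class labels (int64), no one-hot

# CrossEntropyLoss applies log-softmax internally and takes class indices as targets.
ce_loss = nn.CrossEntropyLoss()(logits, targets)

# NLLLoss expects log-probabilities as input; the targets are still class indices.
nll_loss = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)
```

Both calls compute the same quantity here, so one-hot targets are only needed when the labels are used for something other than these losses, such as a network input.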


Thanks for the advice. I have seen that solution in the examples branch. However, I use y as an input to the network, so it doesn’t solve my case.

(Alban D) #4


You can use the scatter_ method to achieve this.
I would also advise creating the y_onehot tensor once and then just filling it:

import torch

batch_size = 5
nb_digits = 10
# Dummy input that HAS to be 2D for the scatter (you can use view(-1,1) if needed)
y = torch.LongTensor(batch_size,1).random_() % nb_digits
# One hot encoding buffer that you create out of the loop and just keep reusing
y_onehot = torch.FloatTensor(batch_size, nb_digits)

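# In your for loop
# Clear the buffer first: FloatTensor(...) is uninitialized, and later it holds the previous batch
y_onehot.zero_()
y_onehot.scatter_(1, y, 1)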
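
To make the "create once, fill in the loop" idea concrete, here is a rough sketch of how it could look inside a training loop. The `batches` list is a hypothetical stand-in for a data loader, and the comparison against `torch.nn.functional.one_hot` (available in newer PyTorch releases) is only there as a sanity check.

```python
import torch
import torch.nn.functional as F

batch_size, nb_digits = 5, 10

# Hypothetical stand-in for a data loader: three batches of integer labels.
batches = [torch.randint(0, nb_digits, (batch_size,)) for _ in range(3)]

# Buffer allocated once, outside the loop.
y_onehot = torch.zeros(batch_size, nb_digits)

for y in batches:
    y_onehot.zero_()                        # clear the previous batch
    y_onehot.scatter_(1, y.view(-1, 1), 1)  # index must be 2D and of dtype int64

    # Newer PyTorch releases offer a one-liner that allocates a fresh tensor each call.
    reference = F.one_hot(y, num_classes=nb_digits).float()
    assert torch.equal(y_onehot, reference)
```

The reused buffer avoids a new allocation per iteration; the `one_hot` call is shorter but returns a new tensor every time.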



Thanks, that is exactly what I need!

(Nadav) #6

Isn’t there a more efficient way to input a “sparse Tensor” or a vector of indices into the network (specifically RNNs)?
I guess something similar to torch’s sparse linear (only for RNNs).


Yes, I was using the one_hot_encoding layer in TensorFlow, and it seems there is currently no equivalent in PyTorch.

(Adam Paszke) #8

@Nadav_Bhonker we’re working on adding more and more support for sparse operations, but our fastest RNN backend (i.e. cuDNN) doesn’t support sparse inputs anyway. I’d recommend using Embedding for that.
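
For readers wondering what the Embedding suggestion looks like in practice, here is a small sketch that feeds a batch of integer indices through `nn.Embedding` and then an LSTM. The vocabulary, embedding, and hidden sizes are arbitrary assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 10, 8, 16  # illustrative sizes (assumption)

embedding = nn.Embedding(vocab_size, embed_dim)
rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

# A batch of 2 sequences of length 3, given directly as integer indices.
indices = torch.tensor([[1, 2, 3], [4, 5, 6]])

embedded = embedding(indices)       # dense vectors, shape (2, 3, embed_dim)
output, (h_n, c_n) = rnn(embedded)  # output shape (2, 3, hidden_dim)
```

The network never sees a one-hot vector: each index selects one row of the embedding matrix, which has the same effect as multiplying a one-hot vector by that matrix.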

(Nick) #9

Just a note (from my understanding; maybe it doesn’t apply in this case): it is currently advised NOT to follow this approach of creating the variable once and filling it each time (see How to use Batch normalization in testing model).

Also, for future readers, just to reiterate what user moskomule says: the cross-entropy and negative log-likelihood losses in PyTorch do NOT require one-hot encodings, so you can just use the normal target vector.

(Adam Paszke) #10

@ncullen93 the whole thread was about converting tensors I think, so it doesn’t apply there. But both of your statements are correct and should be followed. Thanks :)

(Rajarshee Mitra) #11

How about overriding the default nn.Embedding weights data with torch.eye ?

emb = nn.Embedding(10, 10)
emb.weight.data = torch.eye(10)

Done! Now, pass your batch containing indices to it.
emb(Variable(torch.LongTensor([[1, 2], [3, 4]])))
will give output as:
Variable containing:
(0 ,.,.) =
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0

(1 ,.,.) =
0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0

(Adam Paszke) #12

@rajarsheem that’s not a very good idea if your vector dimensionality is large. You’ll end up storing a huge weight matrix in memory, and in your code emb.weight requires gradients, so it may be updated by the optimizer if you don’t take care.

Additionally, zeroing a tensor and scattering a few ones into it will be much faster than copying whole rows, of which most values are 0 anyway.
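
The two concerns above can be illustrated with a short sketch. It assumes a PyTorch version that provides `nn.Embedding.from_pretrained`; the class count and index values are arbitrary and only for illustration.

```python
import torch
import torch.nn as nn

num_classes = 10
indices = torch.tensor([[1, 2], [3, 4]])

# A freshly constructed Embedding has a trainable weight by default,
# so an optimizer given its parameters would update it.
trainable = nn.Embedding(num_classes, num_classes)

# If the identity-lookup approach is used anyway, freezing the weight
# keeps it out of gradient computation.
frozen = nn.Embedding.from_pretrained(torch.eye(num_classes), freeze=True)
via_embedding = frozen(indices)  # shape (2, 2, 10)

# The zero + scatter alternative needs no num_classes x num_classes matrix.
via_scatter = torch.zeros(*indices.shape, num_classes)
via_scatter.scatter_(-1, indices.unsqueeze(-1), 1)
```

Both routes produce the same one-hot tensor; the scatter version only stores the output itself, while the embedding version also keeps the full identity matrix in memory.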

(Roger) #13

Hi @albanD and @apaszke, I was trying to use the scatter function, but I am running into some trouble.
In my case I have something like this:

y = torch.LongTensor(batch_size, 5, 5).random_() % 3  # 3 classes, 5x5 image
y_onehot = torch.FloatTensor(batch_size, 3, 5, 5)  # I want the one-hot encoding along the channel dim


However, it gives me the following error:
Index tensor must have same dimensions as output tensor at /data/users/soumith/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:450

Could you help me with this? Thanks!

(Roger) #14

Oh, never mind. I just found that it works if I add a singleton dimension so that y and y_onehot have the same NUMBER of dimensions…
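
For anyone hitting the same error, a minimal sketch of that fix might look like the following. It uses `torch.randint` and `torch.zeros` instead of the older constructors, and the batch size is an arbitrary assumption.

```python
import torch

batch_size, num_classes = 4, 3  # batch size is an illustrative assumption

y = torch.randint(0, num_classes, (batch_size, 5, 5))  # labels, shape (N, H, W)
y_onehot = torch.zeros(batch_size, num_classes, 5, 5)  # target, shape (N, C, H, W)

# unsqueeze(1) adds the singleton channel dimension, giving the index shape (N, 1, H, W),
# so the index and the output have the same number of dimensions.
y_onehot.scatter_(1, y.unsqueeze(1), 1)
```

Taking `y_onehot.argmax(dim=1)` afterwards recovers the original label map, which is a quick way to check that the ones landed in the right channel.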

(曾宏伟) #15

It seems that y can’t be a Variable or a cuda.FloatTensor.
How do I solve this TypeError?

(Alban D) #16

There is no reason for y to be a Variable here.
And since y is used to index in a tensor, it needs to have a proper indexing type: LongTensor.
So if you use cuda, y should be a cuda.LongTensor, not a cuda.FloatTensor.

(曾宏伟) #17

Sorry, I wrote that wrong: it doesn’t work with a cuda.LongTensor either.

(Alban D) #18

Is y_onehot a CUDA tensor?

(曾宏伟) #19

Thank you, that was my mistake. I overlooked that y_onehot was not a CUDA tensor.
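
To summarize the resolution of this exchange: the index tensor must be an integer (long) tensor, and it must live on the same device as the one-hot buffer. A small sketch, assuming a PyTorch version with `torch.device` and falling back to the CPU when no GPU is present:

```python
import torch

# Use the GPU when one is available; otherwise the same code runs on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

batch_size, nb_digits = 5, 10

# Index tensor: dtype int64 (long), created on the chosen device.
y = torch.randint(0, nb_digits, (batch_size, 1), device=device)

# One-hot buffer: float tensor created on the SAME device as the index.
y_onehot = torch.zeros(batch_size, nb_digits, device=device)
y_onehot.scatter_(1, y, 1)
```

Creating both tensors with the same `device` argument avoids the mismatch that caused the TypeError discussed above.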

(Praveen) #21

Thank you! I got my CVAE implemented with this tip.