You can use the scatter_ method to achieve this.
I would also advise creating the y_onehot tensor once and then just refilling it:
batch_size = 5
nb_digits = 10
# Dummy input that HAS to be 2D for the scatter (you can use view(-1,1) if needed)
y = torch.LongTensor(batch_size,1).random_() % nb_digits
# One hot encoding buffer that you create out of the loop and just keep reusing
y_onehot = torch.FloatTensor(batch_size, nb_digits)
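# In your for loop: clear the buffer first (it is uninitialized / holds the previous batch), then set the ones
y_onehot.zero_()
y_onehot.scatter_(1, y, 1)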
@Nadav_Bhonker we’re working on adding more and more support for sparse operations, but our fastest RNN backend (i.e. cuDNN) doesn’t support sparse inputs anyway. I’d recommend using Embedding for that.
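For anyone wondering what the Embedding route looks like in practice, here is a minimal sketch. The sizes and the choice of an LSTM are my own assumptions for illustration, not something from the posts above; the point is only that the RNN receives dense vectors looked up from integer indices, so no one-hot tensor is ever built.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration
vocab_size, emb_dim, hidden_size = 100, 16, 32
batch_size, seq_len = 4, 7

# Embedding maps integer indices directly to dense vectors
emb = nn.Embedding(vocab_size, emb_dim)
rnn = nn.LSTM(emb_dim, hidden_size, batch_first=True)

# Integer token indices, shape (batch_size, seq_len), dtype int64
tokens = torch.randint(0, vocab_size, (batch_size, seq_len))

dense = emb(tokens)          # shape (batch_size, seq_len, emb_dim)
out, (h, c) = rnn(dense)     # out has shape (batch_size, seq_len, hidden_size)
```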
Also, for future readers, just to reiterate what user moskomule says: the cross-entropy and negative log-likelihood losses in PyTorch do NOT require one-hot encodings, so you can just use the normal target vector of class indices.
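To make that concrete, here is a small sketch with made-up shapes (5 samples, 10 classes, random logits). The target is a plain vector of class indices, and both losses accept it as is.

```python
import torch
import torch.nn.functional as F

# Dummy data for illustration: 5 samples, 10 classes
logits = torch.randn(5, 10)
target = torch.randint(0, 10, (5,))   # class indices, dtype int64, no one-hot

# Cross entropy takes raw logits and class indices
loss_ce = F.cross_entropy(logits, target)

# NLL loss takes log-probabilities and the same class indices
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
```

The two values agree, since cross entropy on logits is log-softmax followed by NLL.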
@rajarsheem that’s not a very good idea if your vector dimensionality is large. You’ll end up storing a huge weight matrix in memory, and in your code emb.weight requires gradient and it might be subject to optimization if you don’t take care.
Additionally, zeroing the buffer and scattering a few ones will be much faster than copying whole rows, most of whose values are 0 anyway.
Hi @albanD and @apaszke, I was trying to use the scatter function, but I am running into some trouble.
In my case I have something like this:
y = torch.LongTensor(batch_size,5,5).random_() % 3  # 3 classes, 5x5 img
y_onehot = torch.FloatTensor(batch_size,3,5,5)  # I want the one-hot going along the channel dim
However, it gives me the following error: Index tensor must have same dimensions as output tensor at /data/users/soumith/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:450
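The error message is about the number of dimensions: the index passed to scatter_ has to have as many dimensions as the tensor being filled. One way to get there, sketched below with the shapes from the question (batch_size = 4 is my own placeholder), is to give y a singleton channel dimension with unsqueeze(1) before scattering along dim 1.

```python
import torch

batch_size, nb_classes = 4, 3   # batch_size is a placeholder value

# Labels for a 5x5 image, shape (batch_size, 5, 5), dtype int64
y = torch.randint(0, nb_classes, (batch_size, 5, 5))

# Zero-filled output buffer with the classes along the channel dim
y_onehot = torch.zeros(batch_size, nb_classes, 5, 5)

# Index needs the same number of dims as the output: (batch_size, 1, 5, 5)
y_onehot.scatter_(1, y.unsqueeze(1), 1)
```

Each pixel then has exactly one channel set to 1, the one matching its class.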
There is no reason for y to be a Variable here.
And since y is used to index in a tensor, it needs to have a proper indexing type: LongTensor.
So if you use cuda, y should be a cuda.LongTensor, not a cuda.FloatTensor.
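As a small sketch of that last point: if the labels arrive as floats, converting them with .long() keeps them on the same device and gives scatter_ the index type it expects. The label values below are made up, and the code falls back to CPU when no GPU is available.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical labels that arrived as a float tensor
y_float = torch.tensor([[1.0], [0.0], [2.0]], device=device)

# Convert to an int64 index tensor; it stays on the same device
y = y_float.long()

# Buffer on the same device as the index
y_onehot = torch.zeros(3, 3, device=device)
y_onehot.scatter_(1, y, 1)
```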