I’m trying to convert the y labels in the MNIST data into one-hot format.

Since I’m not very familiar with PyTorch yet, in each iteration I just convert y to a NumPy array, build the one-hot encoding, and then convert it back to a PyTorch tensor, like this:

for batch_idx, (x, y) in enumerate(train_loader):
    y_onehot = y.numpy()
    y_onehot = (np.arange(num_labels) == y_onehot[:,None]).astype(np.float32)
    y_onehot = torch.from_numpy(y_onehot)

However, I notice that it gets slower with each iteration, and I suspect the cause is this code, which might allocate new memory on every iteration.

So my question is, is there a more PyTorch way, which may help me avoid such conversion?

Hi, it depends on your loss function, but some of PyTorch’s loss functions take class labels as their targets (e.g. NLLLoss). So if you use them, you don’t need to convert targets into one-hot vectors.
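As a minimal sketch of the point above (the logits and targets here are made-up stand-ins, not taken from the original question), both NLLLoss and CrossEntropyLoss accept a vector of integer class indices directly:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, num_classes = 4, 10

# Hypothetical stand-in for a model's raw outputs (logits).
logits = torch.randn(batch_size, num_classes)
# Targets are plain class indices (dtype int64), not one-hot rows.
targets = torch.tensor([3, 0, 9, 1])

# CrossEntropyLoss applies log-softmax internally, so it takes raw logits.
ce = nn.CrossEntropyLoss()(logits, targets)
# NLLLoss expects log-probabilities, so log-softmax is applied by hand.
nll = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)
```

With the default settings the two losses should agree, since CrossEntropyLoss is log-softmax followed by NLLLoss.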

You can use the scatter_ method to achieve this.
I would also advise creating the y_onehot tensor once and then just filling it:

import torch
batch_size = 5
nb_digits = 10
# Dummy input that HAS to be 2D for the scatter (you can use view(-1,1) if needed)
y = torch.LongTensor(batch_size,1).random_() % nb_digits
# One hot encoding buffer that you create out of the loop and just keep reusing
y_onehot = torch.FloatTensor(batch_size, nb_digits)
# In your for loop
y_onehot.zero_()
y_onehot.scatter_(1, y, 1)
print(y)
print(y_onehot)
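For readers on a more recent PyTorch release, a possible alternative is torch.nn.functional.one_hot, which is not what the snippet above uses and allocates a new tensor on every call instead of reusing a buffer. A small sketch with made-up labels, checked against the scatter_ approach:

```python
import torch
import torch.nn.functional as F

nb_digits = 10
# Made-up class indices; one_hot expects an int64 (Long) tensor.
y = torch.tensor([2, 0, 9, 4, 4])

# one_hot returns an int64 tensor, so cast it if floats are needed.
y_onehot = F.one_hot(y, num_classes=nb_digits).float()

# Same result built with zeros + scatter_, for comparison.
reference = torch.zeros(y.size(0), nb_digits).scatter_(1, y.view(-1, 1), 1.0)
```

If allocation cost inside the training loop matters, the preallocated buffer with zero_() and scatter_() remains the option described in this thread.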

Isn’t there a more efficient way to input a “sparse Tensor” or a vector of indices into the network (specifically RNNs)?
I guess something similar to torch’s sparse linear (only for RNNs).

@Nadav_Bhonker we’re working on adding more and more support for sparse operations, but our fastest RNN backend (i.e. cuDNN) doesn’t support sparse inputs anyway. I’d recommend using Embedding for that.
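A rough sketch of the Embedding suggestion, assuming an LSTM and arbitrary illustrative sizes (none of these names or dimensions come from the thread): the network receives integer indices, and the embedding layer looks up dense vectors, so no one-hot tensor is ever built.

```python
import torch
import torch.nn as nn

# Illustrative sizes only.
vocab_size, emb_dim, hidden_size = 10, 8, 16
seq_len, batch_size = 6, 4

embed = nn.Embedding(vocab_size, emb_dim)
# By default nn.LSTM expects input of shape (seq_len, batch, emb_dim).
rnn = nn.LSTM(emb_dim, hidden_size)

# Integer indices (int64) are fed directly; no one-hot encoding is needed.
tokens = torch.randint(0, vocab_size, (seq_len, batch_size))
out, (h, c) = rnn(embed(tokens))
```

Looking up row i of the embedding matrix gives the same result as multiplying a one-hot vector for i by that matrix, without materializing the mostly-zero input.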

Just a note (from my understanding; maybe it doesn’t apply in this case): it is currently advised NOT to follow this approach of creating the variable once and filling it each time (see How to use Batch normalization in testing model).

Also, for future readers, just to reiterate what user moskomule says: the cross entropy and negative log-likelihood losses in PyTorch do NOT require one-hot encodings, so you can just use the normal target vector.

@ncullen93 the whole thread was about converting tensors, I think, so it doesn’t apply there. But both of your statements are correct and should be followed. Thanks

@rajarsheem that’s not a very good idea if your vector dimensionality is large. You’ll end up storing a huge weight matrix in memory, and in your code emb.weight requires gradient and it might be subject to optimization if you don’t take care.

Additionally, zeroing the buffer and scattering a few ones will be much faster than copying whole rows, most of whose values are 0 anyway.

Hi @albanD and @apaszke, I was trying to use the scatter function, but I am running into some trouble.
In my case I have something like this:

batch_size=10
y = torch.LongTensor(batch_size,5,5).random_() % 3#3 classes,5x5 img
y_onehot = torch.FloatTensor(batch_size,3, 5,5)#I want the one hot going through the chans dim
y_onehot.zero_()
ones=torch.ones(y.size())
y_onehot.scatter_(1,y,ones)

However, it gives me the following error: Index tensor must have same dimensions as output tensor at /data/users/soumith/builder/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:450
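One likely cause, offered here as an assumption rather than a confirmed diagnosis: scatter_ needs the index tensor to have the same number of dimensions as the output, and y is 3D while y_onehot is 4D. A sketch of a possible fix is to add a singleton channel dimension to y before scattering:

```python
import torch

batch_size, num_classes = 10, 3
# Label map of shape (N, H, W) holding class indices in [0, num_classes).
y = torch.randint(0, num_classes, (batch_size, 5, 5))

# One-hot buffer of shape (N, C, H, W), already zeroed.
y_onehot = torch.zeros(batch_size, num_classes, 5, 5)

# unsqueeze(1) turns y into (N, 1, H, W), matching the output's dimensionality,
# so each pixel writes a single 1 along the channel dimension.
y_onehot.scatter_(1, y.unsqueeze(1), 1.0)
```

Passing the scalar 1.0 as the value also avoids having to build a separate tensor of ones.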

There is no reason for y to be a Variable here.
And since y is used to index in a tensor, it needs to have a proper indexing type: LongTensor.
So if you use cuda, y should be a cuda.LongTensor, not a cuda.FloatTensor.
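A small sketch of the dtype point, using made-up float labels and falling back to the CPU when no GPU is present: casting the labels with .long() gives an index tensor that scatter_ accepts on either device.

```python
import torch

# Use the GPU when available; the same code runs on the CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical labels that arrived as floats.
y_float = torch.tensor([1.0, 0.0, 2.0], device=device)

# Cast to int64 (LongTensor) and make it 2D so it can be used as a scatter index.
y = y_float.long().view(-1, 1)

# The buffer lives on the same device as the index tensor.
y_onehot = torch.zeros(3, 3, device=device).scatter_(1, y, 1.0)
```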