How to use word embeddings with the optimizer?

Hi Friends!

I’m building an RNN for language generation similar to the classic Shakespeare RNN text generator.

However, I’d like to work at the word level. One-hot encoding doesn’t work because my vocabulary is large (even after cutting out low-frequency terms) and the sequences are long, so the computation graph really adds up over the course of a sequence. My GPU can’t handle one-hot vectors for a 10,000+ word vocabulary.

I’d like to use nn.Embedding, but I can’t figure out how to configure it with the optimizer. Right now, the optimizer expects a LongTensor with a target; how do I adjust it to expect an embedded word vector of, say, dimension 1 x 100?
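For reference, here is a rough sketch of the direction I'm trying to go in (the layer sizes and names below are just placeholders, not my actual code):

```python
import torch.nn as nn

class WordRNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=256):
        super().__init__()
        # Embedding layer replaces the huge one-hot input vectors
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_indices, hidden=None):
        # word_indices: LongTensor of word IDs, shape (batch, seq_len)
        embedded = self.embed(word_indices)          # (batch, seq_len, embed_dim)
        output, hidden = self.rnn(embedded, hidden)  # (batch, seq_len, hidden_dim)
        logits = self.fc(output)                     # (batch, seq_len, vocab_size)
        return logits, hidden
```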

Thanks in advance for any help!

Right now, the optimizer expects a LongTensor with a target

You are confusing the loss function with the optimizer. If you want, you can change the loss function; what you say about the LongTensor target is specific to NLLLoss and CrossEntropyLoss.
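To illustrate the distinction: the loss function is what expects a LongTensor of class indices as the target, while the optimizer never sees targets at all, it only updates parameters. A minimal sketch, assuming a word-level setup with nn.Embedding (all sizes and variable names here are made up for illustration):

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 10000, 100, 256

embed = nn.Embedding(vocab_size, embed_dim)
rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
decoder = nn.Linear(hidden_dim, vocab_size)

# The optimizer only cares about the parameters it should update.
params = list(embed.parameters()) + list(rnn.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

# CrossEntropyLoss is what expects LongTensor class indices as targets.
criterion = nn.CrossEntropyLoss()

# Toy batch: inputs are word indices, targets are the next-word indices.
inputs = torch.randint(0, vocab_size, (8, 20))   # LongTensor, (batch, seq_len)
targets = torch.randint(0, vocab_size, (8, 20))  # LongTensor, (batch, seq_len)

output, _ = rnn(embed(inputs))                   # (batch, seq_len, hidden_dim)
logits = decoder(output)                         # (batch, seq_len, vocab_size)

loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Note that the embedding output never goes anywhere near the loss or the optimizer: it is just the input to the RNN, and the targets stay as word indices.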


Right! Sorry about that.

So my criterion is defined right before running my training with this code:

criterion = nn.CrossEntropyLoss()

Can I choose a loss function other than CrossEntropyLoss? Which would you recommend?