How to use word embeddings with an optimizer?

Hi Friends!

I’m building an RNN for language generation similar to the classic Shakespeare RNN text generator.

However, I’d like to employ word-level analysis. Using one-hot encoding doesn’t work because my vocabulary size is large (even after cutting out low-frequency terms) and the sequences are long (meaning the graph really adds up over the course of a sequence). My GPU can’t handle the 10,000+ vocab.

I’d like to use nn.Embedding, but I can’t figure out how to configure it with the optimizer. Right now, the optimizer expects a LongTensor as the target. How do I adjust it to expect an embedded word vector of some dimension, say 1 x 100?

Thanks in advance for any help!

You are confusing the loss function with the optimizer. If you want, you can change the loss function (the LongTensor target you mention is specific to NLLLoss and CrossEntropyLoss).
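To make the distinction concrete, here is a minimal sketch of one way the pieces could fit together. It is not the original poster's model: the class name WordRNN, the GRU layer, and all sizes are assumptions picked for illustration. The embedding sits on the input side, the loss still receives integer word indices as targets, and the optimizer only ever sees model.parameters().

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
VOCAB_SIZE = 10_000
EMBED_DIM = 100
HIDDEN_DIM = 128


class WordRNN(nn.Module):
    """Illustrative word-level model: indices -> embeddings -> GRU -> vocab scores."""

    def __init__(self, vocab_size, embed_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids, hidden=None):
        vectors = self.embed(token_ids)               # (batch, seq, embed_dim)
        features, hidden = self.rnn(vectors, hidden)  # (batch, seq, hidden_dim)
        logits = self.out(features)                   # (batch, seq, vocab_size)
        return logits, hidden


torch.manual_seed(0)
model = WordRNN(VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM)
criterion = nn.CrossEntropyLoss()
# The optimizer only needs the parameters; it never sees inputs or targets.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random word indices stand in for a real batch (4 sequences of 12 words).
inputs = torch.randint(0, VOCAB_SIZE, (4, 12))   # int64 word indices
targets = torch.randint(0, VOCAB_SIZE, (4, 12))  # next-word indices, also int64

logits, _ = model(inputs)
# CrossEntropyLoss wants (N, C) scores and (N,) class indices, so flatten.
loss = criterion(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With this layout the one-hot vectors never get materialized: the input is a tensor of indices, and the embedding weights are trained along with everything else because they are part of model.parameters().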

Right! Sorry about that.

So my criterion is defined right before running my training with this code:

criterion = nn.CrossEntropyLoss()
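For reference, a small sketch of what this criterion expects on its own. The tensor values and the five-class size are made up for illustration: the first argument is a float tensor of unnormalized scores with one column per class, and the second is a LongTensor of class indices.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

num_classes = 5                       # hypothetical tiny vocabulary
logits = torch.randn(3, num_classes)  # float scores, one row per prediction
targets = torch.tensor([1, 0, 4])     # int64 class indices, one per row

loss = criterion(logits, targets)     # scalar tensor
print(loss.item())
```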

Can I choose a loss function other than CrossEntropyLoss? Which would you recommend?