Using Word2Vec with LSTM RNN?

Hi friends!

I’m using an LSTM RNN to do character generation, similar to the Shakespeare generator. Each character is encoded as a one-hot vector.

I want to experiment with word-level analysis. My question is: can I train a word2vec model on my corpus and use that to embed the corpus before feeding it into the LSTM RNN model?
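For concreteness, here’s a minimal sketch of what I mean (I’m assuming gensim for word2vec; the file name and tokenization are just placeholders):

```python
import numpy as np
from gensim.models import Word2Vec

# sentences: a list of tokenized lines from the corpus
sentences = [line.split() for line in open("corpus.txt", encoding="utf-8")]

# vector_size is the embedding dimension (gensim 4.x keyword)
w2v = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)

vocab = w2v.wv.index_to_key  # word id -> word
embedding_matrix = np.stack([w2v.wv[w] for w in vocab])  # (vocab_size, 100)
```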

I’ve tried using an embedding layer on my own and, no surprise, the accuracy doesn’t get any better. I’m assuming this is because there’s no inherent “meaning” in the embedding. If I can create “meaning” by training my own word2vec on the corpus, would that help the LSTM RNN train?
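Roughly the setup I mean by “embedding on my own” (a minimal sketch, assuming PyTorch; the names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class WordLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=10, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # learned from scratch
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, hidden=None):
        e = self.embed(x)  # (batch, seq) -> (batch, seq, embed_dim)
        h, hidden = self.lstm(e, hidden)
        return self.out(h), hidden  # logits over the next word
```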

Thanks in advance for your time,
C

Using word2vec is fine.

I’m a bit doubtful about your argument of “no inherent meaning”. The embeddings are trained together with the RNN, so they should adapt to carry some sort of meaning. Are you saying that your word-level model (with embedding) gives similar performance to your char-level model? Perhaps try increasing the embedding dimension or hidden size? Training on a larger amount of data could also help.
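If you do want to try the word2vec route, one common approach (a sketch, assuming PyTorch and a gensim model `w2v` as in your sketch above) is to initialize the embedding layer from the word2vec vectors and let it keep training:

```python
import torch
import torch.nn as nn

# w2v.wv.vectors is gensim's (vocab_size, dim) matrix, rows in word-id order
weights = torch.tensor(w2v.wv.vectors, dtype=torch.float32)

# freeze=False lets the pretrained vectors keep adapting to the generation
# task; freeze=True would keep the word2vec "meaning" fixed
embed = nn.Embedding.from_pretrained(weights, freeze=False)
```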

Thanks so much for the reply!

My word-level model (with embedding) gave really poor performance and its accuracy didn’t improve over time. That said, I had it on a very small embedding dimension (10); could that have been the problem?