Solution for Exercise in NLP tutorial

I was trying to work out how to implement the exercise mentioned here: http://pytorch.org/tutorials/beginner/nlp/sequence_models_tutorial.html

To augment the POS tagger I would have to find a character-level representation of each word. So I was wondering: what would the ground truth look like? For example, for the word "The", if I passed it to nn.Embedding(number_of_unique_letters, embed_dim), then to an nn.LSTM, and finally to a fully connected layer, what would the loss compare against?


The loss would be computed the same way: you predict the tags and compare them against the ground-truth tags. The only difference is that you help the network with character-level representations of the words.
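A minimal sketch of how the character-level representation could be built, assuming made-up sizes and an a=0, b=1, … character encoding (none of these names or numbers come from the tutorial):

```python
import torch
import torch.nn as nn

# Hypothetical sizes, just for illustration.
NUM_CHARS = 26       # number of unique letters
CHAR_EMBED_DIM = 3
CHAR_HIDDEN_DIM = 5

char_embed = nn.Embedding(NUM_CHARS, CHAR_EMBED_DIM)
char_lstm = nn.LSTM(CHAR_EMBED_DIM, CHAR_HIDDEN_DIM)

# "the" -> indices of t, h, e under a=0, b=1, ...
word_chars = torch.tensor([19, 7, 4])

# nn.LSTM expects (seq_len, batch, input_size) by default.
embeds = char_embed(word_chars).view(3, 1, -1)
out, (h_n, c_n) = char_lstm(embeds)

# h_n is the final hidden state after reading all characters:
# the character-level representation of the word.
char_rep = h_n.view(1, -1)  # shape (1, CHAR_HIDDEN_DIM)
```

The loss itself is unchanged: the word-level LSTM still emits one tag score per word, compared against the gold POS tags exactly as in the tutorial.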

It’s still pretty unclear, though. The way the exercise is written, you have to feed the characters into one LSTM, take its hidden state, and feed that into another LSTM along with the words. The only thing I don’t understand is how you can pass both integer word indices and a hidden state that is of float type (from the previous model).