I’m very new to PyTorch and have been looking through some of the examples. I’m especially interested in language modeling, and would like to train the word_language_model example on a corpus (PTB data, or any other). The example shows how to do the training, but it then goes on to randomly generate text and estimate the perplexity. Instead, I would like to provide a series of sentences that weren’t in the training data and have the language model learned from the training data assign a probability to each of them. So if my test data is:
I would like to talk to you for a minute.
I would like to talk to you for a minuet.
I would like to talk to you for a fdasdf.
I would like to get a probability for each of these sentences from the language model that has been learned. The example doesn’t seem to show how to handle a separate test set, or how to get a probability for each of the test sentences. Can anyone offer any guidance on this? This example is just a starting point for me, so if there are other examples that are closer to what I’d like to do, pointers to those would be great.
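To make the goal concrete, here is a minimal sketch of the kind of scoring described above. It is not the word_language_model code: `TinyLM`, `build_vocab`, `encode`, and `sentence_log_prob` are hypothetical names made up for illustration, the model is untrained, tokenization is plain whitespace splitting, and unknown words are assumed to map to an `<unk>` token. The idea is the chain rule: the log-probability of a sentence is the sum of the log-probabilities the model assigns to each next token given the tokens before it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyLM(nn.Module):
    """Hypothetical stand-in for a trained language model (untrained here)."""

    def __init__(self, vocab_size, emb_dim=16, hid_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, ids):
        hidden, _ = self.rnn(self.emb(ids))
        return self.out(hidden)  # logits, shape (batch, seq_len, vocab_size)


def build_vocab(sentences):
    # Assumption: whitespace tokenization, with <unk> and <eos> reserved.
    vocab = {"<unk>": 0, "<eos>": 1}
    for sentence in sentences:
        for word in sentence.split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence, vocab):
    # Wrap the sentence in <eos> so the first and last words are also scored.
    words = ["<eos>"] + sentence.split() + ["<eos>"]
    return torch.tensor([[vocab.get(w, vocab["<unk>"]) for w in words]])


def sentence_log_prob(model, sentence, vocab):
    ids = encode(sentence, vocab)              # shape (1, T)
    inputs, targets = ids[:, :-1], ids[:, 1:]  # predict token t+1 from tokens <= t
    model.eval()
    with torch.no_grad():
        log_probs = F.log_softmax(model(inputs), dim=-1)  # (1, T-1, vocab_size)
    # Pick out the log-probability of each actual next token, then sum them.
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp.sum().item()


torch.manual_seed(0)
# Stand-in "training data" used only to build a vocabulary for this sketch.
vocab = build_vocab(["I would like to talk to you for a minute."])
model = TinyLM(len(vocab))  # in practice, a trained model would be loaded here

test_sentences = [
    "I would like to talk to you for a minute.",
    "I would like to talk to you for a minuet.",
    "I would like to talk to you for a fdasdf.",
]
scores = [sentence_log_prob(model, s, vocab) for s in test_sentences]
for sentence, lp in zip(test_sentences, scores):
    print(f"{lp:9.3f}  {sentence}")
```

Because the model here is untrained, the printed numbers are meaningless; only the mechanics matter. With this toy vocabulary the last two sentences both end in `<unk>` and therefore get the same score, which shows that out-of-vocabulary handling matters for this kind of comparison. Dividing the negated log-probability by the number of scored tokens and exponentiating gives a per-sentence perplexity.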