Word language model to assign probabilities to sentences in test data?

Ted_Pedersen · October 3, 2017, 1:37am

Greetings all,

I’m very new to PyTorch and have been looking through some of the examples. I’m especially interested in language modeling, and would like to train the word_language_model example on a corpus (PTB data, or any other). The example seems to show how to do the training:

but then it goes on to randomly generate text and estimate the perplexity. Instead, I would like to provide a set of sentences that were not in the training data and have the language model learned from the training data assign a probability to each of them. So if my test data is:

I would like to talk to you for a minute.
I would like to talk to you for a minuet.
I would like to talk to you for a fdasdf.

I would like to get a probability for each of these sentences based on the language model that has been learned. The example doesn’t seem to show how to deal with a separate test set, or how to get probabilities for each of the test sentences. Can anyone offer any guidance on this? This example is just a starting point for me, so if there are other examples that are closer to what I’d like to do, pointers to those would be great.
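
To make the goal concrete, here is a minimal sketch of the scoring being asked for. It assumes the usual chain-rule definition, where the log-probability of a sentence is the sum of the log-probabilities of each word given the words before it. The `ToyBigramLM` class, `tokenize`, `sentence_log_prob`, and the two training sentences are hypothetical stand-ins, not part of the word_language_model example; a trained neural model would instead supply the per-word log-probabilities, typically by applying a log-softmax to its output scores at each position.

```python
import math
from collections import Counter, defaultdict

UNK = "<unk>"  # placeholder for out-of-vocabulary words (assumed convention)
EOS = "<eos>"  # sentence boundary marker (assumed convention)


def tokenize(sentence):
    """Lowercase, split the period off as its own token, and append EOS."""
    return sentence.lower().replace(".", " .").split() + [EOS]


class ToyBigramLM:
    """Hypothetical stand-in for a trained model: add-one smoothed bigrams."""

    def __init__(self, training_sentences):
        self.vocab = {UNK, EOS}
        self.counts = defaultdict(Counter)
        for sentence in training_sentences:
            prev = EOS  # every sentence starts after a boundary
            for tok in tokenize(sentence):
                self.vocab.add(tok)
                self.counts[prev][tok] += 1
                prev = tok

    def log_prob(self, prev, word):
        """log P(word | prev), mapping unseen words to UNK."""
        prev = prev if prev in self.vocab else UNK
        word = word if word in self.vocab else UNK
        numerator = self.counts[prev][word] + 1
        denominator = sum(self.counts[prev].values()) + len(self.vocab)
        return math.log(numerator / denominator)


def sentence_log_prob(model, sentence):
    """Chain rule: sum the log-probability of each token given its predecessor."""
    total = 0.0
    prev = EOS
    for tok in tokenize(sentence):
        total += model.log_prob(prev, tok)
        prev = tok
    return total


# Illustrative training data only; a real run would use PTB or another corpus.
model = ToyBigramLM([
    "I would like to talk to you for a minute.",
    "We would like to talk for a minute.",
])

test_sentences = [
    "I would like to talk to you for a minute.",
    "I would like to talk to you for a minuet.",
    "I would like to talk to you for a fdasdf.",
]
scores = [sentence_log_prob(model, s) for s in test_sentences]
for s, score in zip(test_sentences, scores):
    print(f"{score:9.3f}  {math.exp(score):.3e}  {s}")
```

In this toy setup, "minuet" and "fdasdf" both fall outside the vocabulary and are scored as the same unknown token, so they receive identical probabilities. Telling them apart would need a larger training vocabulary or a model that works below the word level.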

Thanks!
Ted

rmakki · February 7, 2019, 4:40pm

Hi Ted,

I was wondering if you found the answer to your question? I am dealing with the same problem.

Thanks,

lauqasim · February 17, 2019, 4:15am

Hi Ted,
If you solved this problem, please email me: 763738799@qq.com.
Thanks
