Deep Learning methods for Text Generation

Greetings to everyone,

I want to design a system that can generate stories or poetry after training on a large dataset of text, without needing a text description/start/summary as input at inference time.

So far I have done this using RNNs, but as you know they have a lot of flaws. My question is: what are currently the best methods for this task? I looked into attention mechanisms, but it seems they are mostly geared towards translation tasks.

P.S. I know about GPT-2, BERT, the Transformer, etc., but all of them need a text description as input before generation, and this is not what I'm looking for. I want a system that can generate stories from scratch after training.

Thanks a lot!

What do you mean by “from scratch”? Say you have an encoder-decoder model: you have to give the decoder something from which to generate the text output. The only option I see would be to sample a random vector from your hidden/latent space, give it to the decoder and hope for the best. I tried that a bit with a Variational Autoencoder for sentences, and the results were questionable at best (but I cannot rule out that I made mistakes).
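To make the idea concrete, here is a minimal, untrained sketch of what I mean by sampling from the latent space (sizes and token ids are made up; a real sentence VAE would of course use trained weights):

```python
import torch
import torch.nn as nn

# Hypothetical sizes and special-token ids, just for illustration.
vocab_size, emb_dim, latent_dim, hidden_dim = 1000, 64, 32, 128
sos_id, eos_id, max_len = 1, 2, 20

embed = nn.Embedding(vocab_size, emb_dim)
latent_to_hidden = nn.Linear(latent_dim, hidden_dim)   # map z to the initial decoder state
decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
out_proj = nn.Linear(hidden_dim, vocab_size)

z = torch.randn(1, latent_dim)                         # sample from the prior N(0, I)
hidden = torch.tanh(latent_to_hidden(z)).unsqueeze(0)  # (num_layers, batch, hidden)

token, generated = torch.tensor([[sos_id]]), []
for _ in range(max_len):
    out, hidden = decoder(embed(token), hidden)
    probs = torch.softmax(out_proj(out[:, -1]), dim=-1)
    token = torch.multinomial(probs, 1)                # sample the next word id
    if token.item() == eos_id:
        break
    generated.append(token.item())
print(generated)  # word ids; map them back through your vocabulary
```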


What I mean is that it should be able to generate text the way an RNN does: at inference time I give it a start-of-sequence token and then re-feed its own outputs.
GPT-2, for example, requires a starting phrase.
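To be concrete, the kind of loop I mean is something like this (a minimal sketch with a toy, untrained word-level LSTM; the sizes and the start-token id are made up):

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden_dim = 1000, 64, 128   # hypothetical sizes
sos_id, max_len = 1, 20                           # hypothetical start-token id

embed = nn.Embedding(vocab_size, emb_dim)
lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
out_proj = nn.Linear(hidden_dim, vocab_size)

inp, hidden, generated = torch.tensor([[sos_id]]), None, []
for _ in range(max_len):
    out, hidden = lstm(embed(inp), hidden)
    probs = torch.softmax(out_proj(out[:, -1]), dim=-1)
    inp = torch.multinomial(probs, 1)             # sample a word, then re-feed it
    generated.append(inp.item())
print(generated)                                  # word ids to map back to words
```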

My dream model would be a kind of encoder-decoder model where the encoder starts from just the start token and then grows with each token the decoder generates, while the decoder can use attention over the encoder, or in other words, over the entire text generated so far.

I hope I have explained this in an understandable way.
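If it helps, what I picture is close to a decoder-only model whose self-attention covers everything generated so far. A rough, untrained sketch of that idea (hypothetical sizes and token ids; a real model would also add positional encodings and be trained on the corpus):

```python
import torch
import torch.nn as nn

vocab_size, d_model, nhead, max_len, sos_id = 1000, 64, 4, 20, 1  # made-up values

embed = nn.Embedding(vocab_size, d_model)
layer = nn.TransformerEncoderLayer(d_model, nhead, dim_feedforward=128, batch_first=True)
blocks = nn.TransformerEncoder(layer, num_layers=2)
out_proj = nn.Linear(d_model, vocab_size)

tokens = torch.tensor([[sos_id]])                 # start from just the start token
for _ in range(max_len):
    n = tokens.size(1)
    causal = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)  # causal mask
    h = blocks(embed(tokens), mask=causal)        # attention over all generated tokens
    probs = torch.softmax(out_proj(h[:, -1]), dim=-1)
    nxt = torch.multinomial(probs, 1)
    tokens = torch.cat([tokens, nxt], dim=1)      # the "encoder side" keeps growing
print(tokens.tolist())
```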

I would suggest you take a look at the word language model example. During inference, you need to provide a random word index as the first input.
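For example, something along these lines (assuming a trained word-level model whose vocabulary size is `ntokens`):

```python
import torch

ntokens = 10000                                          # hypothetical vocabulary size
inp = torch.randint(ntokens, (1, 1), dtype=torch.long)   # a random word id as the seed
# Feed `inp` to the trained language model and keep re-feeding the sampled
# outputs, exactly like the generation loops sketched earlier in the thread.
```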
