How to perform a roll-out of an RNN in PyTorch

evanthebouncy · May 18, 2018, 4:28am

I trained an RNN (a GRU to be specific) to be a language model roughly as follows:

inputs = [<start>, tok1, tok2, tok3, . . .]

outputs = [tok1, tok2, tok3, . . .]

h0 = initial state of all-zeros

gru_outputs, hn = gru.sample_rollout(inputs, h0)

cost = loss(gru_outputs, outputs)

Essentially, during training there is a “guide” input: at each step I feed in the target token from the previous step as the current input (teacher forcing?).
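To make the training setup above concrete, here is a minimal sketch of one teacher-forced training step. The `GRULanguageModel` class, the sizes, the `<start>` index, and the toy target sequence are all assumptions made for illustration; they are not taken from the gist.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 12, 8, 16
START = 0  # assumed index of the <start> token


class GRULanguageModel(nn.Module):
    """Hypothetical embedding -> GRU -> linear language model."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.gru = nn.GRU(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.proj = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, tokens, h0=None):
        # tokens: (batch, seq_len) of token indices
        out, hn = self.gru(self.embed(tokens), h0)
        return self.proj(out), hn  # logits: (batch, seq_len, vocab)


model = GRULanguageModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# One toy target sequence [tok1, tok2, tok3, tok4].
targets = torch.tensor([[3, 5, 7, 2]])
# Teacher forcing: the input is the target shifted right by one step,
# with <start> placed in front.
start_col = torch.full((1, 1), START, dtype=torch.long)
inputs = torch.cat([start_col, targets[:, :-1]], dim=1)

h0 = torch.zeros(1, 1, HIDDEN_DIM)  # (num_layers, batch, hidden), all zeros
logits, hn = model(inputs, h0)
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1)
)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Here the whole input sequence goes through `nn.GRU` in a single call, which is only possible because every input token is known ahead of time.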

Now this RNN trains well and converges, but I have no idea how to actually use it to generate a sequence from scratch. Since the GRU expects an explicit input argument, how do I tell it to use its own output from the previous step as the input to the current step?

Essentially I want

roll_out = gru(h0, step_number = 10)

I couldn’t quite figure out how to do this. Do I need to use a different API and do a manual roll-out both during training and at generation time?
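For reference, a manual roll-out of the kind described above could be sketched as a loop that calls the GRU one step at a time and feeds each generated token back in as the next input. This is only a sketch under assumptions: `roll_out`, its `greedy` flag, the layer sizes, and the `<start>` index are hypothetical, and the layers here are untrained stand-ins, so the generated tokens are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 12, 8, 16  # hypothetical sizes
START = 0  # assumed index of the <start> token

# Stand-ins for the trained layers (untrained here).
embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
gru = nn.GRU(EMBED_DIM, HIDDEN_DIM, batch_first=True)
proj = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)


def roll_out(h0, step_number, greedy=False):
    """Generate step_number tokens, feeding each one back in as the next input."""
    batch = h0.size(1)  # h0 is (num_layers, batch, hidden)
    token = torch.full((batch, 1), START, dtype=torch.long)
    h = h0
    generated = []
    with torch.no_grad():
        for _ in range(step_number):
            # A length-1 sequence advances the GRU by exactly one time step;
            # h carries the hidden state forward between calls.
            out, h = gru(embed(token), h)
            logits = proj(out[:, -1])  # (batch, vocab)
            if greedy:
                token = logits.argmax(dim=-1, keepdim=True)
            else:
                probs = torch.softmax(logits, dim=-1)
                token = torch.multinomial(probs, num_samples=1)
            generated.append(token)
    return torch.cat(generated, dim=1)  # (batch, step_number)


h0 = torch.zeros(1, 1, HIDDEN_DIM)
sample = roll_out(h0, step_number=10)
```

With this approach, training can presumably keep the single full-sequence call, since the targets are known in advance; only generation needs the explicit loop.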

The full code is in this gist, if it is of any use: https://gist.github.com/evanthebouncy/b5039dc72d3d9fea66dad3306e479e6b

Thanks!

–evan