I trained an RNN (a GRU to be specific) to be a language model roughly as follows:
inputs = [<start>, tok1, tok2, tok3, ...]
outputs = [tok1, tok2, tok3, ...]
h0 = initial state of all-zeros
gru_outputs, hn = gru.sample_rollout(inputs, h0)
cost = loss(gru_outputs, outputs)
Essentially, during training there is a “guide” input: at each step I feed in the target token from the previous step (teacher forcing?).
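To make the training setup concrete, here is a minimal sketch of one teacher-forced loss computation. It assumes PyTorch's `nn.GRU`; the embedding layer, the output projection `to_logits`, the `START` id, and all sizes are hypothetical placeholders and are not taken from the gist.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes and start-token id, chosen only for illustration.
VOCAB, EMB, HID = 12, 8, 16
START = 0

embed = nn.Embedding(VOCAB, EMB)          # token id -> vector
gru = nn.GRU(EMB, HID, batch_first=True)  # recurrent core
to_logits = nn.Linear(HID, VOCAB)         # hidden state -> scores over the vocabulary

def teacher_forced_loss(tokens):
    """tokens: LongTensor of shape (batch, seq_len) holding [tok1, tok2, ...]."""
    batch = tokens.size(0)
    start = torch.full((batch, 1), START, dtype=torch.long)
    # Inputs are the targets shifted right by one: [<start>, tok1, tok2, ...]
    inputs = torch.cat([start, tokens[:, :-1]], dim=1)
    h0 = torch.zeros(1, batch, HID)       # all-zeros initial state
    gru_outputs, hn = gru(embed(inputs), h0)
    logits = to_logits(gru_outputs)       # (batch, seq_len, VOCAB)
    # Position t is trained to predict tokens[:, t].
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), tokens.reshape(-1)
    )

tokens = torch.randint(1, VOCAB, (4, 5))  # a fake batch of target sequences
loss = teacher_forced_loss(tokens)
```

The whole sequence goes through the GRU in one call here, which only works because every input token is known in advance.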
Now this RNN trains well and converges, but I have no idea how to actually use it to generate a sequence from scratch. Since the GRU expects an explicit input argument, how do I tell it to use its own output from the previous step without giving it an input?
Essentially I want something like:
roll_out = gru(h0, step_number = 10)
I couldn’t quite figure out how to do this. Do I need to use a different API and do a manual roll-out both in training and when generating?
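For reference, a manual roll-out at generation time might look roughly like the sketch below: call the GRU one step at a time and feed each sampled token back in as the next input. This assumes PyTorch's `nn.GRU`; `roll_out`, the embedding, the output projection `to_logits`, the `START` id, and the sizes are all hypothetical names and values, not the gist's code. With this approach training could presumably stay as the single teacher-forced call, since only generation needs the step-by-step loop.

```python
import torch
import torch.nn as nn

# Hypothetical sizes and start-token id, chosen only for illustration.
VOCAB, EMB, HID = 12, 8, 16
START = 0

embed = nn.Embedding(VOCAB, EMB)
gru = nn.GRU(EMB, HID, batch_first=True)
to_logits = nn.Linear(HID, VOCAB)

@torch.no_grad()
def roll_out(h0, step_number, greedy=False):
    """Generate step_number tokens, feeding each output back as the next input."""
    batch = h0.size(1)
    tok = torch.full((batch, 1), START, dtype=torch.long)  # begin from <start>
    h = h0
    generated = []
    for _ in range(step_number):
        out, h = gru(embed(tok), h)       # a single time step; out is (batch, 1, HID)
        logits = to_logits(out[:, -1])    # (batch, VOCAB)
        if greedy:
            tok = logits.argmax(dim=-1, keepdim=True)
        else:
            # Sample the next token from the predicted distribution.
            tok = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        generated.append(tok)
    return torch.cat(generated, dim=1), h

h0 = torch.zeros(1, 2, HID)               # all-zeros initial state, batch of 2
sample, hn = roll_out(h0, step_number=10)
```

Passing `greedy=True` picks the most likely token at every step instead of sampling, which makes the roll-out deterministic.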
Full code is in this gist, if it is of any use: https://gist.github.com/evanthebouncy/b5039dc72d3d9fea66dad3306e479e6b