I am trying to train a character-level language model with a multiplicative LSTM.
Currently I can train on individual sequences (batch_size = 1, in other words) like this:
x is the sequence of input characters, y the sequence of next (target) characters:
TIMESTEPS = len(x)
for t in range(TIMESTEPS):
    emb = embed(x[t])
    hidden, output = rnn(emb, hidden)
    loss += loss_fn(output, y[t])
My problem is how to scale this up to batch processing, given that my sequences all have different lengths.
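For context, the direction I have in mind is padding each batch to the length of its longest sequence and keeping a mask of the real (non-pad) positions, so that padded steps can be excluded from the loss. A minimal sketch of that idea in plain NumPy (pad_batch and pad_id are my own names, not from any library):

```python
import numpy as np

def pad_batch(seqs, pad_id=0):
    """Pad variable-length integer sequences to a common length.

    Returns the padded (batch, max_len) array and a boolean mask
    marking which positions hold real tokens rather than padding.
    """
    max_len = max(len(s) for s in seqs)
    batch = np.full((len(seqs), max_len), pad_id, dtype=np.int64)
    mask = np.zeros((len(seqs), max_len), dtype=bool)
    for i, s in enumerate(seqs):
        batch[i, :len(s)] = s
        mask[i, :len(s)] = True
    return batch, mask

# Hypothetical toy data: three character-id sequences of different lengths.
seqs = [[5, 3, 8], [2, 7], [9, 1, 4, 6]]
batch, mask = pad_batch(seqs)
```

The loop above would then run over timesteps of the whole batch at once, and the per-timestep loss would be multiplied by mask[:, t] so the padded positions contribute nothing to the gradient. Whether this is the idiomatic way with a multiplicative LSTM is exactly what I am unsure about.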