The cross entropy loss doesn't know about timesteps or multiple classification targets out of the box. Last time I needed it for a single target over a sequence, I used
loss = lossfn(scores.view(batch_size * time_steps, -1), labels.contiguous().view(-1))
(the contiguous() was needed because the view failed without it, an artifact of how my minibatches were prepared; you may be able to do without it).
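To make the flattening concrete, here is a minimal self-contained sketch; the shapes and variable names (`batch_size`, `time_steps`, `num_classes`) are assumptions for illustration, not from any particular model. The key point is that `CrossEntropyLoss` expects `(N, C)` scores and `(N,)` labels, so batch and time get folded into one dimension:

```python
import torch
import torch.nn as nn

# Hypothetical shapes: scores from a sequence model,
# laid out as (batch_size, time_steps, num_classes).
batch_size, time_steps, num_classes = 4, 7, 10
scores = torch.randn(batch_size, time_steps, num_classes)
labels = torch.randint(0, num_classes, (batch_size, time_steps))

lossfn = nn.CrossEntropyLoss()
# Fold batch and time into one "sample" dimension:
# CrossEntropyLoss wants (N, C) scores and (N,) class indices.
loss = lossfn(scores.reshape(-1, num_classes), labels.reshape(-1))
```

(`reshape` sidesteps the contiguity issue that `view` can run into, at the cost of a possible copy.)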
If you have three labels, you could just hand back three score vectors and add the three cross entropy losses.
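A sketch of that three-label setup, under the assumption that the three targets come from one shared representation with one linear head each (the head names and sizes here are made up for the example):

```python
import torch
import torch.nn as nn

# Hypothetical: shared features, three independent classification heads.
batch_size, hidden, n1, n2, n3 = 8, 16, 3, 5, 2
features = torch.randn(batch_size, hidden)
head1 = nn.Linear(hidden, n1)
head2 = nn.Linear(hidden, n2)
head3 = nn.Linear(hidden, n3)
y1 = torch.randint(0, n1, (batch_size,))
y2 = torch.randint(0, n2, (batch_size,))
y3 = torch.randint(0, n3, (batch_size,))

lossfn = nn.CrossEntropyLoss()
# One cross entropy per head, summed into a single scalar
# so a single backward() trains all three heads.
loss = (lossfn(head1(features), y1)
        + lossfn(head2(features), y2)
        + lossfn(head3(features), y3))
```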
But this is only one way to do it, and you should pick whatever best fits your purpose. For example, Sean Robertson just adds the losses over the sequence steps in his RNN-for-Shakespeare tutorial (the notebook is an excellent read, too, but it is harder to link to specific lines), probably because the outputs are generated one at a time anyway.
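That step-by-step accumulation looks roughly like this; this is my own sketch of the pattern, not Sean Robertson's actual code, and the `RNNCell`/`Linear` decoder and the shapes are assumptions:

```python
import torch
import torch.nn as nn

# Sketch: accumulate the loss one timestep at a time, as you would
# when outputs are generated sequentially (e.g. a char-level RNN).
num_classes, hidden_size = 10, 16
rnn = nn.RNNCell(num_classes, hidden_size)
decoder = nn.Linear(hidden_size, num_classes)
lossfn = nn.CrossEntropyLoss()

inputs = torch.randn(5, 1, num_classes)          # (time_steps, batch=1, features)
targets = torch.randint(0, num_classes, (5, 1))  # (time_steps, batch=1)
hidden = torch.zeros(1, hidden_size)

loss = 0
for t in range(inputs.size(0)):
    hidden = rnn(inputs[t], hidden)
    # Add this step's cross entropy to the running total.
    loss = loss + lossfn(decoder(hidden), targets[t])
```

A single `loss.backward()` at the end then backpropagates through all timesteps at once.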