Understanding the behavior of targets when feeding an rnn.PackedSequence

I am using torch.nn.LSTM, and to quote from its documentation:

If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence.

In my code I declared: rnn = nn.LSTM(inpSiz, hidSiz, 2, batch_first=True)

I then fed X into it. X is a packed sequence wrapping an autograd Variable whose underlying FloatTensor has size (batchSiz, maxLen, dims). I took the output of the LSTM, which in turn was passed to log_softmax(), producing output of shape (batchSiz*maxLen, prob_distribution_over_10_classes). The log probabilities were then fed into nn.NLLLoss() with targets t. t is also an autograd.Variable, and its underlying tensor has size (batchSiz*seqLen,). Please note that seqLen != maxLen. That gave me the following error:

RuntimeError: Assertion 'THIndexTensor_(size)(target, 0) == batch_size' failed. at /py/conda-bld/pytorch_1493673470840/work/torch/lib/THNN/generic/ClassNLLCriterion.c:50

Upon receiving this error, I padded the targets t with 0s as well, just as I did for X, so t is now of size (batchSiz*maxLen,). With that, everything works. I can't find any useful information about what t should look like when an rnn.PackedSequence is fed in. Am I doing it right? Another question: is padding with 0s okay? In my one-hot encoding the first class is also 0. A sketch of the working setup is below.
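For reference, here is a minimal sketch of the setup that works for me. The sizes, the example lengths, and the Linear layer mapping hidden states to the 10 classes are placeholders for illustration, and it is written against the current tensor API rather than Variables:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# hypothetical sizes, just for illustration
batchSiz, maxLen, dims = 4, 7, 16      # dims plays the role of inpSiz
hidSiz, numClasses = 32, 10
lengths = [7, 6, 4, 3]                 # true sequence lengths, longest first

rnn = nn.LSTM(dims, hidSiz, 2, batch_first=True)
classifier = nn.Linear(hidSiz, numClasses)   # assumed projection to the 10 classes

# padded input batch, packed before going into the LSTM
x_padded = torch.randn(batchSiz, maxLen, dims)
x_packed = pack_padded_sequence(x_padded, lengths, batch_first=True)

out_packed, _ = rnn(x_packed)                # output is also a PackedSequence
out_padded, _ = pad_packed_sequence(out_packed, batch_first=True,
                                    total_length=maxLen)

# flatten to (batchSiz * maxLen, numClasses) before the loss
logits = classifier(out_padded).reshape(batchSiz * maxLen, numClasses)
log_probs = F.log_softmax(logits, dim=1)

# targets padded with 0s up to maxLen, so the shapes line up
t = torch.zeros(batchSiz * maxLen, dtype=torch.long)

loss = nn.NLLLoss()(log_probs, t)
print(loss.item())
```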

From what I read, you aren't doing anything wrong.
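Regarding the 0-padding worry: since 0 is also a real class, one common pattern is to pad the targets with a value that is not a class and let the loss skip those positions via NLLLoss's ignore_index argument. A minimal sketch, with made-up shapes and values:

```python
import torch
import torch.nn as nn

numClasses, pad_val = 10, -100       # -100 is NLLLoss's default ignore_index

# pretend log-probabilities for 8 flattened positions
log_probs = torch.randn(8, numClasses).log_softmax(dim=1)

# real targets for the first 4 positions, padding for the rest
t = torch.tensor([1, 3, 0, 2, pad_val, pad_val, pad_val, pad_val])

# positions whose target equals ignore_index do not contribute to the loss
loss = nn.NLLLoss(ignore_index=pad_val)(log_probs, t)
print(loss.item())
```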
