I have some training text data of variable lengths. I first feed it into a character-based Embedding layer, then pack it with pack_padded_sequence, feed it through an LSTM, and finally unpack it with pad_packed_sequence.
At this point I have a Variable of shape BATCH_SIZE x PAD_LENGTH x EMBEDDING_LEN and another Variable holding the real length of each sequence. Given that, if I want the last valid output of each sequence for further steps and backprop (that is, I want to skip the trailing positions that are zeros due to padding), what is the best way to do so? Thanks!
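(For concreteness, a minimal sketch of the setup described above; the vocabulary size, embedding width, and hidden size are assumptions for illustration, not values from the post, and the output width here is the LSTM hidden size:)

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Assumed sizes for illustration (the thread later quotes 64 x 20 x 1024).
batch_size, pad_len, vocab_size, emb_len, hidden_len = 64, 20, 100, 32, 1024

embedding = nn.Embedding(vocab_size, emb_len, padding_idx=0)
lstm = nn.LSTM(emb_len, hidden_len, batch_first=True)

# Padded character indices and true lengths, sorted longest-first,
# as pack_padded_sequence requires on older PyTorch versions.
tokens = torch.zeros(batch_size, pad_len).long()
lens = torch.LongTensor(batch_size).random_(1, pad_len + 1)
lens, order = lens.sort(descending=True)
tokens = tokens[order]

packed = pack_padded_sequence(embedding(tokens), lens.tolist(), batch_first=True)
out, _ = lstm(packed)
data, _ = pad_packed_sequence(out, batch_first=True)  # batch x max_len x hidden_len
```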
Thanks for the reply! As mentioned above, I have a 64 x 20 x 1024 result (data) and a Variable of 64 lengths (lens, all <= 20 in my case). When I do data[:, lens - 1], I get a Variable of shape 64 x 64 x 1024. Am I doing something wrong?
Sorry, I thought I had tested it, but I actually seem to have done something different. This might work: data[torch.arange(64, out=torch.LongTensor()), lens - 1]
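(Not from the thread, but for reference, here is how the suggested indexing behaves on plain LongTensors, with the shapes quoted above:)

```python
import torch

batch_size, pad_len, emb_len = 64, 20, 1024
data = torch.randn(batch_size, pad_len, emb_len)
lens = torch.LongTensor(batch_size).random_(1, pad_len + 1)

# Paired ("advanced") indexing: row i is matched with column lens[i] - 1,
# so each sequence contributes only its own last valid timestep.
rows = torch.arange(batch_size, out=torch.LongTensor())
last = data[rows, lens - 1]
print(last.size())  # torch.Size([64, 1024])
```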
I just wanted to report back that the code still gives a result of BATCH_SIZE x BATCH_SIZE x EMB_LEN (64 x 64 x 1024). Have you tried running the code?
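(A minimal sketch of an alternative that sidesteps advanced-indexing semantics entirely by using torch.gather; it assumes data is batch x pad_len x emb_len and lens holds the true lengths, as in the posts above:)

```python
import torch

batch_size, pad_len, emb_len = 64, 20, 1024
data = torch.randn(batch_size, pad_len, emb_len)
lens = torch.LongTensor(batch_size).random_(1, pad_len + 1)

# Build an index of shape (batch, 1, emb_len) that points every feature
# of sequence i at timestep lens[i] - 1, then gather along the time dim.
idx = (lens - 1).view(batch_size, 1, 1).expand(batch_size, 1, emb_len)
last = data.gather(1, idx).squeeze(1)
print(last.size())  # torch.Size([64, 1024])
```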