PyTorch's embedding and LSTM layers (I don't know about other DNN libraries) cannot handle variable-length sequences by default. I have seen various hacks for handling variable lengths, but my question is: why is this the case? Sequences are almost never the same length, and an RNN/LSTM should simply loop until the end of each sequence, so why is it sensitive to the varying lengths of the sequences in a minibatch? PyTorch's Embedding is a lookup table; I don't see any reason for it to be sensitive to variable length either. Shouldn't the ideal case be that I can pass a minibatch of sentences with a variable number of words, like the following?
import torch
import torch.nn as nn

word_embedding = nn.Embedding(17, 5)
# Each inner list is a sentence. In practice this line raises an
# error, because ragged lists cannot form a rectangular tensor.
word_embeds = word_embedding(torch.tensor([[1, 2, 3, 4, 5], [4, 5, 6, 7]]))
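For context, the kind of hack I keep running into pads every sentence to the longest length and then packs the batch so the LSTM skips the padding. A minimal sketch of that workaround (the hidden size of 8 and padding index 0 are my own arbitrary choices, not anything prescribed):

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Variable-length sentences as separate 1-D tensors of word indices.
sentences = [torch.tensor([1, 2, 3, 4, 5]), torch.tensor([4, 5, 6, 7])]
lengths = torch.tensor([len(s) for s in sentences])

# Pad with index 0 so the batch becomes a rectangular (2, 5) tensor.
padded = pad_sequence(sentences, batch_first=True, padding_value=0)

word_embedding = nn.Embedding(17, 5, padding_idx=0)  # pad vector stays zero
embeds = word_embedding(padded)                      # shape (2, 5, 5)

# Pack the padded batch so the LSTM stops at each true sentence end.
packed = pack_padded_sequence(embeds, lengths, batch_first=True,
                              enforce_sorted=False)
lstm = nn.LSTM(input_size=5, hidden_size=8, batch_first=True)
output, (h_n, c_n) = lstm(packed)

This works, but it is exactly the kind of boilerplate that prompted my question: conceptually, the embedding lookup and the recurrent loop don't seem to need rectangular input at all.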