I am pretty new to PyTorch (and Python), so this issue confuses me a lot.
I have an LSTM that I use to process batches of sentences. The task itself is not important for my question. Basically, I take the final hidden state of the LSTM and multiply it with a 2-dimensional linear weight matrix. Here is a snippet to illustrate what I've described:
```python
>>> BatchSize = 2
>>> EmbeddingSize = 5
>>> VocabularySize = 3
>>> HiddenSize = 4
>>> instance1 = [0, 0, 1, 2, 1, 0]  # a sentence containing 6 words
>>> instance2 = [0, 1, 2, 1]        # another sentence containing 4 words
>>> batch = [instance1, instance2]
>>> lstm = nn.LSTM(EmbeddingSize, HiddenSize, num_layers=1, bidirectional=False)
>>> emb = nn.Embedding(VocabularySize, EmbeddingSize)
>>> embedded = [emb(torch.tensor(s)) for s in batch]
>>> packed = nn.utils.rnn.pack_sequence(embedded)
>>> output, (h, c) = lstm(packed)
>>> print(h.shape, " ", c.shape)
torch.Size([1, 2, 4])   torch.Size([1, 2, 4])
```
Now, I want to reshape `h` into a BatchSize x HiddenSize tensor. I can do it as follows:
```python
h = torch.reshape(h, (BatchSize, HiddenSize))
```
But I wonder whether such an operation causes values from different instances to be mixed?
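For what it's worth, here is a small sanity check I tried (the tensor values are made up just for illustration; I assume `h` has shape `(num_layers * num_directions, BatchSize, HiddenSize) = (1, 2, 4)` as in my snippet above). It compares the reshape against `h.squeeze(0)`, which just drops the leading size-1 dimension:

```python
import torch

BatchSize, HiddenSize = 2, 4

# Fake final hidden state with the same shape as in my snippet: (1, 2, 4).
# Using arange so each value is identifiable.
h = torch.arange(BatchSize * HiddenSize, dtype=torch.float32).reshape(1, BatchSize, HiddenSize)

reshaped = torch.reshape(h, (BatchSize, HiddenSize))
squeezed = h.squeeze(0)  # drop the leading num_layers * num_directions = 1 dimension

# If these agree, row i of the reshaped tensor is still instance i's hidden state.
print(torch.equal(reshaped, squeezed))  # True
print(reshaped[0])  # first instance's hidden state, values unchanged
```

This prints `True` for my toy tensor, which makes me think the reshape is safe when the leading dimension is 1, but I'd like to confirm that this holds in general.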
Please let me know if my question is not clear.