In my neural network model, I represent each word with a 256-dimensional embedding vector. For a sentence with 8 words, I get an
8x256 matrix. I want to give this matrix to an LSTM as input so that the LSTM processes it one token at a time, and then I am going to use its final hidden state.
According to the PyTorch documentation, the input should have the shape
(seq_len, batch, input_size). In my case,
seq_len will be 8,
batch will be 1, and
input_size will be 256. My question is: what is the correct way to convert my input to the desired shape? I don't want to end up with a matrix whose values are out of place. I am quite new to PyTorch and row-major layouts, so I wanted to ask here. I do it as follows; is this correct?
x = torch.rand(8, 256)
lstm_input = torch.reshape(x, (8, 1, 256))
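To make my concern concrete, here is a small check I wrote (with random data, so the specific values don't matter) to see whether each token's embedding vector survives the reshape unchanged:

```python
import torch

x = torch.rand(8, 256)  # 8 tokens, each a 256-dim embedding
lstm_input = torch.reshape(x, (8, 1, 256))  # add a batch dimension of 1

# Check that token t's embedding is unchanged after the reshape
for t in range(8):
    assert torch.equal(lstm_input[t, 0], x[t])

# I believe this is equivalent to inserting the batch dim explicitly:
lstm_input2 = x.unsqueeze(1)  # shape (8, 1, 256)
assert torch.equal(lstm_input, lstm_input2)
```

The assertions pass for me, which suggests the reshape only adds a size-1 axis without moving any values, but I'd like to confirm this is the right mental model.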
Is this the correct way, or should I do something different, such as taking the transpose first?
In addition to my specific question, I would be really grateful if someone could give me general rules I should be careful about when changing the shape of any matrix.
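To show why I am worried about values getting moved around, here is a tiny example where reshape and transpose give different results on the same matrix:

```python
import torch

x = torch.arange(6).reshape(2, 3)  # tensor([[0, 1, 2], [3, 4, 5]])

# reshape reads the values in row-major order and refills the new shape:
r = x.reshape(3, 2)  # tensor([[0, 1], [2, 3], [4, 5]])

# transpose swaps the axes, moving each element to a new position:
t = x.t()            # tensor([[0, 3], [1, 4], [2, 5]])

assert not torch.equal(r, t)
```

So reshape and transpose are clearly not interchangeable, and I want to be sure which one applies in my situation.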