out, (h_n, c_n) = LSTM(input, (h_0, c_0))
c_n is a tensor containing the cell state for the last time step n. I would like to get the memory cell tensor c for all time steps. What should I do? LSTMCell?
Can an nn.LSTM() be composed of nn.LSTMCell()s?
No, not directly. LSTM wraps a lot of functionality around LSTMCell, such as multiple layers and bidirectionality. LSTM also allows you to give it a whole sequence.
What you could do to get all memory cells c_i for each time step of a sequence seq is to use LSTM (or LSTMCell, I suppose) and process your sequence in a loop, step by step, e.g.:
c_cells = []
for i in range(len(seq)):
    input = seq[i]
    out, hidden = lstm(input, hidden)
    c_cells.append(hidden[1])  # hidden is (h_n, c_n); keep the cell state
Please note that this is just crude pseudocode.
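For completeness, here is a runnable version of that pseudocode. The sizes are hypothetical, and batch_first is left at its default, so tensors are shaped (seq_len, batch, feature):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, feat, hid = 5, 2, 3, 4   # hypothetical sizes
lstm = nn.LSTM(feat, hid)                # default layout: (seq_len, batch, feat)

seq = torch.randn(seq_len, batch, feat)
hidden = (torch.zeros(1, batch, hid),    # h_0
          torch.zeros(1, batch, hid))    # c_0

c_cells = []
for i in range(seq_len):
    step = seq[i].unsqueeze(0)           # one time step, shape (1, batch, feat)
    out, hidden = lstm(step, hidden)
    c_cells.append(hidden[1])            # hidden is (h_n, c_n); keep the cell state

c_all = torch.cat(c_cells)               # (seq_len, batch, hid): cell state at every step
```

The last entry of c_all matches the c_n you would get from feeding the whole sequence to nn.LSTM at once.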
Thanks for your reply. I tried your approach. I can indeed get all the memory cell vectors by using a loop, but the results differ between using nn.LSTM directly and using a loop over the LSTM. I am confused.
the results of the hidden states h_n are different between using nn.LSTM directly and using a loop over the LSTM
I assume that you wanted to write that the results are “different”.
Looping over an RNN is quite common, e.g., in sequence-to-sequence models (see, e.g., the Seq2Seq PyTorch tutorial). Check the code of the decoder(s). Note that the forward method of the decoder does not contain a loop but only takes one word (i.e., one time step) at a time; the loop is outside, in the train method.
Both approaches are fine; you only need to take care that you accumulate the loss properly (best check the tutorial linked above). I also think you cannot use bidirectional=True when creating your nn.LSTM if you feed it one step at a time. Other than that, there shouldn’t really be a (major) difference between processing a sequence “manually” in a loop and letting nn.LSTM do everything under the hood.
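To illustrate accumulating the loss over a manual loop, here is a minimal sketch with hypothetical sizes and a made-up projection head (not the tutorial's exact code):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, feat, hid = 5, 1, 3, 4    # hypothetical sizes
lstm = nn.LSTM(feat, hid)
proj = nn.Linear(hid, feat)               # hypothetical head mapping back to feature space
criterion = nn.MSELoss()

seq = torch.randn(seq_len, batch, feat)
target = torch.randn(seq_len, batch, feat)
hidden = (torch.zeros(1, batch, hid), torch.zeros(1, batch, hid))

loss = 0.0
for t in range(seq_len):
    out, hidden = lstm(seq[t].unsqueeze(0), hidden)
    loss = loss + criterion(proj(out.squeeze(0)), target[t])  # accumulate per step

loss.backward()   # a single backward pass through the whole unrolled loop
```

Summing the per-step losses and calling backward() once backpropagates through all time steps, the same as training on the full sequence at once.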
I will show my code :
#####################------------1------------#####################
__init__():
    self.LSTM_audio = nn.LSTM(Audio_dim, Audio_dim, 1, batch_first=True, bidirectional=False)
    self.h0c0_audio = (Variable(torch.zeros(1, self.batch_size, self.Audio_dim).cuda()),
                       Variable(torch.zeros(1, self.batch_size, self.Audio_dim).cuda()))

forward():
    out_audio, (hn_a, cn_a) = self.LSTM_audio(audio, self.h0c0_audio)
#####################------------2------------#####################
__init__():
    self.LSTM = nn.LSTM(Mat_size, hidden_size, 1, batch_first=True, bidirectional=False)
    self.states = (Variable(torch.zeros(1, self.batch_size, self.hidden_size).cuda()),
                   Variable(torch.zeros(1, self.batch_size, self.hidden_size).cuda()))
    self.hiddens = Variable(torch.zeros(self.batch_size, self.seq_len, self.hidden_size).cuda())

forward():
    for time_step in range(self.seq_len):
        data_seq = x_t[:, time_step, :]
        data_seq = data_seq.unsqueeze(1)
        out, self.states = self.LSTM(data_seq, self.states)
        self.hiddens[:, time_step, :] = out.squeeze(1)
Question: I find that the values of out_audio in (1) and hiddens in (2) are different. I am very confused.
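For what it's worth, a mismatch is expected whenever (1) and (2) use two separately constructed nn.LSTM modules: each instance is initialized with its own random weights, so their outputs cannot agree. With one and the same module and the same initial states, the step-by-step loop reproduces the full-sequence output. A minimal sketch with hypothetical sizes (CPU tensors; Variable is deprecated, plain tensors suffice):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, seq_len, dim = 2, 5, 3        # hypothetical sizes
lstm = nn.LSTM(dim, dim, 1, batch_first=True, bidirectional=False)

x = torch.randn(batch_size, seq_len, dim)
h0c0 = (torch.zeros(1, batch_size, dim), torch.zeros(1, batch_size, dim))

with torch.no_grad():
    # (1) feed the whole sequence at once
    out_full, _ = lstm(x, h0c0)

    # (2) feed one time step at a time, with the SAME module and SAME initial states
    hiddens = torch.zeros(batch_size, seq_len, dim)
    states = h0c0
    for t in range(seq_len):
        out, states = lstm(x[:, t, :].unsqueeze(1), states)
        hiddens[:, t, :] = out.squeeze(1)

assert torch.allclose(out_full, hiddens, atol=1e-5)  # identical up to float tolerance
```

So the fix is to compare both paths on a single nn.LSTM instance (or load the same state_dict into both) rather than on two independently initialized modules.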