out, (h_n, c_n) = LSTM(input, (h_0, c_0))
c_n is a tensor containing the cell state for the last time step n. I would like to get the memory cell tensor c for all time steps. What should I do? LSTMCell?
Can an nn.LSTM() be composed of nn.LSTMCell()s?
No, not directly. LSTM wraps a lot of functionality around LSTMCell, such as multiple layers and bidirectionality. LSTM also allows you to give it a whole sequence.
What you could do to get all memory cells c_i for each time step of a sequence seq is to use LSTM (or LSTMCell, I suppose) and process your sequence in a loop, step by step, e.g.:
c_cells = []
for i in range(len(seq)):
    input = seq[i]
    out, hidden = lstm(input, hidden)
    c_cells.append(hidden[1])  # hidden is (h_n, c_n); keep the cell state
Please note that this is just crude pseudocode.
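For completeness, here is a runnable version of that pseudocode. The sizes are hypothetical, and batch_first is left at its default, so tensors are shaped (seq_len, batch, feature):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, feat, hid = 5, 2, 3, 4   # hypothetical sizes
lstm = nn.LSTM(feat, hid)                # default layout: (seq_len, batch, feat)

seq = torch.randn(seq_len, batch, feat)
hidden = (torch.zeros(1, batch, hid),    # h_0
          torch.zeros(1, batch, hid))    # c_0

c_cells = []
for i in range(seq_len):
    step = seq[i].unsqueeze(0)           # one time step, shape (1, batch, feat)
    out, hidden = lstm(step, hidden)
    c_cells.append(hidden[1])            # hidden is (h_n, c_n); keep the cell state

c_all = torch.cat(c_cells)               # (seq_len, batch, hid): cell state at every step
```

The last entry of c_all matches the c_n you would get from feeding the whole sequence to nn.LSTM at once.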
Thanks for your reply. I tried your approach. I can indeed get all the memory cell vectors by using a loop, but the results differ between using nn.LSTM directly and using a loop over the LSTM. I am confused.
the results of the hidden states h_n are different between using nn.LSTM directly and using a loop over the LSTM
I assume that you wanted to write that the results are “different”.
Looping over an RNN is quite common, e.g., in sequence-to-sequence models (see, e.g., the Seq2Seq PyTorch tutorial). Check the code of the decoder(s). Note that the forward method of the decoder does not contain a loop but only takes one word (i.e., one time step) at a time; the loop is outside, in the train method.
Both approaches are fine; you only need to take care that you accumulate the loss properly (best check the tutorial linked above). I also think you cannot use bidirectional=True when creating your nn.LSTM if you feed it one step at a time. Other than that, there shouldn’t really be a (major) difference between processing a sequence “manually” in a loop and letting nn.LSTM do everything under the hood.
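To illustrate accumulating the loss over a manual loop, here is a minimal sketch with hypothetical sizes and a made-up projection head (not the tutorial's exact code):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch, feat, hid = 5, 1, 3, 4    # hypothetical sizes
lstm = nn.LSTM(feat, hid)
proj = nn.Linear(hid, feat)               # hypothetical head mapping back to feature space
criterion = nn.MSELoss()

seq = torch.randn(seq_len, batch, feat)
target = torch.randn(seq_len, batch, feat)
hidden = (torch.zeros(1, batch, hid), torch.zeros(1, batch, hid))

loss = 0.0
for t in range(seq_len):
    out, hidden = lstm(seq[t].unsqueeze(0), hidden)
    loss = loss + criterion(proj(out.squeeze(0)), target[t])  # accumulate per step

loss.backward()   # a single backward pass through the whole unrolled loop
```

Summing the per-step losses and calling backward() once backpropagates through all time steps, the same as training on the full sequence at once.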
I will show my code :
#####################------------1------------#####################
__init__():
    self.LSTM_audio = nn.LSTM(Audio_dim, Audio_dim, 1, batch_first=True, bidirectional=False)
    self.h0c0_audio = (Variable(torch.zeros(1, self.batch_size, self.Audio_dim).cuda()),
                       Variable(torch.zeros(1, self.batch_size, self.Audio_dim).cuda()))

forward():
    out_audio, (hn_a, cn_a) = self.LSTM_audio(audio, self.h0c0_audio)
#####################------------2------------#####################
__init__():
    self.LSTM = nn.LSTM(Mat_size, hidden_size, 1, batch_first=True, bidirectional=False)
    self.states = (Variable(torch.zeros(1, self.batch_size, self.hidden_size).cuda()),
                   Variable(torch.zeros(1, self.batch_size, self.hidden_size).cuda()))
    self.hiddens = Variable(torch.zeros(self.batch_size, self.seq_len, self.hidden_size).cuda())

forward():
    for time_step in range(self.seq_len):
        data_seq = x_t[:, time_step, :]
        data_seq = data_seq.unsqueeze(1)
        out, self.states = self.LSTM(data_seq, self.states)
        self.hiddens[:, time_step, :] = out.squeeze(1)
Question: I find that the values of out_audio in (1) and hiddens in (2) are different. I am very confused.
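For what it's worth, a mismatch is expected whenever (1) and (2) use two separately constructed nn.LSTM modules: each instance is initialized with its own random weights, so their outputs cannot agree. With one and the same module and the same initial states, the step-by-step loop reproduces the full-sequence output. A minimal sketch with hypothetical sizes (CPU tensors; Variable is deprecated, plain tensors suffice):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, seq_len, dim = 2, 5, 3        # hypothetical sizes
lstm = nn.LSTM(dim, dim, 1, batch_first=True, bidirectional=False)

x = torch.randn(batch_size, seq_len, dim)
h0c0 = (torch.zeros(1, batch_size, dim), torch.zeros(1, batch_size, dim))

with torch.no_grad():
    # (1) feed the whole sequence at once
    out_full, _ = lstm(x, h0c0)

    # (2) feed one time step at a time, with the SAME module and SAME initial states
    hiddens = torch.zeros(batch_size, seq_len, dim)
    states = h0c0
    for t in range(seq_len):
        out, states = lstm(x[:, t, :].unsqueeze(1), states)
        hiddens[:, t, :] = out.squeeze(1)

assert torch.allclose(out_full, hiddens, atol=1e-5)  # identical up to float tolerance
```

So the fix is to compare both paths on a single nn.LSTM instance (or load the same state_dict into both) rather than on two independently initialized modules.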