Maybe it is a stupid question, but I still don't understand why the output of LSTMCell only consists of hx and cx. Should I add another feed-forward layer to compute o(t) based on hx(t-1) and x(t)?
Thank you.
hx is the output. You can try that with nn.LSTM and compare the returned hx to the last time step of the output.
Best regards
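
A minimal sketch of that comparison, assuming a single-layer, unidirectional nn.LSTM (the sizes and the random input are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Single layer, unidirectional; sizes are arbitrary for this illustration.
lstm = nn.LSTM(input_size=4, hidden_size=3)
x = torch.randn(5, 2, 4)  # (seq_len, batch, input_size)

output, (hn, cn) = lstm(x)

# output stacks h(t) for every time step; hn holds h only for the final step.
last_matches = torch.allclose(output[-1], hn[0])
print(last_matches)
```

With more layers or a bidirectional LSTM, hn has one entry per layer and direction, so the indexing would need to change accordingly.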
Thomas
But the computations of o(t), c(t) and h(t) are different.
Is it OK to regard h(t) directly as o(t)?
Well, h already has o applied. Keep in mind that o is the output gate, not the output.
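
To make the relation concrete, here is a rough sketch that recomputes one nn.LSTMCell step by hand and checks that h(t) = o(t) * tanh(c(t)). It relies on the gate layout documented for PyTorch's LSTM weights (input, forget, cell, output); the sizes and zero initial states are just assumptions for the example:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

cell = nn.LSTMCell(input_size=4, hidden_size=3)
x = torch.randn(2, 4)        # (batch, input_size)
h_prev = torch.zeros(2, 3)   # hx(t-1)
c_prev = torch.zeros(2, 3)   # cx(t-1)

with torch.no_grad():
    h_ref, c_ref = cell(x, (h_prev, c_prev))

    # All four gate pre-activations are computed in one matrix product.
    gates = (x @ cell.weight_ih.T + cell.bias_ih
             + h_prev @ cell.weight_hh.T + cell.bias_hh)
    i, f, g, o = gates.chunk(4, dim=1)  # order: input, forget, cell, output
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)

    c = f * c_prev + i * g     # new cell state c(t)
    h = o * torch.tanh(c)      # output gate o(t) applied to give h(t)

h_matches = torch.allclose(h, h_ref, atol=1e-6)
c_matches = torch.allclose(c, c_ref, atol=1e-6)
print(h_matches, c_matches)
```

So o(t) is already computed inside the cell from x(t) and hx(t-1); no extra feed-forward layer is needed to obtain it, and hx is the value to use as the cell's output.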