I am trying to replicate a paper (Deep Learning for Household Load Forecasting—A Novel Pooling Deep RNN). In the paper, the authors use an LSTM with input of size
[batch_size, seq_len, input_size] to get the output of
[batch_size, 1], i.e. one value for each input sequence in the batch. I think the standard way to do this is to take the last hidden state and feed it to a Linear layer that maps it to a single value per batch example. However, the authors do not mention using a fully connected layer on top of the LSTM, or how exactly they obtained this output. Is it possible to get this kind of output from an LSTM without a fully connected layer on top of it?
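For reference, this is roughly what I mean by the standard approach, as a minimal sketch assuming PyTorch. The class name `LSTMRegressor` and all sizes are my own placeholders, not something taken from the paper:

```python
import torch
import torch.nn as nn


class LSTMRegressor(nn.Module):
    """Hypothetical model: LSTM plus a linear head on the last time step."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # batch_first=True makes the LSTM accept [batch_size, seq_len, input_size]
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        # Fully connected layer mapping hidden_size features to one value
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        output, _ = self.lstm(x)      # [batch_size, seq_len, hidden_size]
        last_step = output[:, -1, :]  # [batch_size, hidden_size]
        return self.head(last_step)   # [batch_size, 1]


# Arbitrary example sizes, chosen only for illustration
batch_size, seq_len, input_size, hidden_size = 4, 10, 3, 16
model = LSTMRegressor(input_size, hidden_size)
prediction = model(torch.randn(batch_size, seq_len, input_size))
print(prediction.shape)
```

This gives one value per sequence, but only because of the extra Linear layer, which is exactly the part the paper does not describe.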
The output of the LSTM (with batch_first=True) will be of shape
[batch_size, seq_len, hidden_size], so you could take the last time step along seq_len, but then you still have the hidden_size dimension, so it won’t give a
[batch_size, 1] output. Is there some other way, or did the authors just omit part of the solution?
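To make the shape problem concrete, here is a small sketch, again assuming PyTorch and arbitrary sizes of my own choosing. It also shows the only LSTM-only option I can think of, which is setting hidden_size to 1. I have no indication that this is what the authors did:

```python
import torch
import torch.nn as nn

# Arbitrary example sizes, chosen only for illustration
batch_size, seq_len, input_size, hidden_size = 4, 10, 3, 16
x = torch.randn(batch_size, seq_len, input_size)

# Plain LSTM: taking the last time step still leaves the hidden_size dimension
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
output, (h_n, c_n) = lstm(x)
last_step = output[:, -1, :]
print(output.shape)     # [batch_size, seq_len, hidden_size]
print(last_step.shape)  # [batch_size, hidden_size]

# For a single-layer, unidirectional LSTM the final hidden state
# is the same tensor as the last time step of the output
print(torch.allclose(h_n[-1], last_step))

# Assumption: an LSTM with hidden_size=1 yields [batch_size, 1] with no extra layer
lstm_one = nn.LSTM(input_size, hidden_size=1, batch_first=True)
output_one, _ = lstm_one(x)
single_value = output_one[:, -1, :]
print(single_value.shape)  # [batch_size, 1]
```

The catch with hidden_size=1 is that each LSTM output is an output gate value times a tanh, so it is confined to (-1, 1), and the targets would have to be scaled into that range. A single hidden unit also leaves the network very little capacity, which is why I suspect a fully connected layer was used but not mentioned.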