Squeezing the LSTM output that is fed into a linear layer


I am building a custom model where the output of an LSTM feeds into a linear layer. The last hidden state h_T has shape (1, batch_size, n_features), while the linear layer expects an input of shape (batch_size, n_features). Is it correct to squeeze h_T? I am concerned that the error may not be backpropagated from the linear layer to the LSTM if I squeeze h_T.

    def forward(self, x):
        # h_T has shape (num_layers * num_directions, batch, hidden);
        # here that is (1, batch_size, n_features)
        lstm_out, (h_T, c_T) = self.lstm(x)
        # drop the leading dimension so fc receives (batch_size, n_features)
        fc_out = self.fc(h_T.squeeze(0))
        return fc_out
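As a quick self-check of the backprop concern, one can run a tiny standalone version of this setup (the sizes below are made up for illustration) and confirm that gradients do reach the LSTM parameters after the squeeze:

```python
import torch
import torch.nn as nn

# Minimal sketch with illustrative sizes: 8 input features,
# 16 hidden units, 4 output classes, batch of 5, sequences of length 10.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
fc = nn.Linear(16, 4)

x = torch.randn(5, 10, 8)        # (batch, seq_len, input_size)
_, (h_T, _) = lstm(x)            # h_T: (1, batch, hidden)
out = fc(h_T.squeeze(0))         # squeeze -> (batch, hidden) -> (batch, 4)
out.sum().backward()

# squeeze is a differentiable view operation, so every LSTM
# parameter should have received a gradient.
print(all(p.grad is not None for p in lstm.parameters()))
```

If this prints `True`, the gradient flows through `squeeze(0)` back into the LSTM just as it would through any other tensor op.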