Stack CNN on the output of LSTM for text classification

I’m trying to perform convolution on the output of an LSTM in PyTorch for text classification. While implementing it, I discovered that I needed to clone the output of the LSTM before performing convolution on it, as shown below:

# LSTM representation
output, (final_hidden_state, final_cell_state) = self.lstm(embedded, (h_0, c_0))
# clone the output of the LSTM
conInput = output.clone()
# swap dims 0 and 1 in place, then unsqueeze at dim = 1 to add a channel dimension
conInput = conInput.transpose_(0, 1).unsqueeze(1)
# perform convolution on the processed output of the LSTM
conved_0 = F.relu(self.conv_0(conInput).squeeze(3))
conved_1 = F.relu(self.conv_1(conInput).squeeze(3))
conved_2 = F.relu(self.conv_2(conInput).squeeze(3))

Am I doing the right thing? Is there a better approach?

  1. Does omitting clone() produce an error? If so, I suggest removing the clone, changing transpose_ to the non-in-place transpose, and seeing if that works.
  2. I guess that the transpose would not be needed at all if batch_first of the LSTM is set to True, since the output is then already shaped (batch, seq_len, hidden_size). Depending on your case this might be wrong.
  3. If you don’t need the hidden or cell states (the cell states are unlikely to be useful in most cases), there is no need to keep those variables around.
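
Putting the three points together, here is a minimal sketch of what that could look like. It is only an illustration, not your actual model: the class name LSTMThenCNN, all the sizes, the filter sizes (2, 3, 4), and the max-pooling plus linear layer at the end are assumptions of mine. I am also assuming the conv layers are nn.Conv2d with a kernel spanning the full hidden dimension, which is what the squeeze(3) in your code suggests.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LSTMThenCNN(nn.Module):
    """Hypothetical classifier: embedding -> LSTM -> parallel convolutions -> linear."""

    def __init__(self, vocab_size=100, embed_dim=16, hidden_dim=32,
                 n_filters=8, filter_sizes=(2, 3, 4), n_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # batch_first=True makes the LSTM output (batch, seq_len, hidden_dim),
        # so no transpose is needed before the convolution.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Each kernel covers `fs` time steps and the whole hidden dimension.
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, n_filters, kernel_size=(fs, hidden_dim))
             for fs in filter_sizes]
        )
        self.fc = nn.Linear(n_filters * len(filter_sizes), n_classes)

    def forward(self, tokens):
        embedded = self.embedding(tokens)   # (batch, seq_len, embed_dim)
        # Initial states default to zeros; final states are discarded.
        output, _ = self.lstm(embedded)     # (batch, seq_len, hidden_dim)
        # unsqueeze returns a new view, so nothing is modified in place
        # and no clone() is required.
        con_input = output.unsqueeze(1)     # (batch, 1, seq_len, hidden_dim)
        # Each conv output is (batch, n_filters, seq_len - fs + 1, 1).
        conved = [F.relu(conv(con_input)).squeeze(3) for conv in self.convs]
        # Max over time (an assumption; the original post stops at the convs).
        pooled = [F.max_pool1d(c, c.size(2)).squeeze(2) for c in conved]
        return self.fc(torch.cat(pooled, dim=1))


model = LSTMThenCNN()
tokens = torch.randint(0, 100, (4, 10))   # batch of 4 sequences, length 10
logits = model(tokens)
print(logits.shape)
```

Note that the sequence length has to be at least as large as the biggest filter size, otherwise the convolution has nothing to slide over.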

Not using clone() does produce an error, actually. Oh, OK, thanks! I will fix the points you raised and report back.