Stack CNN on the output of LSTM for text classification

I’m trying to perform convolution on the output of an LSTM in PyTorch for text classification. While implementing this, I found I needed to clone the LSTM’s output before performing convolution on it, as shown below:

# lstm representation
output, (final_hidden_state, final_cell_state) = self.lstm(embedded, (h_0, c_0))
# clone the output from the lstm
conInput = output.clone()
# swap the batch and sequence dims, then add a channel dim at dim=1 for Conv2d
conInput = conInput.transpose_(0, 1).unsqueeze(1)
# perform convolution on the processed output of the lstm
conved_0 = F.relu(self.conv_0(conInput).squeeze(3))
conved_1 = F.relu(self.conv_1(conInput).squeeze(3))
conved_2 = F.relu(self.conv_2(conInput).squeeze(3))
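
For reference, the shape bookkeeping behind the transpose and unsqueeze looks roughly like this, assuming batch_first=False on the LSTM and that conv_0 is an nn.Conv2d whose kernel spans the full hidden dimension; all sizes below are hypothetical, just to illustrate:

import torch
import torch.nn as nn

# hypothetical sizes for illustration only
seq_len, batch, hidden_dim, n_filters, filter_size = 50, 32, 256, 100, 3

lstm_out = torch.randn(seq_len, batch, hidden_dim)   # LSTM output with batch_first=False
conv_in = lstm_out.transpose(0, 1).unsqueeze(1)      # [batch, 1, seq_len, hidden_dim]
conv_0 = nn.Conv2d(1, n_filters, (filter_size, hidden_dim))
conved_0 = torch.relu(conv_0(conv_in).squeeze(3))    # [batch, n_filters, seq_len - filter_size + 1]
print(conved_0.shape)                                # torch.Size([32, 100, 48])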

Am I doing the right thing? Is there a better approach?

  1. Does not using clone() produce an error? If so, I suggest removing the clone and changing the transpose_ to the non in-place version to see if that works (see the sketch after this list).
  2. I guess the transpose would no longer be needed if batch_first of the LSTM were set to True, since the output would then already be in [batch, seq_len, hidden] order; with the default batch_first=False it is still required. Depending on your case this might be wrong.
  3. If you don’t need the final hidden or cell states (the cell states are rarely useful in this kind of setup), there is no need to keep those variables around.
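
Putting points 1–3 together, the relevant part of forward() could look roughly like this (a sketch only, assuming batch_first is left at its default of False and the same conv_0/conv_1/conv_2 layers as in the question):

# lstm representation; the returned hidden and cell states are discarded (point 3)
output, _ = self.lstm(embedded, (h_0, c_0))
# non in-place transpose (point 1), so no clone() should be needed
conInput = output.transpose(0, 1).unsqueeze(1)
# perform convolution on the processed output of the lstm
conved_0 = F.relu(self.conv_0(conInput).squeeze(3))
conved_1 = F.relu(self.conv_1(conInput).squeeze(3))
conved_2 = F.relu(self.conv_2(conInput).squeeze(3))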

Not using clone() does produce an error, actually. Oh, OK, thanks! I will fix the points you raised and report back.