CNN-LSTM problem

Hi,
I have implemented a hybrid CNN-LSTM model in both Keras and PyTorch. The network consists of 4 convolutional layers with 64 output channels and a kernel size of 5, followed by 2 LSTM layers with 128 hidden units, and then a Dense layer with 6 outputs for classification.
In fact, I have just implemented the DeepConvLSTM proposed here: https://www.researchgate.net/publication/291172413_Deep_Convolutional_and_LSTM_Recurrent_Neural_Networks_for_Multimodal_Wearable_Activity_Recognition.
My problem is with the PyTorch version: I'm getting around 18~19% accuracy, while Keras gives 86~87%. I don't understand why, since I'm using the same parameters for both networks and the same optimizer, RMSprop.
I also tried using GRU instead of LSTM, but I get the same problem. It seems like there is an issue with the hybrid architecture itself, but I cannot figure it out.
Here are my scripts.

Keras version:

def ConvLSTM_Keras(input_shape):
    from keras.models import Sequential
    from keras.layers import Dense,Conv1D,LSTM
    model = Sequential()
    model.add(Conv1D(64, 5,
                     activation='relu',
                     input_shape=input_shape))
    model.add(Conv1D(64, 5, activation='relu'))
    model.add(Conv1D(64, 5, activation='relu'))
    model.add(Conv1D(64, 5, activation='relu'))
    model.add(LSTM(128,return_sequences=True))
    model.add(LSTM(128,return_sequences=False))
    model.add(Dense(6, activation='softmax'))
    return model

import keras

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.RMSprop(learning_rate=0.001),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          epochs=20,
          batch_size=100,
          verbose=1,
          validation_data=(x_val, y_val))

PyTorch version:

import torch
import torch.nn.functional as F

class SimpleCNN(torch.nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = torch.nn.Conv1d(in_channels=1, out_channels=64, kernel_size=5)
        self.conv2 = torch.nn.Conv1d(in_channels=64, out_channels=64, kernel_size=5)
        self.conv3 = torch.nn.Conv1d(in_channels=64, out_channels=64, kernel_size=5)
        self.conv4 = torch.nn.Conv1d(in_channels=64, out_channels=64, kernel_size=5)
        self.lstm1 = torch.nn.LSTM(
            input_size=545,   # sequence length after the four convolutions
            hidden_size=128,
            num_layers=2,
        )
        self.fc2 = torch.nn.Linear(128, 6)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x,_ = self.lstm1(x)
        x = x[:, -1, :]
        x = self.fc2(x)
        return (x)

import torch.optim as optim

def createLossAndOptimizer(net, learning_rate=0.001):
    # Loss function
    loss = torch.nn.CrossEntropyLoss()

    # Optimizer
    optimizer = optim.RMSprop(net.parameters(), lr=learning_rate)
    return loss, optimizer
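
For completeness, here is a minimal usage sketch of this helper (the data names and shapes are placeholders, not from my real dataset). Note that torch.nn.CrossEntropyLoss expects raw logits and integer class labels, so unlike the Keras model there is no softmax on the PyTorch output.

net = SimpleCNN()
loss_fn, optimizer = createLossAndOptimizer(net, learning_rate=0.001)

# One hypothetical training step: inputs of shape [batch, 1, length],
# labels of shape [batch] containing class indices 0..5.
inputs = torch.randn(100, 1, 561)
labels = torch.randint(0, 6, (100,))

optimizer.zero_grad()
outputs = net(inputs)            # raw logits, shape [100, 6]
loss = loss_fn(outputs, labels)
loss.backward()
optimizer.step()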

Hope you can help me, thanks.
Nassim


You could try to get matching results using just a few layers (or blocks) first and then scale the model up.
E.g. I couldn't find much about how the stacked LSTM layers work, i.e. how their hidden and cell state tensors are defined, so this would be my starting point. :wink:
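
For example, something like this rough sketch (with a made-up window length) would let you compare the output of the first conv block between both frameworks before stacking the rest on top:

import torch
import torch.nn.functional as F

# Hypothetical input: batch of 100 windows, 1 channel, 561 time steps
# (561 would shrink to 545 after four kernel-5 convolutions, matching the
# input_size used in the LSTM above).
x = torch.randn(100, 1, 561)

conv1 = torch.nn.Conv1d(in_channels=1, out_channels=64, kernel_size=5)
out = F.relu(conv1(x))
print(out.shape)  # torch.Size([100, 64, 557]) -> [batch, channels, length]

# Note that the corresponding Keras Conv1D returns [batch, length, channels],
# so the layouts already differ at this point.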


Thanks for your answer, I'll try it.

Are you trying to use only the last hidden state of the LSTM?

Yes, I'm doing that with x = x[:, -1, :].

Alright, I believe here's the problem. In your model, if I assume the input to the CNN has shape [B, 1, L], then the CNN outputs a tensor of shape [B, C, L], where B is the batch size, L is the sequence length, and C is the number of channels. You then feed it into the LSTM hoping it will learn the temporal information along dimension 2 (L). But what actually happens is that, according to the LSTM documentation, the LSTM by default assumes an input of shape [L, B, INPUT_SIZE], so in this case the LSTM is trying to learn temporal dependencies along dimension 0 (B), whose entries are independent of each other. That's why the network is acting as a random number generator.
To solve this, you would have to transpose your tensor to match the expected input shape of the LSTM, and then take the last time step according to the output shape of the LSTM.
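
Something along these lines (just a sketch of the idea, not your exact model; the sizes are assumed): either permute the conv output to [L, B, C] before the default LSTM, or construct the LSTM with batch_first=True and feed it [B, L, C], then index the last time step on the time dimension:

import torch
import torch.nn.functional as F

class ConvLSTMFixed(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv1d(in_channels=1, out_channels=64, kernel_size=5)
        self.conv2 = torch.nn.Conv1d(in_channels=64, out_channels=64, kernel_size=5)
        self.conv3 = torch.nn.Conv1d(in_channels=64, out_channels=64, kernel_size=5)
        self.conv4 = torch.nn.Conv1d(in_channels=64, out_channels=64, kernel_size=5)
        # input_size is now the channel count (64), not the sequence length,
        # and batch_first=True makes the LSTM expect [B, L, C].
        self.lstm1 = torch.nn.LSTM(input_size=64, hidden_size=128,
                                   num_layers=2, batch_first=True)
        self.fc2 = torch.nn.Linear(128, 6)

    def forward(self, x):              # x: [B, 1, L]
        x = F.relu(self.conv1(x))      # [B, 64, L-4]
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))      # [B, 64, L']
        x = x.permute(0, 2, 1)         # -> [B, L', 64] so the LSTM steps over time, not over the batch
        x, _ = self.lstm1(x)           # [B, L', 128]
        x = x[:, -1, :]                # now really the last time step
        return self.fc2(x)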


Thanks for your help, I'll try it. Thanks a lot!

Sorry for taking so long to get back to you, I was busy with another task. I tried your suggestion and now it works perfectly; I'm getting matching results between Keras and PyTorch, thanks!
I have another question, if you can answer it: I noticed that PyTorch's implementations of RNNs (LSTM and GRU) are faster than Keras's and achieve better results. Do you know why?

I'm not familiar with Keras, so that's out of my reach. :frowning:


@G.M Hi,
Can you share the corrected PyTorch code?
Thanks.

Really helpful, thank you.

Hi Nassim,
I am facing the same problem with transposing the tensor. Can you please share your working code after transposing the tensor?
Thank you