[solved] Concatenate time distributed CNN with LSTM

I assume that your input data is of shape (batch_size, timesteps, C, H, W)

Instead of TimeDistributed, you can use .view() to fold the batch and time dimensions together before running the convolutions, then use .view() again to split them apart (and flatten the per-frame features) before running the LSTM. Something like this…

import torch.nn as nn
import torch.nn.functional as F

class Combine(nn.Module):
    def __init__(self):
        super(Combine, self).__init__()
        self.cnn = CNN()  # your existing per-frame feature extractor
        self.rnn = nn.LSTM(320, 10, 2, batch_first=True)

    def forward(self, x):
        batch_size, timesteps, C, H, W = x.size()
        # fold time into the batch dimension so the CNN sees (N*T, C, H, W)
        c_in = x.view(batch_size * timesteps, C, H, W)
        c_out = self.cnn(c_in)
        # unfold back to (N, T, features) for the LSTM
        r_in = c_out.view(batch_size, timesteps, -1)
        # nn.LSTM returns (output, (h_n, c_n)); keep only the output sequence
        r_out, _ = self.rnn(r_in)
        # classify from the last timestep
        return F.log_softmax(r_out[:, -1, :], dim=1)

This will require the LSTM to be initialised with the batch_first=True option.
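Here is a self-contained sketch you can run to check the shapes end to end. The `SmallCNN` below is a hypothetical stand-in for your `CNN()` (two conv/pool layers that happen to produce 320 features from a 28×28 single-channel frame); the input sizes are just example values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Toy per-frame feature extractor (stand-in for your CNN)."""
    def __init__(self):
        super(SmallCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # (N, 10, 12, 12)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # (N, 20, 4, 4)
        return x.view(x.size(0), -1)                 # (N, 320)

class Combine(nn.Module):
    def __init__(self):
        super(Combine, self).__init__()
        self.cnn = SmallCNN()
        self.rnn = nn.LSTM(320, 10, 2, batch_first=True)

    def forward(self, x):
        batch_size, timesteps, C, H, W = x.size()
        c_in = x.view(batch_size * timesteps, C, H, W)  # fold time into batch
        c_out = self.cnn(c_in)
        r_in = c_out.view(batch_size, timesteps, -1)    # (N, T, 320)
        r_out, _ = self.rnn(r_in)                       # (N, T, 10)
        return F.log_softmax(r_out[:, -1, :], dim=1)    # last timestep

model = Combine()
x = torch.randn(4, 6, 1, 28, 28)  # (batch, time, C, H, W)
out = model(x)
print(out.shape)  # torch.Size([4, 10])
```

Each row of `out` is a log-probability distribution over the 10 classes, so `out.exp()` sums to 1 along dim 1.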
