Apply Sequential to every RNN output in the Sequence

I have an LSTM network that returns an output for each input. So when I have an input tensor of shape [seq_length, batch_size, features], it will return [seq_length, batch_size, hidden_size].

Now I want to apply a fully connected network to each output. However, as far as I understand, nn.Sequential(nn.Linear(...)) only allows processing shapes like [batch_size, features].

However, my input would have the shape [seq_length, batch_size, hidden_size], and the output shape should then look like [seq_length, batch_size, fc_last_layer_dim]. Is there any way it can process three dimensions in that fashion?


I just explained how one could do this here:

You don’t want to use Sequential; it’s not the best module for this. You should write your own custom nn.Module to do it. Let me know if you need more explanation :+1:

Thank you, but I couldn’t really see how this applies to my situation.

Are you suggesting that I would require a custom nn.Module subclass which basically does the same as Sequential except has an additional for-loop in the forward to go through the third dimension?
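For reference, a module like the one described here might look like this. This is only a minimal sketch: the class name `TimeDistributedMLP` and the layer sizes are made up for illustration, and the loop applies a single nn.Linear to every timestep.

```python
import torch
import torch.nn as nn

class TimeDistributedMLP(nn.Module):
    """Applies the same fully connected layer to every timestep
    by looping over the sequence (first) dimension."""
    def __init__(self, hidden_size, out_dim):
        super().__init__()
        self.fc = nn.Linear(hidden_size, out_dim)

    def forward(self, x):
        # x: [seq_length, batch_size, hidden_size]
        outputs = [self.fc(x[t]) for t in range(x.size(0))]
        # stack along a new first dimension -> [seq_length, batch_size, out_dim]
        return torch.stack(outputs)

mlp = TimeDistributedMLP(hidden_size=20, out_dim=7)
y = mlp(torch.randn(5, 3, 20))
print(y.shape)  # torch.Size([5, 3, 7])
```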

import torch
import torch.nn as nn

rnn = nn.LSTM(10, 20, 2)
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(input, (h0, c0))

input: (seq_len, batch, input_size)
output: (seq_len, batch, hidden_size * num_directions) — tensor containing the output features (h_t) from the last layer of the RNN, for each t.

I don’t know of any way to do this using the Sequential module.

And yes, I’m suggesting a loop over:
output, (hx, cx) = model(input, (hx, cx))

I’m walking my dog right now and using my phone for this, but I can provide an actual code example later when I’m at my computer, if you need one.

Well, the problem is not with the LSTM; that works just fine. But the fully connected layer applied to the LSTM output cannot process the additional dimension. So maybe I’m not understanding correctly what you are suggesting.

You will need something like this before the LSTM part in forward:

x = x.view(x.size(0), -1)

Sorry, I missed that part, I see.

@timbmg were you able to figure out how to do it? I’m in the exact same scenario.

Yes. I merged the sequence and batch dimensions:

x = x.view(seq_length*batch_size, hidden_dim)

Then process it through the MLP, which works now, since the first dimension acts as the batch dimension:

x = mlp(x)

Finally, you can reshape it back:

x = x.view(seq_length, batch_size, mlp_out_dim)
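Putting the three steps together, an end-to-end sketch could look like this. The layer sizes and the hidden MLP width are made-up values for illustration; any nn.Sequential stack would work in place of `mlp`.

```python
import torch
import torch.nn as nn

seq_length, batch_size = 5, 3
input_size, hidden_dim, mlp_out_dim = 10, 20, 7

lstm = nn.LSTM(input_size, hidden_dim)
mlp = nn.Sequential(
    nn.Linear(hidden_dim, 16),
    nn.ReLU(),
    nn.Linear(16, mlp_out_dim),
)

x = torch.randn(seq_length, batch_size, input_size)
out, _ = lstm(x)                                      # [seq_length, batch_size, hidden_dim]
out = out.view(seq_length * batch_size, hidden_dim)   # merge sequence and batch dims
out = mlp(out)                                        # first dim acts as batch
out = out.view(seq_length, batch_size, mlp_out_dim)   # reshape back
print(out.shape)  # torch.Size([5, 3, 7])
```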