Apply Sequential to every RNN output in the Sequence

I have an LSTM network that returns an output for each input. So when I have an input tensor of shape [seq_length, batch_size, features], it will return [seq_length, batch_size, hidden_size].

Now I want to apply a fully connected network to each output. However, as far as I understand, nn.Sequential(nn.Linear(...)) only allows processing shapes like [batch_size, features].

However, my input would have the shape [seq_length, batch_size, hidden_size], and the output shape should then look like [seq_length, batch_size, fc_last_layer_dim]. Is there any way it can process three dimensions in that fashion?


I just explained how one could do this here:

You don’t want to use Sequential; it’s not the best module for this. You should write your own custom nn.Module to do it. Let me know if you need more explanation :+1:

Thank you, but I couldn’t really see how this applies to my situation.

Are you suggesting that I would require a custom nn.Module subclass which basically does the same as Sequential except has an additional for-loop in the forward to go through the third dimension?
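For reference, a module like the one described here might look like this. This is only a minimal sketch: the class name `TimeDistributedMLP` and the layer sizes are made up for illustration, and the loop applies a single nn.Linear to every timestep.

```python
import torch
import torch.nn as nn

class TimeDistributedMLP(nn.Module):
    """Applies the same fully connected layer to every timestep
    by looping over the sequence (first) dimension."""
    def __init__(self, hidden_size, out_dim):
        super().__init__()
        self.fc = nn.Linear(hidden_size, out_dim)

    def forward(self, x):
        # x: [seq_length, batch_size, hidden_size]
        outputs = [self.fc(x[t]) for t in range(x.size(0))]
        # stack along a new first dimension -> [seq_length, batch_size, out_dim]
        return torch.stack(outputs)

mlp = TimeDistributedMLP(hidden_size=20, out_dim=7)
y = mlp(torch.randn(5, 3, 20))
print(y.shape)  # torch.Size([5, 3, 7])
```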

import torch
import torch.nn as nn

rnn = nn.LSTM(10, 20, 2)
input = torch.randn(5, 3, 10)
h0 = torch.randn(2, 3, 20)
c0 = torch.randn(2, 3, 20)
output, (hn, cn) = rnn(input, (h0, c0))

input: (seq_len, batch, input_size)
output: (seq_len, batch, hidden_size * num_directions) — tensor containing the output features (h_t) from the last layer of the RNN, for each t.

I don’t know of any way to do this using the Sequential module.

And yes, I’m suggesting a loop over:
output, (hx, cx) = model(input, (hx, cx))

I’m walking my dog right now and using my phone for this, but I can provide an actual code example later when I’m at my computer, if you need one.

Well, the problem is not with the LSTM; that works just fine. But the fully connected layer applied to the LSTM output cannot process the additional dimension. So maybe I’m not understanding correctly what you are suggesting.

You will need something like this before the LSTM part in forward:

x = x.view(x.size(0), -1)

Sorry, I missed that part, I see.

@timbmg were you able to figure out how to do it? I’m in the exact same scenario.

Yes. I merged the sequence and batch dimensions:

x = x.view(seq_length*batch_size, hidden_dim)

Then process it through the MLP, which works now, since the first dimension acts as the batch dimension:

x = mlp(x)

Finally, you can reshape it back:

x = x.view(seq_length, batch_size, mlp_out_dim)
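Putting the three steps together, an end-to-end sketch could look like this. The layer sizes and the hidden MLP width are made-up values for illustration; any nn.Sequential stack would work in place of `mlp`.

```python
import torch
import torch.nn as nn

seq_length, batch_size = 5, 3
input_size, hidden_dim, mlp_out_dim = 10, 20, 7

lstm = nn.LSTM(input_size, hidden_dim)
mlp = nn.Sequential(
    nn.Linear(hidden_dim, 16),
    nn.ReLU(),
    nn.Linear(16, mlp_out_dim),
)

x = torch.randn(seq_length, batch_size, input_size)
out, _ = lstm(x)                                      # [seq_length, batch_size, hidden_dim]
out = out.view(seq_length * batch_size, hidden_dim)   # merge sequence and batch dims
out = mlp(out)                                        # first dim acts as batch
out = out.view(seq_length, batch_size, mlp_out_dim)   # reshape back
print(out.shape)  # torch.Size([5, 3, 7])
```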