From Feedforward Network into an RNN

JaeGer · June 29, 2018, 11:16am

Hello,
I have been building different recurrent models (RNN, LSTM, GRU), which gave me a good OA of 86%. I read in a deep learning forum that it's a good idea to feed the initial data into a standard feedforward network first and then pass its output directly into the recurrent model. For my recurrent model I use data of shape [batch_size, seq_dim, features]. Now I have to reshape this data so that it can serve as input to the FFN, which as far as I know expects [batch_size, features]. My question is: how do I reshape my initial data so that it passes through the FFN and comes out in the shape my RNN expects?
I hope you understood my issue.
And thank you.

tom · June 29, 2018, 2:15pm

Typically, the feedforward part treats time-steps either as unrelated or as a dimension.
Thus you would do something like

seq_len, batch_size, feature_size = input.shape
preprocessed_input = feed_forward_new(input.view(seq_len * batch_size, -1)).view(seq_len, batch_size, -1)

(adapt if your input or output is multidimensional).
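As a runnable illustration of the flattening approach, here is a minimal sketch. It assumes a time-major input of shape [seq_len, batch_size, features]. The layer sizes and the names ffn and rnn are made up for the example and are not from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not taken from the thread)
seq_len, batch_size, feature_size = 25, 100, 8
ffn_out_size = 16

# Hypothetical feedforward preprocessor; any module mapping
# [N, feature_size] -> [N, ffn_out_size] would work here.
ffn = nn.Sequential(
    nn.Linear(feature_size, 32),
    nn.ReLU(),
    nn.Linear(32, ffn_out_size),
)

# nn.LSTM expects time-major input [seq_len, batch, features] by default.
rnn = nn.LSTM(input_size=ffn_out_size, hidden_size=20)

x = torch.randn(seq_len, batch_size, feature_size)

# Merge time and batch so every time step is treated as an independent sample.
flat = x.reshape(seq_len * batch_size, feature_size)

# Apply the FFN, then restore the separate time and batch dimensions.
preprocessed = ffn(flat).reshape(seq_len, batch_size, ffn_out_size)

output, (h_n, c_n) = rnn(preprocessed)
```

If your data is batch-first ([batch_size, seq_len, features]), the same reshape works with the first two sizes swapped, provided the RNN is created with batch_first=True.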
An alternative can be to include convolutions over timesteps. Then you need to .permute the batch to the front:

input_reordered = input.permute(1, 0, 2).unsqueeze(1) # batch x channel (=1) x H (=time) x W (=features)
preprocessed_input_raw = my_conv_2d_net(input_reordered) # batch x channels x time x features
batch, c, time_out, f = preprocessed_input_raw.shape # c and f will give one large feature vector
preprocessed_input = preprocessed_input_raw.permute(2, 0, 1, 3).reshape(time_out, batch, c * f) # time x batch x (c * f)

Now you can pass the preprocessed input to an RNN.
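The convolution variant can be sketched end to end as follows. This is only an illustration: the conv_net below is a hypothetical single-layer network, and the kernel size, padding and channel count are assumptions chosen so that the time length is preserved.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not taken from the thread)
seq_len, batch_size, feature_size = 25, 100, 8

# Hypothetical conv net: a 3x3 kernel looks at 3 neighbouring time steps
# and 3 neighbouring features; padding=1 keeps both lengths unchanged.
conv_net = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, padding=1),
    nn.ReLU(),
)

x = torch.randn(seq_len, batch_size, feature_size)   # time-major input

# Move batch to the front and add a channel dimension:
# [batch, 1, time, features]
x_reordered = x.permute(1, 0, 2).unsqueeze(1)

conv_out = conv_net(x_reordered)                      # [batch, 4, time, features]
batch, c, time_out, f = conv_out.shape

# Back to time-major, folding channels and features into one feature vector.
# reshape (not view) is used because permute returns a non-contiguous tensor.
rnn_input = conv_out.permute(2, 0, 1, 3).reshape(time_out, batch, c * f)

rnn = nn.GRU(input_size=c * f, hidden_size=20)
output, h_n = rnn(rnn_input)
```

Without padding, or with strides, time_out will be smaller than seq_len, which is why the shape is read back from the conv output instead of being reused from the input.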

Best regards

Thomas

JaeGer · June 29, 2018, 5:24pm

Hello @tom,
Thank you for your quick reply. I thought about "spreading" my data (seq_len * batch_size), but that would complicate things even more, making it [25*100]. I was thinking of something like this instead: a feedforward net for each time step. My initial input would be split into seq_dim tensors of shape [batch_size, features], each of these tensors would pass through the same FFN (call it model_in), and the outputs of the FFN would then be combined to form the input for the RNN, [batch_size, seq_len, features] (the feature size may change depending on the FFN output). These are my thoughts on the matter. Can you give me your feedback (plausibility, complexity, …)?
And Thank you
Cheers

ptrblck · June 29, 2018, 7:48pm

You could achieve this behavior using the nn.Linear layer.
Just permute your input so that its dimensions are [batch_size, seq, in_features], and the linear layer will be applied to every time step, acting on the in_features dimension.
Have a look at the docs for the shape information.
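To make this concrete, here is a small sketch comparing the two options: calling nn.Linear directly on a 3D tensor, and looping over time steps with the same layer as proposed earlier in the thread. The sizes are illustrative assumptions. nn.Linear acts on the last dimension only, so both routes should give the same result up to floating-point tolerance.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not taken from the thread)
batch_size, seq_len, in_features, out_features = 100, 25, 8, 16

linear = nn.Linear(in_features, out_features)
x = torch.randn(batch_size, seq_len, in_features)

# Option 1: apply the layer directly; it operates on the last dimension
# and leaves all leading dimensions untouched.
y = linear(x)                                          # [batch, seq, out_features]

# Option 2: the "one FFN per time step" idea, with shared weights.
y_loop = torch.stack(
    [linear(x[:, t, :]) for t in range(seq_len)], dim=1
)                                                      # [batch, seq, out_features]

same = torch.allclose(y, y_loop, atol=1e-5)

# The result can go straight into a batch-first RNN.
rnn = nn.LSTM(input_size=out_features, hidden_size=20, batch_first=True)
output, _ = rnn(y)
```

The direct call avoids the Python loop, so it is usually the simpler choice when every time step shares the same weights.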