CNN + RNN + Fully Connected

Hi all,

I have a question about hidden dimensions. Suppose I create a neural network like this:

Layer 1 --> Convolutional Network
Layer 2 --> RNN (GRU or LSTM)
Layer 3 --> Fully connected linear

How do I handle the hidden outputs returned by the RNN, given that the CNN won't need them?


Sorry, I am not sure I follow the question. If the order of layers is 1 --> 2 --> 3, the RNN comes after the CNN, so of course the CNN won't need anything from the RNN. If instead the order is 3 --> 2 --> 1, you can simply choose not to use the hidden outputs from the RNN when going into the CNN.

For example:

output = conv2d(x)
output, h = rnn(output, h)

Should this h be handled as usual in the training loop and the forward function?

If h plays no role in the fully connected linear layer, then output, _ = rnn(x) should suffice; if you want to initialize h yourself, use output, _ = rnn(x, h) instead.
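To make the two call styles concrete, here is a minimal PyTorch sketch. The tensor sizes and the 5-class linear layer are made up for illustration, and nn.GRU stands in for whichever RNN you actually use.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only (assumptions, not taken from the thread).
batch, seq_len, features, hidden_size = 4, 10, 16, 32

rnn = nn.GRU(input_size=features, hidden_size=hidden_size, batch_first=True)
x = torch.randn(batch, seq_len, features)

# Case 1: the hidden state is not needed, so let it default to zeros
# and discard the returned final state.
output, _ = rnn(x)

# Case 2: initialize the hidden state yourself.
# Expected shape is (num_layers, batch, hidden_size).
h0 = torch.zeros(1, batch, hidden_size)
output_h0, _ = rnn(x, h0)

# Feed the last time step of the RNN output into the linear layer.
fc = nn.Linear(hidden_size, 5)
logits = fc(output[:, -1, :])
```

Since the default initial hidden state is all zeros, both calls give the same output here. Passing h explicitly only matters when you have a non-zero state to start from.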

But then we aren't taking full advantage of the RNN's hidden outputs, which is why I am asking…

Whether you take advantage of the hidden outputs depends on the architecture you choose.

Thanks. My question was badly formulated, then.

I want to take advantage of the hidden outputs because I am working with sequential data. Is this correct?


    def forward(self, x, hidden):
        x = Conv2d....
        x, h = rnn(x, h)
        return x, h

If you want to use the hidden outputs outside your forward function (for example, in seq2seq, where the hidden outputs from the encoder are used in a decoder), then yes, your formulation is fine. I am assuming the hidden passed into the forward function is passed on to the rnn as h.
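Here is a rough sketch of how that forward function might look as a full module. Everything specific in it is an assumption for illustration: the class name ConvGRUNet, the layer sizes, and the input layout (batch, 1, frequency bins, time steps). The main point is the reshape between the conv output and the RNN input, and that hidden is returned so the caller can reuse it.

```python
import torch
import torch.nn as nn


class ConvGRUNet(nn.Module):  # hypothetical name and sizes
    def __init__(self, n_freq=40, hidden_size=64, n_classes=10):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d((2, 1)),  # halve the frequency axis, keep all time steps
        )
        self.rnn = nn.GRU(8 * (n_freq // 2), hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, n_classes)

    def forward(self, x, hidden=None):
        # x: (batch, 1, n_freq, time)
        x = self.conv(x)                                 # (batch, 8, n_freq // 2, time)
        b, c, f, t = x.shape
        # Treat time as the sequence axis; flatten channels and frequency into features.
        x = x.permute(0, 3, 1, 2).reshape(b, t, c * f)   # (batch, time, c * f)
        x, hidden = self.rnn(x, hidden)                  # hidden: (1, batch, hidden_size)
        return self.fc(x[:, -1, :]), hidden


model = ConvGRUNet()
chunk1 = torch.randn(2, 1, 40, 25)
chunk2 = torch.randn(2, 1, 40, 25)

# First call: hidden defaults to zeros inside the GRU.
logits, hidden = model(chunk1)
# Second call: carry the state over; detach() stops gradients flowing
# back into the previous chunk.
logits, hidden = model(chunk2, hidden.detach())
```

If you only ever process whole sequences in one call, you can drop the hidden argument and the second return value entirely.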

Hi, I am having the same issue. I would like to do a CNN-GRU speaker identification task on preprocessed spectrograms. How can I connect the two different networks?