Connecting encoder output to decoder input

Hi all,

I have an encoder network with 3 layers and a hidden size of 40, and a decoder network with 5 layers and a hidden size of 30. How should I connect the output of the encoder network to the input of the decoder network?

As far as I know, nn.Linear may work, but nn.Linear expects input of shape (N, *, in_features) while the encoder output is (3, N, hidden_size), and I couldn't figure out how to convert the (3, N, hidden_size) tensor into a suitable input for the nn.Linear module. I tried the view method, but it changes the element order.

Are they pre-trained or is this your own architecture you want to train?

If you are using pre-trained networks, you can still apply a pooling layer in between, or (better) add a linear layer that you train by transfer learning.

They are not pre-trained; it is my own architecture, and I want it to be flexible.

Adding a linear layer is the solution, but I don't know how to reshape the tensors of the encoder output to fit the decoder input. For example, if the encoder hidden output has shape (enc_num_layer, batch_size, enc_hidden_size) and the decoder hidden input has shape (dec_num_layer, batch_size, dec_hidden_size), with enc_num_layer * enc_hidden_size = dec_num_layer * dec_hidden_size, how should I prepare the input for the linear layer? I used the view method, but it changes the element order.
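One way to do this (a minimal sketch, assuming the sizes from the original post; the `bridge` name and the random inputs are made up for illustration) is to move the batch dimension first with permute, flatten the layer and hidden dimensions together, apply the linear layer, and then reshape back into the decoder's layout. With a linear bridge the two products do not even need to match (here 3·40 = 120 while 5·30 = 150):

```python
import torch
import torch.nn as nn

# Hypothetical sizes taken from the post:
# encoder (3 layers, hidden 40), decoder (5 layers, hidden 30).
enc_num_layers, enc_hidden = 3, 40
dec_num_layers, dec_hidden = 5, 30
batch_size = 8

# Linear "bridge" mapping the flattened encoder state
# to the flattened decoder state.
bridge = nn.Linear(enc_num_layers * enc_hidden, dec_num_layers * dec_hidden)

enc_h = torch.randn(enc_num_layers, batch_size, enc_hidden)

# (3, N, 40) -> (N, 3, 40) -> (N, 120): permute first so that each
# batch element's layer states stay together, then flatten.
h = enc_h.permute(1, 0, 2).reshape(batch_size, -1)
h = bridge(h)  # (N, 150)

# (N, 150) -> (N, 5, 30) -> (5, N, 30): undo the flattening for the decoder.
dec_h = h.view(batch_size, dec_num_layers, dec_hidden).permute(1, 0, 2).contiguous()
print(dec_h.shape)  # torch.Size([5, 8, 30])
```

The key point is to permute before flattening, so that view/reshape only merges dimensions that belong to the same batch element instead of mixing elements across the batch.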

To convert (3, N, hidden_size) to the (N, *, in_features) layout that nn.Linear expects, use .permute(1, 0, 2).
permute interchanges the dimensions without reordering the elements within them.
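A small toy example (with made-up sizes) showing the difference: view reinterprets the underlying storage and mixes elements from different layers into the same row, while permute just swaps the axes.

```python
import torch

# Toy tensor shaped like the encoder output: (num_layers=3, N=2, hidden=4).
x = torch.arange(24).reshape(3, 2, 4)

# view keeps storage order, so row 0 of `wrong` contains the first 12
# values regardless of which layer/batch element they came from.
wrong = x.view(2, 3, 4)

# permute swaps the layer and batch axes: right[n, l] == x[l, n].
right = x.permute(1, 0, 2)

print(right.shape)                # torch.Size([2, 3, 4])
print(torch.equal(wrong, right))  # False: view reorders, permute does not
```

Note that the result of permute is non-contiguous, so call .contiguous() on it before a subsequent .view (or use .reshape, which handles this for you).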

Besides, why not make the first layer of your decoder take input of shape (3, N, 40)?


The decoder input is now (3, N, 40), but because of poor performance on the evaluation dataset I decided to increase the model's flexibility. Am I wrong?