PyTorch batched beam search for seq2seq

I am trying to do batched beam search in seq2seq with batch size = 2 and beam size = 2. The hidden state coming out of the encoder has shape 1x2x100 (there is no beam dimension at that point). Now it has to be fed into the decoder with two initial states per sentence. Do I need to make it 1x4x100?

We would have 2 hidden states per sentence (since there are two beams per sentence [beam size = 2], each producing its own hidden state). Which dimension does PyTorch expect the 4 hidden states to go into?

I would need to feed four hidden states (2 sentences x 2 beams); which axis's dimension should be increased?

Should the hidden state be reshaped to 1x4x100 before feeding it to the decoder?

I saw in the documentation that hidden has shape (num_layers x num_directions, batch_size, hidden_size), so I think it should be (1, 2x2 [batch_size x beam_size], 100).
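A minimal sketch of what I mean, folding the beam dimension into the batch axis with `repeat_interleave` (the variable names here are my own, just for illustration):

```python
import torch

num_layers, batch_size, beam_size, hidden_size = 1, 2, 2, 100

# Encoder's final hidden state: (num_layers * num_directions, batch_size, hidden_size)
encoder_hidden = torch.randn(num_layers, batch_size, hidden_size)

# Repeat each sentence's hidden state beam_size times along the batch axis (dim=1),
# so the beams of one sentence stay adjacent: [s0_b0, s0_b1, s1_b0, s1_b1]
decoder_hidden = encoder_hidden.repeat_interleave(beam_size, dim=1)

print(decoder_hidden.shape)  # torch.Size([1, 4, 100])
```

This keeps the decoder's expected layout (layers, batch, hidden) intact; the decoder just sees an "effective batch" of batch_size x beam_size = 4.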