Trouble using an LSTM in a seq2seq model


I’m working on a project based on the seq2seq translation tutorial, and I was trying to swap the GRU RNN for an LSTM, but I’m running into dimension problems in the encoder. I’ve tried a lot of changes, but I’m fairly new to this and I don’t know where I’m messing up.

class EncoderRNN(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(EncoderRNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size

        self.embedding = nn.Embedding(input_size, hidden_size)

        self.lstm = nn.LSTM(hidden_size, int(hidden_size), num_layers=1, batch_first=True, bidirectional=True)

    def forward(self, input, hidden):
        embedded = self.embedding(input).view(1, 1, -1)
        output = embedded
        output, hidden = self.lstm(output, hidden)
        return output, hidden

    def initHidden(self):
        return (torch.zeros(1, 1, self.hidden_size, device=device),
                torch.zeros(1, 1, self.hidden_size, device=device))

And I’m getting the following error:

RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>()
      4 attn_decoder1 = AttnDecoderRNN(hidden_size, output_lang.n_words, dropout_p=0.1).to(device)
----> 6 trainIters(encoder1, attn_decoder1, 75000, print_every=5000)

9 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/ in check_hidden_size(self, hx, expected_hidden_size, msg)
    170         # type: (Tensor, Tuple[int, int, int], str) -> None
    171         if hx.size() != expected_hidden_size:
--> 172             raise RuntimeError(msg.format(expected_hidden_size, tuple(hx.size())))

    174     def check_forward_args(self, input, hidden, batch_sizes):

RuntimeError: Expected hidden[0] size (2, 1, 256), got (1, 1, 256)
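To narrow it down, I also tried the LSTM on its own outside the encoder class. As far as I can tell from the PyTorch docs, with `bidirectional=True` the hidden and cell states need shape `(num_layers * num_directions, batch, hidden_size)`, which would be `(2, 1, 256)` here rather than the `(1, 1, 256)` my `initHidden` produces. This standalone sketch (not the tutorial code) runs without the error:

```python
import torch
import torch.nn as nn

hidden_size = 256

# Same configuration as the encoder's LSTM: one layer, bidirectional.
lstm = nn.LSTM(hidden_size, hidden_size, num_layers=1,
               batch_first=True, bidirectional=True)

# One token: (batch, seq, feature) because batch_first=True.
x = torch.zeros(1, 1, hidden_size)

# Hidden/cell states are (num_layers * num_directions, batch, hidden_size),
# so a single bidirectional layer needs a leading dimension of 2, not 1.
h0 = torch.zeros(2, 1, hidden_size)
c0 = torch.zeros(2, 1, hidden_size)

output, (hn, cn) = lstm(x, (h0, c0))
print(output.shape)  # (1, 1, 512): both directions concatenated on the last dim
print(hn.shape)      # (2, 1, 256)
```

So it looks like the mismatch is between `bidirectional=True` and the shape returned by `initHidden`, but I’m not sure what the right fix is for the rest of the tutorial code.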

Hope you can help me.