I’m trying to build the chatbot from the official PyTorch tutorial and I’m running into two issues. The main one is a dimension mismatch when running through the GRU in the decoder.
```python
def forward(self, input_step, last_hidden, encoder_output):
    # We run this one step (word) at a time
    embedded = self.embedding(input_step)
    embedded = self.embedding_dropout(embedded)
    # Forward through unidirectional GRU
    set_trace()
    rnn_output, hidden_state = self.gru(embedded, last_hidden)
    # Calculate attention weights from the current GRU output
    attn_weights = self.attn(rnn_output, encoder_output)
```
The error occurs at the `self.gru` call after `set_trace()`. Here is the traceback:
```
Traceback (most recent call last):
  File "chatbot.py", line 110, in <module>
    decoder_output, decoder_hidden = decoder(decoder_input, decoder_hidden, encoder_output)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/su0/pytorch-learn/chatbot_tutorial/decoder.py", line 99, in forward
    rnn_output, hidden_state = self.gru(embedded, last_hidden)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 175, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 131, in check_forward_args
    expected_input_dim, input.dim()))
RuntimeError: input must have 3 dimensions, got 2
```
I’m only working with a very small subset of the data to make sure things are good. Here are the parameters:
```python
batch_size = 5
hidden_size = 500
n_encoder_layers = 2
n_decoder_layers = 2
dropout = 0.1
attn_model = 'dot'
embedding = nn.Embedding(len(vocab), hidden_size)
```
The `embedded` tensor that’s passed as input has shape `(batch_size, hidden_size) = (5, 500)`, but the GRU is asking for a 3-D input. That matches the GRU documentation, which says the input must either have shape `(seq_len, batch, input_size)` or be a `PackedSequence` (as returned by `pack_padded_sequence`). However, here it is neither of those, and I’d like some help fixing it. This code is taken directly from the tutorial.
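For reference, here’s a minimal standalone shape check (random tensors, same sizes as my parameters; the single-step `unsqueeze(0)` fix is my guess at what’s intended) showing what the GRU expects for one decoding step:

```python
import torch
import torch.nn as nn

batch_size, hidden_size, n_layers = 5, 500, 2
gru = nn.GRU(hidden_size, hidden_size, n_layers)

# embedded currently comes out 2-D: (batch, hidden_size)
embedded = torch.randn(batch_size, hidden_size)

# The GRU wants (seq_len, batch, input_size); for a single decoding
# step seq_len is 1, so add a leading sequence dimension
embedded = embedded.unsqueeze(0)  # -> (1, batch, hidden_size)

last_hidden = torch.zeros(n_layers, batch_size, hidden_size)
rnn_output, hidden_state = gru(embedded, last_hidden)

print(rnn_output.shape)    # torch.Size([1, 5, 500])
print(hidden_state.shape)  # torch.Size([2, 5, 500])
```

With the extra dimension the call goes through, which suggests my `input_step` (and hence `embedded`) is simply missing the length-1 sequence dimension somewhere upstream.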
There is one more error related to the use of `pack_padded_sequence`, which I’ve also asked about here. I’ve temporarily averted that by not setting a default device (so everything runs on the CPU) and will get back to it after everything else is working.
Let me know if more information is required.