I’m trying to build the chatbot from the official PyTorch tutorial and I’m running into two issues. The main one is a dimension mismatch when running through the GRU in the decoder.
```python
def forward(self, input_step, last_hidden, encoder_output):
    # We run this one step (word) at a time
    embedded = self.embedding(input_step)
    embedded = self.embedding_dropout(embedded)
    # Forward through unidirectional GRU
    set_trace()
    rnn_output, hidden_state = self.gru(embedded, last_hidden)
    # Calculate attention weights from the current GRU output
    attn_weights = self.attn(rnn_output, encoder_output)
```
The error occurs at the `self.gru` call after `set_trace()`. Here is the traceback:
```
Traceback (most recent call last):
  File "chatbot.py", line 110, in <module>
    decoder_output, decoder_hidden = decoder(decoder_input, decoder_hidden, encoder_output)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/su0/pytorch-learn/chatbot_tutorial/decoder.py", line 99, in forward
    rnn_output, hidden_state = self.gru(embedded, last_hidden)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 175, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 131, in check_forward_args
    expected_input_dim, input.dim()))
RuntimeError: input must have 3 dimensions, got 2
```
I’m only working with a very small subset of the data to make sure things are good. Here are the parameters:
```python
batch_size = 5
hidden_size = 500
n_encoder_layers = 2
n_decoder_layers = 2
dropout = 0.1
attn_model = 'dot'
embedding = nn.Embedding(len(vocab), hidden_size)
```
The `embedded` tensor that’s passed as input has shape `(batch_size, hidden_size) = (5, 500)`, but the GRU is asking for a 3-D input. That matches the GRU documentation, which says the input must either have shape `(seq_len, batch, input_size)` or be a `PackedSequence` (as returned by `pack_padded_sequence`). However, here it is neither of those, and I’d like some help fixing it. This code is taken directly from the tutorial.
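For reference, here’s a minimal standalone shape check (random tensors, same sizes as my parameters; the single-step `unsqueeze(0)` fix is my guess at what’s intended) showing what the GRU expects for one decoding step:

```python
import torch
import torch.nn as nn

batch_size, hidden_size, n_layers = 5, 500, 2
gru = nn.GRU(hidden_size, hidden_size, n_layers)

# embedded currently comes out 2-D: (batch, hidden_size)
embedded = torch.randn(batch_size, hidden_size)

# The GRU wants (seq_len, batch, input_size); for a single decoding
# step seq_len is 1, so add a leading sequence dimension
embedded = embedded.unsqueeze(0)  # -> (1, batch, hidden_size)

last_hidden = torch.zeros(n_layers, batch_size, hidden_size)
rnn_output, hidden_state = gru(embedded, last_hidden)

print(rnn_output.shape)    # torch.Size([1, 5, 500])
print(hidden_state.shape)  # torch.Size([2, 5, 500])
```

With the extra dimension the call goes through, which suggests my `input_step` (and hence `embedded`) is simply missing the length-1 sequence dimension somewhere upstream.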
There is one more error related to the use of `pack_padded_sequence`, which I’ve also asked about here. I’ve temporarily averted that by not setting a default device (so everything runs on the CPU) and will get back to it after everything else is working.
Let me know if more information is required.