Issues with Chatbot NLP tutorial

I’m trying to build the chatbot from the official tutorial and I’m running into two issues. The main one is a dimension mismatch when running the decoder’s GRU.

  def forward(self, input_step, last_hidden, encoder_output):
    # we run this one step (word) at a time
    embedded = self.embedding(input_step)
    embedded = self.embedding_dropout(embedded)

    # forward through unidirectional GRU
    rnn_output, hidden_state = self.gru(embedded, last_hidden)

    # calculate attention weights from the current GRU output
    attn_weights = self.attn(rnn_output, encoder_output)

The error occurs at the gru call (I stopped there with set_trace()). Here is the traceback:

Traceback (most recent call last):
  File "", line 110, in <module>
    decoder_output, decoder_hidden = decoder(decoder_input, decoder_hidden, encoder_output)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/su0/pytorch-learn/chatbot_tutorial/", line 99, in forward
    rnn_output, hidden_state = self.gru(embedded, last_hidden)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/", line 175, in forward
    self.check_forward_args(input, hx, batch_sizes)
  File "/net/vaosl01/opt/NFS/su0/anaconda3/envs/pyt/lib/python3.7/site-packages/torch/nn/modules/", line 131, in check_forward_args
    expected_input_dim, input.dim()))
RuntimeError: input must have 3 dimensions, got 2

I’m working with only a very small subset of the data to make sure things are wired up correctly. Here are the parameters:

  batch_size = 5
  hidden_size = 500
  n_encoder_layers = 2
  n_decoder_layers = 2
  dropout = 0.1
  attn_model = 'dot'
  embedding = nn.Embedding(len(vocab), hidden_size)

Basically, the embedded tensor that’s passed as input has shape (batch_size, hidden_size) = (5, 500), but the GRU is asking for a 3-d input. That makes sense given the GRU documentation, which says the input must either have shape (seq_len, batch, input_size) or be a packed sequence produced by pack_padded_sequence. Here it is neither, and I’d like some help fixing it. This code is taken directly from the tutorial.
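To illustrate the shapes involved, here is a minimal standalone sketch (toy vocabulary size, my own variable names, not my actual code) of what the GRU expects versus what I’m feeding it. It suggests the input step needs an explicit seq_len dimension of 1:

```python
import torch
import torch.nn as nn

batch_size, hidden_size = 5, 500
embedding = nn.Embedding(100, hidden_size)
gru = nn.GRU(hidden_size, hidden_size, num_layers=2, dropout=0.1)

# What I have: a 1-D batch of token ids. Embedding it gives a 2-D
# tensor of shape (5, 500), which triggers "input must have 3 dimensions".
input_step = torch.randint(0, 100, (batch_size,))       # shape (5,)

# What the GRU wants: (seq_len, batch, input_size). Keeping an explicit
# seq_len dimension of 1 makes the shapes line up.
input_step = input_step.unsqueeze(0)                    # shape (1, 5)
embedded = embedding(input_step)                        # (1, 5, 500)
last_hidden = torch.zeros(2, batch_size, hidden_size)   # (n_layers, batch, hidden)
rnn_output, hidden_state = gru(embedded, last_hidden)
print(rnn_output.shape)                                 # torch.Size([1, 5, 500])
```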

There is one more error related to the use of pack_padded_sequence, which is also asked here. I’ve temporarily worked around it by not setting a default device (so everything runs on the CPU) and will come back to it once everything else is working.
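For context, the packing pattern involved is roughly the following (a standalone sketch with toy shapes, not my actual code). Note the lengths tensor staying on the CPU, which is why running everything on the CPU sidesteps the problem for now:

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Toy padded batch: (seq_len, batch, input_size), sorted by descending length.
seqs = torch.randn(4, 3, 8)
lengths = torch.tensor([4, 3, 2])   # kept on the CPU even if seqs moves to the GPU

packed = pack_padded_sequence(seqs, lengths)
gru = nn.GRU(8, 8)
packed_out, hidden = gru(packed)

# Undo the packing to get back a padded (seq_len, batch, hidden) tensor.
output, out_lengths = pad_packed_sequence(packed_out)
print(output.shape)                 # torch.Size([4, 3, 8])
```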

Let me know if more information is required.

Embedding output is multidimensional, so it shouldn’t cause a dimension problem on its own.
I tried running the Jupyter notebook for the chatbot, and it seems to work.
Can you try running that and check the output?
Also, what version of PyTorch are you using?

Thanks for your reply. I ran the notebook and it trains (on the GPU) without any problems. I don’t get why, when I modularize the same code into separate files and try to run one iteration, I get the issues I highlighted in my first post and the one you mentioned earlier.

I’m running PyTorch 1.0.1.post2

What is vocab here? There is no variable named vocab in the tutorial.

Well, I guess I lied a bit about it being exactly the same code. Instead of voc, I named the instance vocab, and I gave the class a __len__ method.
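Concretely, my class looks something like this (simplified; everything apart from __len__ is paraphrased from the tutorial’s Voc class, and the method names are my own):

```python
class Vocab:
    """Minimal stand-in for the tutorial's Voc class."""
    def __init__(self):
        self.word2index = {}
        self.index2word = {0: "PAD", 1: "SOS", 2: "EOS"}
        self.num_words = 3  # count the special tokens

    def add_word(self, word):
        if word not in self.word2index:
            self.word2index[word] = self.num_words
            self.index2word[self.num_words] = word
            self.num_words += 1

    def __len__(self):
        # lets me write nn.Embedding(len(vocab), hidden_size)
        return self.num_words

vocab = Vocab()
for w in "hello world hello".split():
    vocab.add_word(w)
print(len(vocab))   # 5
```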

Why don’t you write a meaningful title for your question? What issue? Can I tell what your issue is just by reading the title?