How to change GRU to LSTM in Chatbot Tutorial


(Steph Kua) #1

Hi guys,

I’m trying to use an LSTM in the Chatbot Tutorial provided by PyTorch. However, I’m currently facing the error shown below.

RuntimeError: Expected hidden[0] size (4, 64, 500), got (2, 64, 500)

Any help would be much appreciated.

Thank you and have a nice day.


#2

The shape of hidden should not change when swapping the GRU for an LSTM.
Could you post your code here so that we can have a look?


(Steph Kua) #3

Thank you for your reply.

self.gru = nn.GRU(hidden_size, hidden_size, n_layers, dropout=(0 if n_layers == 1 else dropout), bidirectional=True)

Original line from the tutorial.

self.lstm = nn.LSTM(hidden_size, hidden_size, n_layers, dropout=(0 if n_layers == 1 else dropout), bidirectional=True)

Swapped to LSTM.

The only thing I changed was nn.GRU to nn.LSTM in both EncoderRNN and LuongAttnDecoderRNN.

Original Code Link: https://pytorch.org/tutorials/beginner/chatbot_tutorial.html
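For context, here is a quick shape check of the two modules. The sizes (hidden_size=500, n_layers=2, batch size 64) follow the tutorial defaults; the sequence length is arbitrary.

```python
import torch
import torch.nn as nn

# Rough shape check only; hidden_size=500, n_layers=2 and batch_size=64 follow
# the tutorial defaults, seq_len=10 is arbitrary.
seq_len, batch_size, hidden_size, n_layers = 10, 64, 500, 2
inp = torch.randn(seq_len, batch_size, hidden_size)

gru = nn.GRU(hidden_size, hidden_size, n_layers, bidirectional=True)
lstm = nn.LSTM(hidden_size, hidden_size, n_layers, bidirectional=True)

# The GRU returns a single hidden tensor ...
gru_out, gru_hidden = gru(inp)
print(gru_hidden.shape)        # torch.Size([4, 64, 500]) -> n_layers * num_directions

# ... while the LSTM returns a (hidden, cell) tuple, and also expects one as input.
lstm_out, (h_n, c_n) = lstm(inp)
print(h_n.shape, c_n.shape)    # both torch.Size([4, 64, 500])
```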


#4

Thanks for the clarification.
I guess you are just changing the model without modifying the forward pass.
The error might be thrown if you forget to pass the cell state to the LSTM.
Have a look at the nn.LSTM docs.
Since the hidden state and cell state are expected to be passed together as a tuple (h_0, c_0), a bare hidden tensor will be sliced as if it were that tuple, which yields this size mismatch error.
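A minimal sketch of what the fix could look like. It is not the tutorial code itself: the two nn.LSTM modules below stand in for the encoder and decoder, the sizes assume the tutorial defaults (hidden_size=500, two layers, batch size 64), and n_layers plays the role of decoder.n_layers from the tutorial.

```python
import torch
import torch.nn as nn

# Stand-ins for the tutorial's modules: a bidirectional "encoder" LSTM and a
# unidirectional "decoder" LSTM with the tutorial's default sizes.
batch_size, hidden_size, n_layers = 64, 500, 2

encoder_lstm = nn.LSTM(hidden_size, hidden_size, n_layers, bidirectional=True)
decoder_lstm = nn.LSTM(hidden_size, hidden_size, n_layers)

src = torch.randn(10, batch_size, hidden_size)   # encoder input, (seq_len, batch, feature)
step = torch.randn(1, batch_size, hidden_size)   # a single decoder time step

encoder_outputs, encoder_hidden = encoder_lstm(src)

# GRU version from the tutorial (encoder_hidden is a single tensor):
#   decoder_hidden = encoder_hidden[:decoder.n_layers]
# LSTM version: encoder_hidden is a tuple (h_n, c_n), so slice both tensors
# and keep them together as a tuple.
h_n, c_n = encoder_hidden
decoder_hidden = (h_n[:n_layers], c_n[:n_layers])

# The decoder call stays structurally the same, but the state is now a tuple.
rnn_output, decoder_hidden = decoder_lstm(step, decoder_hidden)
print(rnn_output.shape)   # torch.Size([1, 64, 500])
```

The same tuple handling applies wherever the tutorial code previously passed or returned the GRU's hidden tensor, e.g. inside EncoderRNN.forward and LuongAttnDecoderRNN.forward.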