ref : https://pytorch.org/tutorials/beginner/chatbot_tutorial.html?highlight=chatbot
# Forward through unidirectional GRU
rnn_output, hidden = self.gru(embedded, last_hidden)
# Calculate attention weights from the current GRU output
attn_weights = self.attn(rnn_output, encoder_outputs)
Based on Luong et al.'s attention model, when calculating the attention weights, shouldn't we pass hidden (from the GRU above) and encoder_outputs to self.attn? Like the code below:
# Forward through unidirectional GRU
rnn_output, hidden = self.gru(embedded, last_hidden)
# Calculate attention weights from the current GRU output
attn_weights = self.attn(hidden, encoder_outputs)
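A quick check may help here (a minimal sketch with assumed sizes, not the tutorial's actual classes): the tutorial's decoder GRU is single-layer and unidirectional and is fed one time step per call, so rnn_output (the last layer's output per step) and hidden (the final hidden state per layer) should hold exactly the same values, making the two versions of self.attn(...) equivalent in this setting:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
hidden_size, batch_size = 8, 3

# Single-layer, unidirectional GRU, as in the tutorial's decoder
gru = nn.GRU(hidden_size, hidden_size)

embedded = torch.randn(1, batch_size, hidden_size)     # one decoding step (seq_len=1)
last_hidden = torch.zeros(1, batch_size, hidden_size)  # num_layers * num_directions = 1

rnn_output, hidden = gru(embedded, last_hidden)
# rnn_output: (seq_len=1, batch, hidden) -- last layer's output at each step
# hidden:     (layers*dirs=1, batch, hidden) -- final hidden state per layer
print(torch.equal(rnn_output, hidden))  # True for this configuration
```

So with one layer, one direction, and one step, the choice is a matter of style; they would only differ for a multi-layer or bidirectional GRU, or when more than one time step is fed at once.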