The PyTorch tutorial on seq2seq translation says the following about the decoder:
In the simplest seq2seq decoder we use only last output of the encoder. This last output is sometimes called the context vector as it encodes context from the entire sequence. This context vector is used as the initial hidden state of the decoder.
The documentation of nn.GRU says that it returns two values: output, h_n.
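For concreteness, here is a minimal sketch of the two return values I am looking at (all the sizes here are made-up placeholders, and I'm assuming a single-layer, unidirectional GRU):

```python
import torch
import torch.nn as nn

# Single-layer, unidirectional GRU; sizes are arbitrary placeholders.
gru = nn.GRU(input_size=10, hidden_size=20, batch_first=True)

x = torch.randn(3, 7, 10)   # (batch, seq_len, input_size)
output, h_n = gru(x)

print(output.shape)  # torch.Size([3, 7, 20]) -- hidden state at every time step
print(h_n.shape)     # torch.Size([1, 3, 20]) -- final hidden state only
                     #   (num_layers * num_directions, batch, hidden_size)
```

For this single-layer, unidirectional case, `output[:, -1, :]` and `h_n[0]` appear to hold the same values, which is part of what confuses me.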
Which of these (output, h_n) signifies the context vector?
I am also using a GRU to classify sentences into two classes, say positive and negative. Which of (output, h_n) should I use for this classification?
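Here is roughly what I am trying for the classification part. I currently feed h_n (the final hidden state) into the linear layer, but I am not sure whether that is the right choice over output. All names and sizes below (SentenceClassifier, vocab size, embedding and hidden dimensions) are made up for illustration:

```python
import torch
import torch.nn as nn

class SentenceClassifier(nn.Module):
    """Sketch: GRU encoder + linear head for 2 classes.
    All sizes (vocab, embedding, hidden) are placeholder values."""

    def __init__(self, vocab_size=1000, embed_dim=50, hidden_size=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, tokens):                   # tokens: (batch, seq_len)
        output, h_n = self.gru(self.embed(tokens))
        # I use h_n (final hidden state) here -- is this correct,
        # or should I be using `output` instead?
        return self.fc(h_n[-1])                  # (batch, num_classes)

model = SentenceClassifier()
logits = model(torch.randint(0, 1000, (4, 12)))  # batch of 4, seq_len 12
print(logits.shape)  # torch.Size([4, 2])
```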