Some details on the implementation of CopyNet

CopyNet is a seq2seq-based model proposed in the paper ‘Incorporating Copying Mechanism in Sequence-to-Sequence Learning’.
Its prediction can be seen as a combination of generating a token from the vocabulary and copying a token from the source.
I wonder how to handle the tokens of the source sequence. That is, how should the vocabulary and the source be indexed, especially for tokens that appear in both sets?

You might find this useful: I have an implementation of CopyNet built on top of AllenNLP here: https://github.com/epwalsh/nlp-models
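
To make the indexing question concrete, here is a minimal sketch of one common way to handle the overlap: a per-example "extended vocabulary", as used in pointer-generator-style models. It is not taken from the paper or from the repository linked above, and that implementation may index things differently. The function names `build_extended_ids` and `combine_scores`, the `<unk>` token, and the toy probabilities are all assumptions made for illustration. In-vocabulary source tokens keep their vocabulary id, out-of-vocabulary source tokens get temporary ids starting at `len(vocab)`, and the generate and copy probabilities for the same id are summed.

```python
from typing import Dict, List, Tuple


def build_extended_ids(
    source_tokens: List[str],
    vocab: Dict[str, int],
    unk_token: str = "<unk>",  # hypothetical name for the unknown token
) -> Tuple[List[int], List[int], List[str]]:
    """Return (vocab_ids, extended_ids, source_oovs) for one source sentence."""
    vocab_size = len(vocab)
    unk_id = vocab[unk_token]
    vocab_ids: List[int] = []     # what the encoder embedding layer sees
    extended_ids: List[int] = []  # what the copy mechanism scatters into
    source_oovs: List[str] = []   # OOV source words, in order of first appearance
    for token in source_tokens:
        if token in vocab:
            # Overlap case: the token shares one id for generating and copying.
            vocab_ids.append(vocab[token])
            extended_ids.append(vocab[token])
        else:
            if token not in source_oovs:
                source_oovs.append(token)
            vocab_ids.append(unk_id)
            # Temporary id, valid only for this example.
            extended_ids.append(vocab_size + source_oovs.index(token))
    return vocab_ids, extended_ids, source_oovs


def combine_scores(
    gen_probs: List[float],
    copy_probs: List[float],
    extended_ids: List[int],
    num_oovs: int,
) -> List[float]:
    """Sum generate and copy probability mass over the extended vocabulary.

    Assumes gen_probs (one per vocabulary entry) and copy_probs (one per
    source position) were normalized jointly, so together they sum to 1.
    """
    combined = list(gen_probs) + [0.0] * num_oovs
    for position, idx in enumerate(extended_ids):
        # Repeated source tokens and vocabulary overlaps accumulate here.
        combined[idx] += copy_probs[position]
    return combined
```

For example, with the vocabulary `{"<unk>": 0, "hello": 1, "my": 2, "name": 3, "is": 4}` and the source `["hello", "my", "name", "is", "Tony", "Tony"]`, the extended ids are `[1, 2, 3, 4, 5, 5]`: "hello" is both generable and copyable under id 1, while both occurrences of "Tony" share the temporary id 5. At decoding time, a predicted id at or above `len(vocab)` is mapped back to a word through `source_oovs`.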