PyTorch seq2seq tutorial: word missing from vocabulary

Hi, I’m working through the PyTorch seq2seq tutorial. After going through the training process, I used the model to translate a random French sentence. It gives me an error saying that the word “Fruit” in the sentence cannot be translated. I understand this happens because the word2index lookup cannot convert “Fruit” to an index. Is there any solution for this? Thanks


Reference: NLP From Scratch: Translation with a Sequence to Sequence Network and Attention — PyTorch Tutorials 2.2.0+cu121 documentation

If you are dealing with unknown words, you could add a specific word index for these unknown words and change the indexesFromSentence function a bit so that it returns this index as the default value:

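def indexesFromSentence(lang, sentence):
    # Fallback index for words that are not in lang.word2index.
    # Caution: the tutorial already uses 0 for SOS_token, so a dedicated index is safer.
    unknown_index = 0
    return [lang.word2index.get(word, unknown_index) for word in sentence.split(' ')]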

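Note that you will most likely have to adapt other parts of the code to handle this unknown index, for example by reserving a vocabulary entry for it and making sure the model sees it during training.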
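
As a rough sketch of what that adaptation could look like, the snippet below reserves a separate index for unknown words in a trimmed-down version of the tutorial's Lang class. The name UNK_token and the value 2 are my own assumptions and are not part of the tutorial. SOS_token = 0 and EOS_token = 1 follow the tutorial.

```python
SOS_token = 0  # as in the tutorial
EOS_token = 1  # as in the tutorial
UNK_token = 2  # assumption: a new reserved index, not part of the tutorial


class Lang:
    """Trimmed-down version of the tutorial's Lang class with an extra UNK entry."""

    def __init__(self, name):
        self.name = name
        self.word2index = {}
        self.index2word = {SOS_token: "SOS", EOS_token: "EOS", UNK_token: "UNK"}
        self.n_words = 3  # counts SOS, EOS and UNK

    def addSentence(self, sentence):
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        # Assign the next free index to every word not seen before.
        if word not in self.word2index:
            self.word2index[word] = self.n_words
            self.index2word[self.n_words] = word
            self.n_words += 1


def indexesFromSentence(lang, sentence):
    # Words that are not in the vocabulary fall back to UNK_token.
    return [lang.word2index.get(word, UNK_token) for word in sentence.split(' ')]


# Small usage example: "fruit" was never added, so it maps to UNK_token.
lang = Lang("fra")
lang.addSentence("je suis content")
indexes = indexesFromSentence(lang, "je suis fruit")
```

Since the embedding and output layers are sized from n_words, the model would have to be rebuilt and retrained with the enlarged vocabulary. The UNK embedding is also only useful if some training words (for example rare ones) are mapped to it, otherwise it stays at its random initial value.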