I tried the demo of “eng-fra” translation.
I want to apply this algorithm to another domain.
My question is: what should I do when I translate a new English sentence that contains a word that does not appear in the original English training sentences?
Handling a word that is not in your vocabulary is a hard problem in NLP. Take a look at subword methods (such as SentencePiece), which may help. Pretrained embeddings may also help.
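Before reaching for subword tokenization, the simplest fallback is to map every unseen word to a single "unknown" token, so the encoder at least receives a valid index. This is a minimal sketch of that simpler technique, not of SentencePiece itself. The names `Vocab`, `UNK_TOKEN`, and `encode` are hypothetical and only loosely mirror a word-to-index vocabulary like the one in the tutorial.

```python
# Hypothetical sketch: Vocab, UNK_TOKEN and encode are illustrative names,
# not part of the tutorial or of any library.
SOS_TOKEN, EOS_TOKEN, UNK_TOKEN = 0, 1, 2


class Vocab:
    def __init__(self):
        self.word2index = {}
        # Reserve the first three indices for the special tokens.
        self.index2word = {SOS_TOKEN: "SOS", EOS_TOKEN: "EOS", UNK_TOKEN: "UNK"}

    def add_sentence(self, sentence):
        # Assign the next free index to every word not seen before.
        for word in sentence.split():
            if word not in self.word2index:
                index = len(self.index2word)
                self.word2index[word] = index
                self.index2word[index] = word

    def encode(self, sentence):
        # Words missing from the vocabulary fall back to UNK_TOKEN.
        ids = [self.word2index.get(word, UNK_TOKEN) for word in sentence.split()]
        return ids + [EOS_TOKEN]


vocab = Vocab()
vocab.add_sentence("i am cold")
print(vocab.encode("i am freezing"))  # [3, 4, 2, 1]: "freezing" maps to UNK_TOKEN
```

For the model to do anything sensible with the unknown token, it has to see it during training as well, for example by replacing rare training words with it. Subword methods avoid most of this by splitting an unseen word into pieces that are in the vocabulary.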
Here is an example notebook:
# 3 - Neural Machine Translation by Jointly Learning to Align and Translate
In this third notebook on sequence-to-sequence models using PyTorch and TorchText, we'll be implementing the model from [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/abs/1409.0473). This model achieves our best perplexity yet, ~27 compared to ~34 for the previous model.
As a reminder, here is the general encoder-decoder model:
In the previous model, our architecture was set up in a way to reduce "information compression" by explicitly passing the context vector, $z$, to the decoder at every time-step and by passing both the context vector and input word, $y_t$, along with the hidden state, $s_t$, to the linear layer, $f$, to make a prediction.
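In symbols, that description corresponds roughly to the two equations below. The embedding function $d(\cdot)$, the name $\text{DecoderRNN}$, and the prediction symbol $\hat{y}_{t+1}$ are assumptions introduced here for illustration; the excerpt itself only names $z$, $y_t$, $s_t$, and $f$.

$$s_t = \text{DecoderRNN}(d(y_t), s_{t-1}, z)$$

$$\hat{y}_{t+1} = f(d(y_t), s_t, z)$$

The context vector $z$ appears in both equations, which is what "explicitly passing the context vector to the decoder at every time-step" amounts to.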