I have an NMT project based on the Transformer. Everything is working fine, except that in the final translation output the model translates numbers to zeros (0000), and it mishandles non-English names as well.
I need to solve these issues to increase my BLEU score. Any suggestions?
Rare words or out-of-vocabulary words are a fundamental challenge for NMT. You still find very recent academic papers addressing this.
For example, for a very simple NMT task, I used an off-the-shelf NER system to replace, say, person names with a placeholder token. So the two sentences “I met Alice” and “I met Bob” would both be converted to “I met” followed by that placeholder; same for the target sentences. After the translation, I would simply replace the placeholder with the actual name. Replacing numbers with a placeholder would also be very easy with a regex. It worked well enough for my use case, but it’s probably too naive for the general case.
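To make the number part concrete, here is a minimal Python sketch of the mask-and-restore idea. The placeholder format `__NUM0__`, the regex, and the function names `mask_numbers` / `unmask_numbers` are my own assumptions for illustration, not what I actually used. It also assumes the model was trained on data masked the same way on both source and target sides, and that the tokenizer leaves the placeholder intact.

```python
import re

# Assumed pattern: integers and simple decimals such as "42", "3.14", "1,500".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
# Assumed placeholder format; pick one your tokenizer will not split.
PLACEHOLDER_RE = re.compile(r"__NUM(\d+)__")


def mask_numbers(sentence):
    """Replace each number with an indexed placeholder.

    Returns the masked sentence and the list of original numbers.
    """
    originals = []

    def _mask(match):
        originals.append(match.group(0))
        return f"__NUM{len(originals) - 1}__"

    return NUMBER_RE.sub(_mask, sentence), originals


def unmask_numbers(translation, originals):
    """Restore the original numbers wherever a placeholder survived."""

    def _restore(match):
        idx = int(match.group(1))
        # Leave unknown placeholders untouched instead of failing.
        return originals[idx] if idx < len(originals) else match.group(0)

    return PLACEHOLDER_RE.sub(_restore, translation)


masked, numbers = mask_numbers("I paid 1,500 dollars for 2 tickets.")
# `masked` is what you would feed to the model; the string below stands in
# for a model output that kept the placeholders.
restored = unmask_numbers("Paid: __NUM0__ / tickets: __NUM1__", numbers)
```

Indexed placeholders let the numbers be put back in the right positions even if the translation reorders them. Names would work the same way, with the spans coming from the NER system instead of a regex.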
Thank you, sir @vdw, that's what I was looking for.