
Yes, the problem is that most translations aren’t literal, one-to-one translations. Literal translations are usually hard to read, so some translators aim for “dynamic equivalence” while others paraphrase heavily. Unfortunately, this can make machine linguistic matching difficult and unreliable. For instance, there are two translations of the Nordic classic Kristin Lavransdatter that read quite differently, and there are people who will argue passionately about which one is best.

Statistical machine translation is trained on curated parallel texts, but I believe the training tends to draw on multiple corpora, so the resulting translation is some sort of average. I wonder if matching against just one translation might produce less reliable results?



