
There are many languages that work similarly, and often it's more the orthography than the grammar that's problematic. E.g. English and German form compound nouns in the same way, but the constituent parts are usually separated by spaces in English orthography, while they're run together as one long string in German.

That doesn't mean it's impossible to work with other languages, just that "words separated by spaces" is the wrong abstraction for processing them. It just happens to be a heuristic that works well enough for English, so a lot of functionality (like autocomplete) assumes that it works the same for other languages. It would be perfectly feasible to offer partial completions of long words in languages like Finnish or German, if only the space key were treated as less special. (Just compare to Chinese and Japanese, where autocomplete works despite no spaces at all.)
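To make the "partial completions" idea concrete, here's a rough sketch in Python. It treats whatever has been typed as zero or more complete constituents followed by an unfinished one, and only completes that last piece. The function name `partial_completions` and the tiny lexicon are made up for illustration; a real system would also need frequency ranking and would have to handle things like German linking elements and capitalization, which this ignores.

```python
def partial_completions(typed, lexicon):
    """Suggest completions for `typed`, read as zero or more complete
    lexicon entries followed by an unfinished prefix of another entry.

    Hypothetical helper for illustration; assumes non-empty lexicon entries.
    """
    # reachable[i] is True if typed[:i] splits entirely into lexicon entries.
    reachable = [False] * (len(typed) + 1)
    reachable[0] = True
    for i in range(len(typed)):
        if not reachable[i]:
            continue
        for word in lexicon:
            if typed.startswith(word, i):
                reachable[i + len(word)] = True

    results = set()
    for i in range(len(typed)):
        if not reachable[i]:
            continue
        tail = typed[i:]  # the unfinished constituent
        for word in lexicon:
            # Offer every entry that extends the tail (skip exact matches).
            if word.startswith(tail) and word != tail:
                results.add(typed[:i] + word)
    return sorted(results)


# Toy lexicon of German constituents (lowercased for simplicity).
LEXICON = ["haus", "tür", "schlüssel", "schloss"]

print(partial_completions("haustürsch", LEXICON))
```

Note that the space key plays no role here: the suggestion boundary comes from the lexicon, not from whitespace, so "haustürsch" gets completions for the third constituent even though nothing separates it from the first two.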

Not having a static form might create some redundancy in the lexicon, but that's not more of a problem than the vowel mutation in English "sing", "sang", "sung", "song". Treating different surface realizations of the same underlying base form as independent might actually be beneficial for getting accurate results that take into account how the base form is modified.



