
> I could say that if you’re handling multilingual text, then you should damn well know how multilingual text works, that it’s not peripheral to your problem.

Trouble is, anyone who uses Unicode and accepts user input is effectively handling multilingual text, unless they explicitly filter it out. That includes the vast majority of websites, and even the web-based user interfaces of standalone hardware.

> As far as I know, this is not solvable.

I am sure it is solvable in the sense that it is possible to make the behaviors less surprising and complicated without sacrificing people's ability to use right-to-left languages. There would have to be a discussion about underlying assumptions and real-life usage to achieve that, however.

Generally, though, I don't see a legitimate use for ever reversing left-to-right text when it is displayed to the user. That's not what anyone would expect, not even writers of right-to-left languages. And the myriad malicious uses are kind of obvious. The long-term effect of people abusing these controls will be websites banning more control characters, which will hurt users of Arabic and Hebrew.
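
To make that concrete, here is a rough sketch of a narrower check than a blanket ban. It only looks for the two explicit override characters (U+202D and U+202E) and leaves embeddings, isolates, and directional marks alone. The function names and the "suspicious" rule are my own assumptions for illustration, not anything from a standard, and a real site would need a more careful policy.

```python
import unicodedata

# The two explicit directional overrides. Embeddings, isolates, and the
# LRM/RLM marks are deliberately NOT listed here, since ordinary Arabic and
# Hebrew text can reasonably contain them.
OVERRIDES = {"\u202D", "\u202E"}  # LRO, RLO


def find_overrides(text):
    """Return (index, Unicode name) for every override character in text."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in OVERRIDES
    ]


def has_rtl_letters(text):
    """True if text contains any strongly right-to-left character."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def is_suspicious(text):
    """Hypothetical policy: an override in text with no RTL letters at all."""
    return bool(find_overrides(text)) and not has_rtl_letters(text)


if __name__ == "__main__":
    sample = "invoice\u202Efdp.exe"  # the classic reversed-extension trick
    print(find_overrides(sample))
    print(is_suspicious(sample))
```

Something along these lines would reject the reversed-filename trick while still accepting plain Hebrew or Arabic input, which never needs an override to display correctly.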

Also, with the way Unicode is being developed, it is increasingly unclear what "plain text" even means these days. AFAIK, there isn't even a formal definition of that term. Maybe that's where the discussion should really start: what capabilities separate "plain text" from other things?



