Hacker News new | past | comments | ask | show | jobs | submit login

BMP + 16 planes.

You can blame UTF-16 for this mess. Unicode was originally meant to be able to encode two billions (2^31) characters. It bent over backwards to accommodate the limits of the bastard child that is UTF-16.

https://en.wikipedia.org/wiki/Plane_(Unicode)




Maybe they did it because Windows was backwards and used UCS-2, and later, UTF-16? If somehow, Windows managed to switch to UTF-8, I’m sure they (Microsoft) would mess it up and keep the 4 byte limit (imposed by Unicode) there even if it’s later removed (for backwards compatibility). What Microsoft really needs to do, IMO, is rewrite the Windows API to use UTF-8 or UTF-32. Make a `wwchar` type or something...




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: