Hacker News new | past | comments | ask | show | jobs | submit login

Doesn't Python use UTF-16 internally?

IMHO, UTF-16 is the worst of both worlds. It breaks backwards compatibility in the simple case and wastes storage, but still has to have complex multi-byte decoding because it's not a fixed length encoding.

UTF-8 is probably the best compromise of the lot, with the advantages of UTF-32 being outweighed by the massive overhead in the most common case.




No. Python 2.7 uses UCS-2 or UCS-4 depending on how it was compiled. Python 3 uses ASCII, UCS-2 or UCS-4 determined at runtime per string depending on the string's contents.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: