Doesn't Python use UTF-16 internally? IMHO, UTF-16 is the worst of both worlds. ... | Hacker News

jandrese on June 19, 2016 | on: UTF-8 Everywhere (2012)

Doesn't Python use UTF-16 internally?

IMHO, UTF-16 is the worst of both worlds. It breaks backwards compatibility with ASCII in the simple case and wastes storage, but still has to have complex multi-unit decoding (surrogate pairs) because it's not a fixed-length encoding.

UTF-8 is probably the best compromise of the lot, with the advantages of UTF-32 being outweighed by the massive overhead in the most common case.
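
To put rough numbers on the trade-off described above, here is a small sketch that compares encoded sizes. The helper name `encoded_sizes` and the sample strings are made up for illustration; the little-endian codec variants are used only so that no byte order mark is counted.

```python
def encoded_sizes(text):
    """Return the encoded length in bytes for each encoding (no BOM)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

# Pure ASCII text: UTF-8 is 1 byte per character, UTF-16 doubles it,
# UTF-32 quadruples it.
ascii_sizes = encoded_sizes("hello")

# A code point outside the Basic Multilingual Plane (U+1F600): UTF-16 is
# no longer one 16-bit unit per character, it needs a surrogate pair.
emoji_sizes = encoded_sizes("\U0001F600")
surrogate_pair = "\U0001F600".encode("utf-16-be").hex()

print(ascii_sizes)
print(emoji_sizes)
print(surrogate_pair)
```

For the ASCII sample the sizes are 5, 10 and 20 bytes; for the single emoji all three encodings need 4 bytes, and the UTF-16 form is the surrogate pair D83D DE00.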

Avernar on June 19, 2016

No. Python 2.7 uses UCS-2 or UCS-4 depending on how it was compiled. Python 3.3 and later use Latin-1, UCS-2 or UCS-4, determined at runtime per string depending on the string's contents (pure-ASCII strings are a compact special case of the one-byte form).
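
The per-string storage width can be observed indirectly. The following is a rough sketch that assumes CPython 3.3 or later (other implementations such as PyPy store strings differently); `bytes_per_char` is a hypothetical helper, not part of any library.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character for a string made only of `ch`.

    Subtracting the sizes of two strings of different lengths cancels
    the fixed object header, leaving only the per-character cost.
    """
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n

widths = {
    "a": bytes_per_char("a"),                    # ASCII
    "U+00E9": bytes_per_char("\u00e9"),          # fits in one byte (Latin-1 range)
    "U+20AC": bytes_per_char("\u20ac"),          # needs two bytes
    "U+1F600": bytes_per_char("\U0001F600"),     # needs four bytes
}
print(widths)

# A single wide character widens the whole string, not just itself.
narrow = sys.getsizeof("a" * 1000)
widened = sys.getsizeof("a" * 1000 + "\U0001F600")
print(narrow, widened)
```

On CPython this reports 1, 1, 2 and 4 bytes per character, and appending one emoji to a 1000-character ASCII string grows it by roughly 3000 bytes, because every character is then stored in four bytes.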
