
> Python2 way of dealing with Unicode was the most annoying way possible

Defaulting to ASCII is annoying, yes. Using sys.setdefaultencoding to change the default would still be annoying, because any default encoding is annoying whenever the actual encoding of the data at run time doesn't match it.

The correct way to fix this problem is to not have a default encoding at all. Don't try to auto-detect encodings; don't try to guess encodings. Force every encode and decode operation to explicitly specify an encoding. That way the issue of what the encoding is, how to detect it, etc., is handled in the right place--in the code of the particular application that needs to use Unicode. It should not be handled in a language runtime or a standard library, precisely because there is no way for a language runtime or a library to properly deal with all use cases of all applications.
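
To make "no default encoding" concrete, here is a minimal Python 3 sketch. The helper names decode_text and encode_text are made up for illustration and are not part of any library; the only point is that the encoding is a required keyword argument, so a caller cannot silently fall back to a default.

```python
def decode_text(raw: bytes, *, encoding: str) -> str:
    """Decode bytes to str. The encoding has no default, so omitting it is a TypeError."""
    return raw.decode(encoding, errors="strict")


def encode_text(text: str, *, encoding: str) -> bytes:
    """Encode str to bytes. The encoding has no default, so omitting it is a TypeError."""
    return text.encode(encoding, errors="strict")


# The application, which knows where its data came from, names the encoding.
raw = encode_text("café", encoding="utf-8")
assert decode_text(raw, encoding="utf-8") == "café"

# Naming the wrong encoding is still a bug, but it is a visible one at the call site.
mojibake = decode_text(raw, encoding="latin-1")
assert mojibake != "café"

# Forgetting to choose fails immediately instead of guessing.
try:
    decode_text(raw)
except TypeError:
    print("encoding is required")
```

The design choice is that the failure moves from "wrong text somewhere downstream" to "TypeError at the line that forgot to decide".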

What Python 3 did, instead, was to change the rules of default encodings and auto-detection/guessing of encodings, so that they were nicer to some use cases, and even more annoying than before to others.
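
One example of a default that remains in Python 3, as a rough sketch: a text-mode open() without an encoding argument has historically used a locale-derived encoding, so the same read can behave differently from machine to machine. The file name sample.txt and the sample string below are assumptions for illustration; the sketch only reports what the implicit default would be and does the actual I/O with an explicit encoding.

```python
import locale
import os
import tempfile

text = "naïve café"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name

    # Explicit on the way out: the bytes on disk are known to be UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Binary mode involves no encoding at all.
    with open(path, "rb") as f:
        raw = f.read()

    # Explicit on the way in: the result does not depend on the machine.
    with open(path, encoding="utf-8") as f:
        explicit = f.read()

# What a bare open(path) would typically have fallen back to on this machine.
implicit_default = locale.getpreferredencoding(False)

print("bytes on disk match UTF-8:", raw == text.encode("utf-8"))
print("explicit read round-trips:", explicit == text)
print("implicit default here would be:", implicit_default)
```

The last line is the part that varies between environments, which is why it is printed rather than asserted.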



