
The reason for it was a set of bad decisions made in older versions of Python. The problems were:

- conflating text with bytes: Python had no way to tell whether a given string was text or bytes, because in 2.7 the str type was used for both.

- introducing unicode as a separate type. This essentially made the problem worse: in addition to str mixing text with bytes, there was now an extra type to represent text, and it was optional, so some people used unicode and some didn't.

- the cherry on top was implicit conversion: if you passed a unicode where a str was expected, Python implicitly converted it using the default (ASCII) encoding, and vice versa.

This basically resulted in people writing broken code in Python 2: code that worked fine with US-ASCII but blew up, seemingly at random, as soon as it processed non-ASCII characters.
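To make that failure mode concrete, here is a rough sketch in Python 3 that imitates what Python 2 did behind the scenes when bytes met unicode. The helper py2_style_concat is hypothetical, written only for illustration; Python 2 did the ASCII decode implicitly, here it is spelled out.

```python
# Sketch only: imitates Python 2's implicit bytes -> unicode coercion,
# which decoded the byte string with the default ASCII codec.
def py2_style_concat(text, raw):
    """Hypothetical helper; Python 2 did this decode silently."""
    return text + raw.decode("ascii")

# Pure ASCII input: works, so the bug goes unnoticed in testing.
ascii_result = py2_style_concat("name: ", b"Bob")

# Non-ASCII input (UTF-8 encoded bytes): the same code path blows up.
try:
    py2_style_concat("name: ", "Zo\u00eb".encode("utf-8"))
    non_ascii_failed = False
except UnicodeDecodeError:
    non_ascii_failed = True
```

The point is that nothing in the calling code changes between the two calls, only the data does, which is why these bugs tended to show up in production rather than in tests.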

In Python 3, instead of applying yet another fix, they decided to do it correctly from the start. So str (Guido actually regretted that he didn't call it text) represents text, and bytes represents, well, bytes. Python 3 also does no implicit conversion between the two.
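A small Python 3 sketch of that separation, assuming UTF-8 as the encoding (the choice of encoding is mine, for illustration): the two types are distinct, mixing them raises instead of guessing, and you convert explicitly at the boundary.

```python
text = "caf\u00e9"            # str: a sequence of 4 Unicode code points
data = text.encode("utf-8")   # bytes: the 5-byte UTF-8 encoding of that text

# Mixing the two types is an error; nothing is converted implicitly.
try:
    text + data
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Conversion has to be explicit, with the encoding named by the caller.
round_trip = data.decode("utf-8")
```

Because the TypeError is raised regardless of the content, the mistake shows up on the first run even with plain ASCII data.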

I actually think they could have helped with migration by adding one more import to __future__ in 2.7 that disabled implicit conversion and treated bytes as a type distinct from str. That could have helped people fix their code in 2.7 before migrating it, but it's already too late for that. There is also a hack that disables implicit conversion in Python 2, but it showed that the stdlib relied on it heavily too, so fixing that may not have been worth it.



