Unicode In Python, Completely Demystified (farmdev.com)
10 points by ivankirigin on March 25, 2008 | 1 comment



Punchline: "decode early, unicode everywhere, encode late."
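
To make the punchline concrete, here is a minimal sketch of that flow. It assumes Python 3 (where `str` is already Unicode text and `bytes` is the encoded form) and UTF-8 input; the `shout` helper is a made-up name for illustration, not something from the linked article.

```python
def shout(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: bytes in, bytes out, text in the middle."""
    text = raw.decode(encoding)    # decode early: bytes -> str at the boundary
    text = text.upper()            # unicode everywhere: all logic works on str
    return text.encode(encoding)   # encode late: str -> bytes only on the way out


incoming = "café".encode("utf-8")  # stand-in for bytes read from a file or socket
result = shout(incoming)
print(result.decode("utf-8"))
```

Because the uppercasing happens on decoded text, the accented character is handled as one character rather than as two raw UTF-8 bytes.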

There are actually more caveats to Unicode, especially if you have third-party code doing the decoding and encoding. There are ways to manufacture strings that are broken (not valid Unicode for display, not valid __-encoded ASCII), and if that happens, there's no way to recover inside your app.

I'd love to see a more in-depth treatment of the different decode modes ('replace', 'ignore', and 'strict') and how they can screw up your data.
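
As a rough starting point, the sketch below runs the three modes against the same bad input. It assumes Python 3 and a made-up sample: Latin-1 bytes that get wrongly decoded as UTF-8. The variable names are illustrative only.

```python
# Latin-1 bytes for "café au lait"; the lone 0xE9 byte is not valid UTF-8.
raw = b"caf\xe9 au lait"

# 'strict' (the default) refuses to guess and raises.
try:
    strict_result = raw.decode("utf-8", errors="strict")
except UnicodeDecodeError as exc:
    strict_result = None
    print("strict failed at byte offset", exc.start)

# 'replace' swaps the bad byte for U+FFFD, the replacement character.
replaced = raw.decode("utf-8", errors="replace")

# 'ignore' silently drops the bad byte.
ignored = raw.decode("utf-8", errors="ignore")

print(repr(replaced))
print(repr(ignored))

# Neither lossy mode round-trips: the original byte is gone for good.
print(replaced.encode("utf-8") == raw)
print(ignored.encode("utf-8") == raw)
```

Only 'strict' tells you something went wrong. 'replace' at least leaves a visible marker in the text, while 'ignore' loses data with no trace, which is why the lossy modes are hard to recover from once the result has been stored.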



