
The other side of that is handling case insensitivity in Unicode bug for bug compatible with email providers.



> handling case insensitivity in Unicode bug for bug compatible with email providers.

The official email standards basically say to treat email addresses as a binary format. You aren't even allowed to apply NFC, NFD, NFKC, or any other normalization.

https://github.com/whatwg/html/issues/4562#issuecomment-2096...
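To make the "binary format" point concrete, here is a minimal sketch in Python. The address is made up, and `same_mailbox_opaque` is a hypothetical helper, not something from any standard. It shows that the NFC and NFD spellings of the same visible address are different byte strings, so an opaque comparison treats them as two different mailboxes.

```python
import unicodedata

# Made-up address for illustration; the local part contains "é".
address = "jos\u00e9@example.com"

nfc = unicodedata.normalize("NFC", address)  # "é" as the single code point U+00E9
nfd = unicodedata.normalize("NFD", address)  # "e" followed by combining U+0301


def same_mailbox_opaque(a: str, b: str) -> bool:
    """Compare two addresses as opaque UTF-8 byte strings, with no normalization."""
    return a.encode("utf-8") == b.encode("utf-8")


print(len(nfc), len(nfd))             # 16 17
print(same_mailbox_opaque(nfc, nfd))  # False
```

Both strings render identically in most fonts, which is why a site that stores addresses byte for byte can end up with two accounts that look like the same address.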

Unicode has some recommendations that are slightly better, but they only cover what email providers should allow when registering new addresses, and they still don't suggest case insensitivity.

https://www.unicode.org/reports/tr39/#Email_Security_Profile...

I'm tempted to write an email standard called "Sane Email" that allows providers to opt into unicode normalization, case insensitivity (in a well-defined way), and sane character restrictions (like Unicode's UTS #39).
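As a rough sketch of what an opt-in comparison rule could look like (this is my assumption of one possible definition, not anything taken from an existing standard): normalize with NFKC, apply full Unicode case folding, then normalize again. `sane_key` is a hypothetical name, and a real design would presumably handle the domain with IDNA rather than the same folding.

```python
import unicodedata


def _fold(text: str) -> str:
    """NFKC, then full case folding, then NFKC again (folding can denormalize)."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return unicodedata.normalize("NFKC", folded)


def sane_key(address: str) -> str:
    """Hypothetical comparison key for an address; not defined by any standard."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    # Simplification: the domain gets the same folding instead of IDNA processing.
    return _fold(local) + "@" + _fold(domain)


print(sane_key("Jos\u00e9@Example.COM") == sane_key("jose\u0301@example.com"))  # True
print(sane_key("Stra\u00dfe@example.com") == sane_key("STRASSE@example.com"))    # True
```

The second comparison is the kind of thing that has to be pinned down "in a well-defined way": full case folding maps "ß" to "ss", so two addresses that differ in length end up with the same key.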

Currently the standards allow pretty much _any_ Unicode characters, including unbalanced right-to-left control characters, and possibly even surrogates.
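For a sense of what a stricter provider-side check might catch, here is a simplified sketch. `problems` is a hypothetical helper, and the balance rule is much cruder than the real Unicode Bidirectional Algorithm: it only checks that each embedding, override, or isolate opener is closed by its matching terminator, and it flags surrogate code points.

```python
import unicodedata

EMBED_OPEN = {"\u202a", "\u202b", "\u202d", "\u202e"}  # LRE, RLE, LRO, RLO
EMBED_CLOSE = "\u202c"                                 # PDF
ISOLATE_OPEN = {"\u2066", "\u2067", "\u2068"}          # LRI, RLI, FSI
ISOLATE_CLOSE = "\u2069"                               # PDI


def problems(local_part: str) -> list:
    """Return a list of issues found in a local part (simplified, illustrative check)."""
    found = []
    expected_closers = []  # stack of terminators we still expect to see
    for ch in local_part:
        if unicodedata.category(ch) == "Cs":
            found.append("surrogate")
        elif ch in EMBED_OPEN:
            expected_closers.append(EMBED_CLOSE)
        elif ch in ISOLATE_OPEN:
            expected_closers.append(ISOLATE_CLOSE)
        elif ch in (EMBED_CLOSE, ISOLATE_CLOSE):
            if expected_closers and expected_closers[-1] == ch:
                expected_closers.pop()
            else:
                found.append("unmatched bidi close")
    if expected_closers:
        found.append("unclosed bidi control")
    return found


print(problems("alice"))           # []
print(problems("ab\u202ecd"))      # ['unclosed bidi control']
print(problems("a\u2067b\u2069c")) # []
```

A dangling U+202E (right-to-left override) like the one in the second example can reverse the display order of everything after it, which is one reason an unrestricted character set makes people nervous about showing these addresses in a UI.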

Websites are supposed to store email addresses as opaque binary strings.

I think the overly permissive standards are what's holding back Unicode email addresses.



