The other side of that is handling case insensitivity in Unicode bug for bug com...

collinmanderson · 2024-10-12T00:01:55 1728691315

> handling case insensitivity in Unicode bug for bug compatible with email providers.

The official email standards basically say to treat email addresses as a binary format. You aren't even allowed to do NFC / NFD / NFKC etc normalization.

https://github.com/whatwg/html/issues/4562#issuecomment-2096...
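To make the "binary format" point concrete, here is a minimal Python sketch. The helper name `same_address_opaque` and the address are made up for illustration. It shows that two strings which render identically are different addresses once you compare them as opaque bytes:

```python
import unicodedata

def same_address_opaque(a: str, b: str) -> bool:
    # Hypothetical helper: compare the UTF-8 bytes exactly,
    # with no normalization and no case folding.
    return a.encode("utf-8") == b.encode("utf-8")

# The same visible address in two normalization forms.
nfc = unicodedata.normalize("NFC", "jos\u00e9@example.com")  # precomposed e-acute
nfd = unicodedata.normalize("NFD", "jos\u00e9@example.com")  # 'e' + combining acute

print(same_address_opaque(nfc, nfc))
print(same_address_opaque(nfc, nfd))
```

The second comparison fails because the NFD form carries an extra combining code point, even though both strings look the same on screen.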

Unicode has some standards which are slightly better, but they're only meant for email providers restricting which new email addresses can be registered, and they still don't suggest case-insensitivity.

https://www.unicode.org/reports/tr39/#Email_Security_Profile...

I'm tempted to write an email standard called "Sane Email" that allows providers to opt into Unicode normalization, case insensitivity (in a well-defined way), and sane character restrictions (like Unicode's UTS #39).
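As a rough sketch of what an opted-in provider might do (this is purely my assumption, not taken from any existing standard, and `canonicalize` is a made-up name): normalize the local part to NFC, apply Unicode case folding, then normalize again since folding can produce unnormalized output. Domain handling here is just lowercasing; IDNA is out of scope.

```python
import unicodedata

def canonicalize(address: str) -> str:
    # Hypothetical canonical form for lookup/comparison only.
    # Split on the last "@" so the domain is whatever follows it.
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        raise ValueError("expected local@domain")
    # NFC, then full case folding, then NFC again.
    folded = unicodedata.normalize("NFC", local).casefold()
    local_canon = unicodedata.normalize("NFC", folded)
    return local_canon + "@" + domain.lower()

a = canonicalize("Jos\u00e9@Example.COM")     # precomposed, mixed case
b = canonicalize("jose\u0301@example.com")    # decomposed, lower case
print(a == b)
```

Using `casefold()` rather than `lower()` is the "well-defined" part: it follows Unicode's case folding, so for example "ß" and "SS" end up comparing equal.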

Currently the standards allow for pretty much _any_ Unicode characters, including unbalanced right-to-left control characters, and possibly even surrogates.
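For illustration, a provider-side check for those two cases could look something like this in Python. The function name and the exact rule set are my own assumptions; this is not the UTS #39 profile, just a sketch of the kind of restriction I mean.

```python
import unicodedata

# Bidi classes of the explicit embedding/override/isolate controls
# (e.g. U+202A..U+202E and U+2066..U+2069).
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF",
                        "LRI", "RLI", "FSI", "PDI"}

def find_risky_chars(local_part: str) -> list[tuple[int, str, str]]:
    # Hypothetical check: report (index, code point, reason) for each
    # surrogate or explicit bidi control found in the local part.
    findings = []
    for i, ch in enumerate(local_part):
        code = f"U+{ord(ch):04X}"
        if unicodedata.category(ch) == "Cs":
            findings.append((i, code, "surrogate"))
        elif unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES:
            findings.append((i, code, "bidi control"))
    return findings

print(find_risky_chars("alice"))
print(find_risky_chars("a\u202eb"))   # contains RIGHT-TO-LEFT OVERRIDE
```

A stricter version would also track whether the controls are balanced, but rejecting them outright at registration time is simpler.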

Websites are supposed to store email addresses as opaque binary strings.

I think the overly permissive standards are what is holding back Unicode email addresses.