I thought case-insensitive email addresses could be compared using something like the pseudocode `lower(x)`.

You shouldn't be comparing the mailbox part of email addresses at all other than as literal bytestrings: you cannot know what equivalence rules the mailserver for that domain uses.

The domain part can be equivalence-tested using the normal rules for domains though, including case insensitivity, IDN translation and punycode resolution.
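
A minimal sketch of that split, assuming Python and its built-in `idna` codec (which implements the older IDNA 2003 rules, so treat the result as illustrative rather than authoritative). The helper names `split_address`, `canonical_domain`, and `same_mailbox` are made up for this example:

```python
def split_address(address: str) -> tuple[str, str]:
    """Split on the last '@' so a quoted local part containing '@' survives."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an address: {address!r}")
    return local, domain


def canonical_domain(domain: str) -> str:
    # The "idna" codec turns Unicode labels into their punycode (xn--) form.
    # ASCII labels pass through with their case intact, hence the lower().
    return domain.encode("idna").decode("ascii").lower()


def same_mailbox(a: str, b: str) -> bool:
    local_a, domain_a = split_address(a)
    local_b, domain_b = split_address(b)
    # Local parts: exact comparison only, no case or Unicode folding.
    return (
        local_a == local_b
        and canonical_domain(domain_a) == canonical_domain(domain_b)
    )
```

With this, `Alice@EXAMPLE.com` and `Alice@example.com` compare equal, while `Alice@example.com` and `alice@example.com` do not, because only the receiving server knows whether those two mailboxes are the same.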




> you cannot know what equivalence rules the mailserver for that domain uses.

You're right in general. In my specific post, I do know the equivalence rules, because I'm administering the mail system and working with its source code, and the documentation guarantees/requires that internally its email addresses are treated entirely as case-insensitive.

What I saw were source code comments about using neither `lower(x)` nor the Postgres module `citext`, and instead using Unicode case folding via ICU and Postgres nondeterministic collations. In the end, what surprised me wasn't about email servers in general; it was about how case folding works in human languages.
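
To illustrate why lowercasing and case folding differ, here is a small sketch in Python. It uses `str.casefold()` as a stand-in for Unicode case folding; it is not the ICU or Postgres collation machinery mentioned above, and the function names are invented for the example:

```python
def equal_by_lower(a: str, b: str) -> bool:
    # Naive approach: lowercase both sides and compare.
    return a.lower() == b.lower()


def equal_by_casefold(a: str, b: str) -> bool:
    # Unicode full case folding, which can change string length (ß -> ss).
    return a.casefold() == b.casefold()


# German sharp s: lower() leaves "ß" alone, casefold() expands it to "ss".
print(equal_by_lower("Straße", "STRASSE"))     # False
print(equal_by_casefold("Straße", "STRASSE"))  # True
```

The two strings are the same word in different case, yet only the case-folded comparison treats them as equal.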


> You shouldn't be comparing the mailbox part of email addresses at all other than as literal bytestrings

It's hard to know what to do in practice, but this seems to be wrong according to https://www.rfc-editor.org/rfc/rfc6532#section-3.1: "normalization form NFC SHOULD be used".
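
As a rough illustration of what NFC changes, here is a sketch using Python's standard `unicodedata` module. The sample strings and the `nfc` helper are just for the example; whether to normalize a local part before comparing it is exactly the open question here:

```python
import unicodedata


def nfc(text: str) -> str:
    # Compose base characters and combining marks into precomposed forms.
    return unicodedata.normalize("NFC", text)


composed = "jos\u00e9"      # "é" as one code point, U+00E9
decomposed = "jose\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

# The two render identically but differ as code points (and as bytes).
print(composed == decomposed)            # False
print(nfc(composed) == nfc(decomposed))  # True
```

A literal bytestring comparison would treat these as two different mailboxes, while a comparison after NFC would treat them as one.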



