I'm surprised there's no mention here of punycode and IDNs. Is there just not enough widespread support for IDNs (the article mentions a need for plugins, but I thought most browsers already support IDNs), or are numbers really that much easier to type and remember than short, meaningful domains in native characters?
As I understand, punycode is translated by the browser to a standard ASCII string, so there's no need for special support in the DNS system or other infrastructure other than a simple translation in the browser's URL handling. Seems like a pretty straightforward/simple solution.
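To make the translation step concrete, here's a minimal sketch using Python's built-in `idna` codec (which, as far as I know, implements the older IDNA 2003 rules; browsers may follow newer ones). The helper names `to_ascii` and `to_unicode` and the `bücher.example` hostname are just illustrative, not anything from the article.

```python
def to_ascii(hostname: str) -> str:
    """Convert a Unicode hostname to the ASCII form that is sent to DNS."""
    # The "idna" codec punycode-encodes each non-ASCII label and adds "xn--".
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Convert an ASCII (punycode) hostname back to its Unicode display form."""
    return hostname.encode("ascii").decode("idna")


print(to_ascii("bücher.example"))           # xn--bcher-kva.example
print(to_unicode("xn--bcher-kva.example"))  # bücher.example
```

The DNS side only ever sees the `xn--...` string, which is plain letters, digits, and hyphens.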
While it doesn't mention punycode directly, it has a paragraph referring to IDNs:
Why don’t Chinese web addresses just use Mandarin characters? Because that’s a pain, too. The Internet Corporation for Assigned Names and Numbers (ICANN), which sets the rules for web addresses globally, has periodically hyped the expansion of domain names to include non-Latinate scripts, but Chinese web sites have yet to take full advantage. Some devices require a special plug-in to type in Chinese URLs, and even then it takes longer to type or write out characters than to input a few digits. Plus, for web sites that want to expand internationally but don’t want to alienate foreign audiences with unfamiliar characters, numbers are a decent compromise
Sounds like the core problem is just "Chinese web sites have yet to take full advantage".
I'm really wondering how many (or what percentage of) devices actually need "a special plug-in". Seems like it would be a very small - and shrinking - fraction, but I could be wrong.
If the user's device is (presumably) already set to use Mandarin (or any specific language), then I assume that would be the default input method, so why does that take significantly longer to use than numbers? I'm genuinely curious, never having experienced that use case myself as a native English speaker.
As for websites wanting to expand internationally and avoiding unfamiliar characters, that doesn't really make sense to me - if they want to expand internationally, I don't think the numeric domains are going to help much (they'd be just as cryptic to me as Mandarin characters), so why wouldn't they just register alternate domains in the target regions/languages as many already do?
>they'd be just as cryptic to me as Mandarin characters
Yes, the meaning would be just as cryptic, but the act of typing, and probably memorization, would be much easier. As an exercise, try downloading an input method for a foreign character set and typing out a string that you see rendered in an image.
When typing Chinese characters, each character takes multiple keystrokes. If each character is a numeral, it's one keystroke per character. It's like an acronym.
Interesting. From what I can tell though, the lack of support from those TLDs seems to be deliberate rather than due to any technical challenge (the domain you register ends up just looking like standard alphanumeric ASCII with some dashes). I may be wrong, but it probably requires more effort for a TLD to disallow IDNs as they'd have to implement a test to try and detect a punycode string within the ASCII.
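For what it's worth, the kind of test I have in mind would be tiny. This is only a sketch of how a registry *might* reject IDN registrations, assuming it just looks for the `xn--` prefix on any label; `has_idn_label` is a made-up name, and I don't know how any actual TLD implements this.

```python
ACE_PREFIX = "xn--"  # prefix that marks a punycode-encoded (IDN) label


def has_idn_label(domain: str) -> bool:
    """Return True if any dot-separated label looks like a punycode label."""
    # Domain names are case-insensitive, so compare in lowercase.
    return any(label.lower().startswith(ACE_PREFIX) for label in domain.split("."))


print(has_idn_label("xn--bcher-kva.example"))  # True
print(has_idn_label("example.com"))            # False
```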
The one technical challenge I suppose could be a concern would be optionally implementing some sort of test to detect domain registration attempts that use similar-looking characters to those in existing legitimate domains (e.g., registrations from scammers attempting phishing attacks with domains that look similar to other domains but use different extended characters).
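A rough sketch of one such test, assuming a crude heuristic: flag any label whose letters come from more than one script, where the "script" is guessed from the first word of each character's Unicode name. The function names are hypothetical, and this would miss look-alikes written entirely in one script, so a real registry would presumably need proper confusables data.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Guess the scripts used by the letters in a label (heuristic only)."""
    # e.g. "LATIN SMALL LETTER A" -> "LATIN", "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()  # skip digits and hyphens
    }


def looks_mixed_script(label: str) -> bool:
    """Return True if the label mixes letters from more than one script."""
    return len(scripts_in_label(label)) > 1


spoof = "\u0430pple"  # first letter is CYRILLIC SMALL LETTER A, not Latin "a"
print(looks_mixed_script(spoof))    # True
print(looks_mixed_script("apple"))  # False
```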
Still, it feels like the lack of support from these TLDs is more a matter of policy than a technical challenge.
> The one technical challenge I suppose could be a concern would be optionally implementing some sort of test to detect domain registration attempts that use similar-looking characters to those in existing legitimate domains (e.g., registrations from scammers attempting phishing attacks with domains that look similar to other domains but use different extended characters).
Another thing to mention: for older generations like my father, even if Chinese characters can be used in URLs, they don't know how to input them, because an IME is needed. It's nothing like Americans, who learn to type ASCII natively. And obviously, ASCII characters are printed on most keyboards.