> I know nothing about the implementation of early web browsers/gopher/etc, but I doubt there ever was anything on the web that used ASCII.

All headers, HTTP, email, or otherwise, are 99% or more ASCII. HTML markup is over 99% ASCII for most documents, especially the complex ones.

ASCII is the only text encoding you can guarantee everything on the Web (and the Internet in general, really) knows how to speak. Finally, guess what the UTF-8 encoding of every code point in the range U+0000 to U+007F inclusive is byte-for-byte identical to: ASCII.
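To make the compatibility claim concrete, here is a small Python sketch (my own illustration, not something from any spec text; the sample request line is just an example string). It checks that each code point below U+0080 encodes to the same single byte in UTF-8 and in ASCII, and that bytes outside that range never collide with ASCII.

```python
# Every code point U+0000..U+007F encodes to one byte in UTF-8,
# and that byte is the same one ASCII uses.
for cp in range(0x80):
    ch = chr(cp)
    assert ch.encode("utf-8") == ch.encode("ascii") == bytes([cp])

# So pure-ASCII bytes can be decoded as UTF-8 unchanged.
request_line = "GET /index.html HTTP/1.1"  # example string, chosen for illustration
ascii_bytes = request_line.encode("ascii")
round_tripped = ascii_bytes.decode("utf-8")

# Anything outside ASCII uses only bytes >= 0x80 in UTF-8,
# so it can never be mistaken for an ASCII character.
non_ascii_bytes = "é".encode("utf-8")
high_bytes_only = all(b >= 0x80 for b in non_ascii_bytes)
```

That second property is why ASCII-only software can usually pass UTF-8 through untouched, as long as it is 8-bit clean.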




ASCII is in fact a completely safe text encoding for HTML - and thanks to HTML entities (numeric character references in particular), you do not lose any international character support. You can have a Unicode-using HTML document encoded in ASCII - it's just quite big.
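As a rough sketch of what that looks like in practice, Python's standard `xmlcharrefreplace` error handler turns every non-ASCII character into a numeric character reference, and `html.unescape` reverses it. The sample text is made up for illustration, and this only covers the text content, not markup escaping.

```python
import html

original = "naïve café, 日本語"  # sample text, chosen for illustration

# Encode to pure ASCII; non-ASCII characters become &#NNN; references.
ascii_html = original.encode("ascii", "xmlcharrefreplace")

# Every byte is now in the ASCII range.
is_pure_ascii = all(b < 0x80 for b in ascii_html)

# A browser (or html.unescape) recovers the original text.
decoded = html.unescape(ascii_html.decode("ascii"))

# The cost: the ASCII form is larger than the UTF-8 form.
utf8_size = len(original.encode("utf-8"))
ascii_size = len(ascii_html)
```

For this sample the ASCII form comes out at roughly twice the UTF-8 size, which is the "quite big" part.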


I know that, but "over 99% ASCII" = "not ASCII". For many users, UTF-8 is over 99% ASCII, but it is not ASCII.


> I know that, but "over 99% ASCII" = "not ASCII"

No, that's not what I meant. I meant that all of the essential bits are ASCII, all of the software that generates those important pieces has to know ASCII, and it's entirely possible for software that speaks only ASCII to handle it, as long as the filenames being served (the main source of non-ASCII characters) are also ASCII.

Read the HTTP specification sometime.



