And before that, you get into encodings like EBCDIC, RAD50, SIXBIT, FIELDATA, an...

Someone · on April 3, 2012

"The world we're dealing with now on the Web begins with ASCII"

I know nothing about the implementation of early web browsers/gopher/etc, but I doubt there ever was anything on the web that used ASCII. 7-bit email may have been around at the time, but I would guess Tim Berners-Lee just used whatever character set his system used by default (corrections welcome; being snarky isn't the only reason I write this)

flomo · on April 3, 2012

It was a hotly debated topic whether the www should use 7-bit/mime or not.

derleth · on April 3, 2012

> I know nothing about the implementation of early web browsers/gopher/etc, but I doubt there ever was anything on the web that used ASCII.

All headers, HTTP, email, or otherwise, are 99% or more ASCII. HTML markup is over 99% ASCII for most documents, especially the complex ones.

ASCII is the only text encoding you can guarantee everything on the Web (and the Internet in general, really) knows how to speak. Finally, guess what the UTF-8 encoding of every code point from U+0000 to U+007F inclusive is byte-for-byte identical to: ASCII.
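To make the UTF-8 point concrete, here is a minimal Python sketch (the variable names are just for illustration). It decodes all 128 seven-bit values both ways and checks that nothing changes, then shows one character outside that range taking two bytes.

```python
# Every 7-bit value, 0x00 through 0x7F.
ascii_bytes = bytes(range(0x80))

# Decoding the same bytes as ASCII or as UTF-8 yields the same text.
as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")
same_text = as_ascii == as_utf8

# Going the other way, U+0000..U+007F each encode to one identical byte.
round_trip = as_utf8.encode("utf-8") == ascii_bytes

# Anything past U+007F needs more than one byte, so it is no longer ASCII.
e_acute = "\u00e9".encode("utf-8")  # b'\xc3\xa9'
```

So a pure-ASCII file is already a valid UTF-8 file, but not the other way around.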

TazeTSchnitzel · on April 4, 2012

ASCII in fact is the completely safe text encoding for HTML - and thanks to HTML entities, you do not lose any international character support. You can have a Unicode-using HTML document encoded in ASCII - it's just quite big.
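A rough sketch of what that looks like in Python, assuming numeric character references are an acceptable stand-in for "HTML entities" here. The sample string is made up; `xmlcharrefreplace` is the standard codec error handler that swaps each non-ASCII character for a `&#NNNN;` reference.

```python
import html

# Hypothetical sample: Latin accents plus three CJK characters.
text = "<p>na\u00efve caf\u00e9, \u65e5\u672c\u8a9e</p>"

# Encode to pure ASCII, replacing non-ASCII characters with &#NNNN; references.
ascii_html = text.encode("ascii", errors="xmlcharrefreplace")

# A browser (or html.unescape) turns the references back into the characters.
recovered = html.unescape(ascii_html.decode("ascii"))

# The "quite big" part: compare against the plain UTF-8 encoding.
utf8_size = len(text.encode("utf-8"))
ascii_size = len(ascii_html)
```

Nothing is lost in the round trip, but each reference costs several bytes, which is where the size penalty comes from.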

Someone · on April 4, 2012

I know that, but "over 99% ASCII" = "not ASCII". For many users, UTF-8 is over 99% ASCII, but it is not ASCII.

derleth · on April 4, 2012

> I know that, but "over 99% ASCII" = "not ASCII"

No, that's not what I meant. I meant that all of the essential bits are ASCII, all of the software that generates those important pieces has to know ASCII, and it's entirely possible for software that speaks only ASCII to handle it as long as the filenames (the main source of non-ASCII characters) being served are also ASCII.

Read the HTTP specification sometime.
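As a small illustration of the filename point, here is a sketch in Python. `build_request` is a hypothetical helper, not anything from a real server or from the spec; it encodes the request strictly as ASCII so a non-ASCII path fails loudly. Percent-encoding the path with `urllib.parse.quote` is one common way to keep the request ASCII-only, shown here as an assumption rather than as what any particular software does.

```python
from urllib.parse import quote


def build_request(path: str, host: str) -> bytes:
    """Hypothetical helper: build a minimal GET request as strict ASCII."""
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    # Raises UnicodeEncodeError if any character is outside ASCII.
    return request.encode("ascii")


# An all-ASCII filename goes through untouched.
plain = build_request("/index.html", "example.com")

# A non-ASCII filename cannot be written by ASCII-only software.
try:
    build_request("/caf\u00e9.html", "example.com")
    non_ascii_rejected = False
except UnicodeEncodeError:
    non_ascii_rejected = True

# Percent-encoding the UTF-8 bytes of the path makes it ASCII again.
escaped_path = quote("/caf\u00e9.html")  # '/caf%C3%A9.html'
escaped = build_request(escaped_path, "example.com")
```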