WorldWideWeb: Proposal for a HyperText Project (1990)

russellbeattie · on Nov 12, 2022

(Self promotion, sorry, but relevant.)

I believe that the time has finally come to fulfill TBL's "Phase 2" so I created a prototype a few weeks ago. It's just an HTML editor made to create HTML Documents. Not web sites, not landing pages, not designs, not mockups, not apps, not games, just HTML based rich text documents with links and media you can edit and save locally.

https://www.hypertext.plus

gfodor · on Nov 13, 2022

I’m working on this, but for 3D.

The essential pieces you may be interested in:

- put all UI in shadow dom under the root, so inspector is exactly the html being edited

- use the file system access api to allow local writeback to file system while editing

- when on the web, connect to GitHub or via WebDAV to write back to origin while editing

- collaborate via crdts over webrtc. I wrote a cheap/free serverless signaling system for this: https://github.com/gfodor/p2pcf - you can use y.js to sync the document DOM

DM me on Twitter if you want to Collab - this particular thread is something I’m going to be pushing hard on but not for 2D HTML, but 3D HTML. Spatial 3D documents in HTML is likely going to be a major thing with the arrival of WebXR and spatial computing.

gala8y · on Nov 13, 2022

What do you mean by 3D HTML? That is very interesting. Your page goes to 503 atm.

gfodor · on Nov 13, 2022

Check this out: https://m.youtube.com/watch?v=CnUVCFdpVE4

gala8y · on Nov 13, 2022

MrVandemar · on Nov 13, 2022

A HTML Document standard would be great.

What is stopping you explicitly defining such a standard? Gruber did it with Markdown and that seems to have gone pretty well.

I use HTML for all my notes and writing, and I guess I have my own loosely defined "Standard" in the sense that I use a subset of tags, very light use of CSS, a predefined document structure, and I use the abbreviated model of HTML that doesn't require certain tags to be closed.

russellbeattie · on Nov 13, 2022

I'm working on exactly that right now! My next blog post will be a "proposal" for a standard, which I'm going to also post to the W3C working group forums to see if I can get traction from standards committees/browser engines.

What I'm working on now is the specific Sanitizer API options which explicitly list out the semantic HTML elements which are allowed, as well as the valid attributes. That will make implementing the standard pretty straight forward. However, the API doesn't cover style properties, so I'm wondering how deep I should delve into CSS, or just leave it alone. Not as clear cut as it seems because of @imports and fonts.

There's a bunch of this-or-that questions I'm trying to figure out like how restrictive should the Content Security Policy be? Will it allow external URLs for images/media, only secure sites, or be totally locked down to just data:// urls? What about file:// urls? Should I even use the Sanitizer API? If you open a document and there are extraneous elements, should they be ignored or deleted? An editor which potentially destroys a page isn't ideal. What about iframes? If you want to add a YouTube video, it's done via an iframe, so killing it all together doesn't seem like a great idea. Then again, that's a potential security/privacy issue. Etc. etc.

I'm leaning towards a restrictive, self-contained format which is as risk-free to implement as possible from a security and privacy standpoint. Basically re-creating how secure the .mhtml files are now. However, that's a lot less flexible than allowing external content to be loaded. But from everything I've seen online, any "standard" which isn't by default stupidly safe will be a non-starter from the browser makers perspective, and my ultimate goal is to have the doc format be something that can be natively opened. Even if I forgo that dream, it may just make sense from a practical standpoint - forcing the files to be useful only as documents. But then again... you get the idea.

I'm hoping to come up with a rationale for all the decisions, write as complete a spec as possible before opening it all up to input to avoid bikeshedding as much as possible. We'll see how it ends up!

BlueTemplar · on Nov 12, 2022

https://www.russellbeattie.com/notes/posts/why-is-markdown-p...

> Why don’t we have an HTML-based rich text standard yet?

> I have zero idea. In fact, we seem to be getting farther away from one as Markdown popularity surges.

> The other day I read a blog post by the folks at Mozilla - who are supposed to be the standard bearers of, um, web standards, and was truly shocked that they decided to convert all their documentation to Markdown last year. What?? In fact, they stopped using HTML in order to do it.

Yep, I am baffled by this too. Glad to see that I am not the only one !

I am aware of at least one document editor project that at least tries to have a working round trip conversion to HTML :

https://texmacs.org/tmweb/help/faq.en.html

(But I am afraid that it will never work as well as an editor that treats HTML as the first class document type...)

https://www.russellbeattie.com/notes/posts/the-decades-long-...

> But most importantly, since my goal was to replace Markdown as a simple, easily manageable rich text document format, trying to also tackle the bundling issue was besides the point. Markdown doesn't include files, doesn't care about them and will never have them. There's no reason for an HTML Document standard to deal with encapsulated files either, as essential as it might first seem.

> All that said, I'm sure someone, somewhere might wonder why my Hypertext editor doesn't have some sort of option to save as a zipped file format of some sort, I thought I'd share what I discovered reading up on the topic. Then maybe someone can get the W3C, Mozilla, Google and Apple to finally make up their minds and decide on a standard that we can use in the future. (I won't hold my breath.

Yeah, sigh...

Still, bundling is kind of essential - most documents will want to have at least some multimedia, and you generally do NOT want to have them get split into multiple files !

As you say I still don't understand why Firefox, instead of a separate folder for the non-HTML content, does not "throw a bunch of files into a zip and call it a day?", especially when ePub does that already !

russellbeattie · on Nov 13, 2022

> Still, bundling is kind of essential

Heh. I agree, actually. And even though I wrote all that, it was partly to convince myself to stop thinking about it. It didn't work.

Here's a prototype I finished last night for a secure, self-contained HTML doc format. It has a script at the top that monitors all the media tag additions like images, video and CSS, and then swaps them for base64 encoded data URLs contained at the bottom of the document. I say "secure" because the doc has a super-restrictive CSP at the top of the page which basically disables asset loading.

It's sort of a cross between MIME-HTML and SingleFile, but the end result is a page that's identical the original HTML, except for a bit of JS which I envision the browsers would take care of for .htmd files, and a chunk of JSON at the bottom for storage.

https://www.hypertext.plus/demo/htmldoc-test1.html

(BTW, Thanks for reading all my rantings! I appreciate knowing it's not going into the void!)

BlueTemplar · on Nov 13, 2022

Markdown implementations : "almost, but not quite, entirely unlike a document format"

;)

laerus · on Nov 13, 2022

How is this different from a WYSIWYG HTML editor?

russellbeattie · on Nov 13, 2022

It's basically an HTML word processor. It can create, save and edit .html files as documents. (Plus it has a bunch of other little features listed on the site, but that's the gist of it.) There are lots of WYSIWYG HTML editor libraries available but few, if any, rich-text editing apps that can read and write HTML/CSS files perfectly.

teleforce · on Nov 13, 2022

It's just amazing how far the Internet and its killer applications have come, and for some people it's all happening within one's lifespan.

The 1st gen killer app: Telnet to remotely access time-sharing systems on expensive workstations.

The 2nd gen killer app: Email to substitute snail mails.

The 3rd gen killer app: Web based distributed documents to allow collaboration between researchers and collaborators in different countries.

The 4th gen killer app: Instant messaging and VoIP to allow real-time global communication between anyone connected to the Internet.

The 5th gen killer app: Video conference to allow cloud based virtual meeting and distance learning especially during pandemic.

The 6th gen app: Still brewing but I think it'll be local-first applications to enable seamless collaboration and participation for everyone including those with unreliable intermittent Internet, minimum bandwidth and high delay connection [1].

[1]Local-first software: You own your data, in spite of the cloud

https://www.inkandswitch.com/local-first/

OnlyMortal · on Nov 12, 2022

Looks at SGML renderers of the day

I used to work on a Mac C/68k SGML editor. It was used for data capture of European Patents. ResEdit UI - god help us.

Combine those files with an example “Steven’s” TCP server and a client that renders, you have Web 1.0.

Technically, it was nothing novel - but it was given away for free.

There were many hypertext systems at that time that were proprietary.

dusted · on Nov 12, 2022

> 4 software engineers and a programmer

Interesting, I wonder what the exact distinction was at the time, in that particular context..

Since these days.. Well, my title is that of a sweng but I certainly spend a good amount of time programming.

recuter · on Nov 12, 2022

4 people to debate the merits of LED vs CFL, one to screw it in (probably an incandescent when research money starts running low). Same as today.

gala8y · on Nov 13, 2022

How many people do they need to change a lightbulb at CERN?

> 4 software engineers and a programmer

>> 4 people to debate the merits of LED vs CFL, one to screw it in (probably an incandescent when research money starts running low). Same as today.

z3t4 · on Nov 13, 2022

Early on an engineer wrote the code on a paper and a programmer then entered it into a computer. But as this is in the era of personal computers I'm also interested in the distinction.

ipython · on Nov 12, 2022

Ah. A project that actually changed the world for the better. Breath of fresh air after all the coverage on FTX.

imiric · on Nov 13, 2022

And yet it never completed the planned second phase, "allowing the users to add new material".

I think this was a crucial misstep of the early days, when the main focus was to build the best browser. This made the web entirely consumer focused from the start, which gave way for silos and advertisers to take over.

Imagine how different the modern web would've been if consuming content was as easy as publishing it, from the very beginning. There would've likely been tools built to give users control of the data they share publicly, and we could've ended up with a pull model, where companies _request_ access to user data, instead of there being massive central hubs everyone visits, that profit by tracking users, hoarding their data, and selling it on shady gray markets.

There have been some attempts over the years to do this. Most notably Opera with Unite in 2009, which failed miserably, and more recently the fediverse, which has yet to gain mainstream traction. I don't think that the technology is the problem in either case. It's just that the consumer web has so much momentum now, that there are no supporting services to allow the tech to propagate (e.g. most people still have asymmetrical internet connections, or NAT'd routers, which makes sharing difficult). And the UX needs to be on par with modern web browsers, which is a tall order, given the few decades of head start browsers have had.

I think we're just too late for that original vision to happen, but it's great reading that it was planned from the very beginning.

ipython · on Nov 13, 2022

Au contraire. The early web was chock full of original user content. People with shell access published all sorts of stuff in their Unix home public_html directory. Those who did not had geocities accounts.

Today, it’s easier than ever to publish. Anyone who wants to create content can do so.

imiric · on Nov 13, 2022

> People with shell access published all sorts of stuff in their Unix home public_html directory. Those who did not had geocities accounts.

Exactly my point. Only technical users knew how to publish, and those who didn't had to use 3rd party publishers. Which is still the case today.

What I'm talking about is making publishing web content as easy as consuming it. Publishing without a 3rd party to generate and host the content is still as difficult as ever. The modern web is entirely consumer focused, where large companies have stepped in to fill the gap of facilitating non-technical users to publish content.

Imagine if we had a tool to do that that was as ubiquitous as the web browser is today. From these early design documents, it's clear that was the intention, but for some reason, it never gained traction.

ipython · on Nov 13, 2022

Geocities and other public sites like it filled the void for people who did not have access to a shell account or didn’t have the skills to ftp files to one.

As for html editors, frontpage, HoTMetaL, and eventually even Netscape itself (in version 3) were very popular wysiwyg html editors released in the mid to late 1990s. There is still innovation in this space as evidenced in this very thread! https://news.ycombinator.com/item?id=33577086

No non technical user wants to think about hosting. Especially in the mid 1990s when home internet was exclusively dialup, what value does holding your one phone line hostage 24x7 to host a web server provide you? So yes hosting has been and will always be the playground of the enthusiast and commercial entities.

Self hosting made sense for academics as they had access to free high speed internet, free electricity and free high speed workstations, as well as the technical know how or easy access to those who could help.

imiric · on Nov 13, 2022

> Geocities and other public sites like it filled the void for people who did not have access to a shell account or didn’t have the skills to ftp files to one.

Again, you're echoing my point, while trying to argue against it(?).

WYSIWYG editors came too late, and were woefully insufficient, far from user friendly, and still required technical know-how. A testament to that is the fact they died out, and were replaced by site builders and hosters, which still exist today.

> No non technical user wants to think about hosting.

Right. The fact "hosting" is in the modern vernacular is an indication that this is a failure of early web designers. End users shouldn't have to think about HTML and "hosting" content, just as they don't think about HTML and consuming it today. A web browser is simply a tool they use to get "online", which is only a part of what's actually possible with the web.

> Especially in the mid 1990s when home internet was exclusively dialup

I agree with you there. The infrastructure didn't exist for everyone to run a web server from home. But the web is what drove ISPs and the general move towards broadband. Had there been an early push towards self-hosting as much as there was for consuming, you can be certain that ISPs would've been forced to deliver a service that was equally capable of serving content, and technology would've been created to work around the limitations of dialup. Maybe we would've seen earlier broadband adoption, or at the very least, symmetric connections would've been standardized.

Besides, none of that was relevant at the turn of the century, when broadband was seeing mass adoption. If self-hosting hadn't taken off by then, that could've been the turning point. But by that time, tech giants were starting to be established, and millions of people were getting online to an already established web.

Regardless, there's not much point in discussing _why_ this didn't happen. I'm only lamenting it didn't, and theorizing about what might've been.

retreatguru · on Nov 13, 2022

Potentially, HyperText provides a single user-interface to many large classes of stored information such as reports, notes, data-bases, computer documentation and on-line systems help.

Insightful understatement of the century.

thrown_22 · on Nov 12, 2022

One thing I've noticed more and more is that I've been trained to _not_ click links in a page but search for the same thing in google. Hyperlinks are now the navigation bar inside a website and little else. Checking the last 10 pages I have opened the majority of them from reputable sites either don't have links or don't have useful links.

It's kind of bizarre that the main selling feature of HTML is basically lost today.

czx4f4bd · on Nov 12, 2022

I think you touch on a valid point about changes in the way content is presented online, but I also think you're really underselling just how utterly transformative the Web has been.

Like, you say you don't click links because you use Google instead... but Google search returns links. It could not even function without links. Keyword search is even one of the listed goals in the proposal, because prior to the Web, there was no singular place to find and retrieve information like that:

> At CERN, a variety of data is already available: reports, experiment data, personnel data, electronic mail address lists, computer documentation, experiment documentation, and many other sets of data are spinning around on computer discs continuously. It is however impossible to "jump" from one set to another in an automatic way [...] Usually, you will have to use a different lookup-method on a different computer with a different user interface. Once you have located information, it is hard to keep a link to it or to make a private note about it that you will later be able to find quickly.

That's how utterly different the world was without the Web. You couldn't open ten pages from different sources in one application and then Google for something in a new tab. Every single way of accessing information would've been its own distinct application, without much overlap or interoperability between them. Any attempt to build a search engine like Google or a content aggregator like HN would've been stymied by the sheer variety of formats and standards for presenting information.

a9h74j · on Nov 13, 2022

> At CERN, a variety of data is already available

This proposal was more CERN-focussed than I had imagined. That said, my search came up negative on each of 'math', 'formula' and 'equation'.

Was this a strategic omission to make the project more tractable? I would not be surprised had the proposal been directed to a non-science community, but at CERN?

czx4f4bd · on Nov 13, 2022

I think it was just too early in the project for that. Images were still considered a nice-to-have and (as far as I can tell) HTML hadn't even been proposed yet. It mentions the potential for a markup language, but it also says that they don't want to force users to adopt any particular markup language and mentions word processed documents as a potential node format. Perhaps the assumption was that specialized formatting requirements would be handled by specialized document formats, with markup pages acting as glue for listing and navigating between those formats (similar to how most browsers now have built-in PDF readers).

foobarbecue · on Nov 12, 2022

I keep reading this but can't quite parse it. What do you mean by "Hyperlinks are now the navigation bar inside a website and little else"? Do you mean "URLs are shown in browser navigation bars but people do not click hyperlinks in websites"?

sirmarksalot · on Nov 12, 2022

Meaning that the main viable use case for them now is intra-site navigation, i.e. the menu bar at the top of the page.

blowski · on Nov 12, 2022

So the observation here is that Tim Berners Lee's original vision was all these documents that would be linking to each other, across different hosts. But today's websites are mostly silos that only link inwards. Have I got that right?

jbverschoor · on Nov 13, 2022

In the paar online text would be filled with more links and references. Wikipedia is still like that.

The normal use case are navigation, and call to action.

The web has become a kiosk with virtual magazines screaming for attention Commented with an application distribution platform (html/js)

We used to have applications and protocols. Now we have something similar to a mainframe, but the client app is distributed over and over again.

Forum? NNTP

Chat? IRC

Dole transfer? FTP

Status updates? Finger

candiddevmike · on Nov 12, 2022

IMO Google has been slowly condensing/removing the address bar in Chrome so people think Google Search == web, and I think your anecdote shows that it's working.

layer8 · on Nov 12, 2022

> I've been trained to _not_ click links in a page

You don’t click links on the Google results page?

kzrdude · on Nov 13, 2022

I wish google had clean links on their result page, so I could right-click and copy URL cleanly. But no, it is a tracker link

layer8 · on Nov 13, 2022

Yeah, that’s annoying, but there’s browser extensions like ClearURLs to care of that.

kzrdude · on Nov 13, 2022

Thanks that's a good tip, seems to be a good addon for firefox. Annoyingly looks like that addon also has some scope creep (default domain blocking, why, I have ad blockers for that?)

thrown_22 · on Nov 13, 2022

Another part of the training I'm talking about. You need an add-on to get basic functionality from the 90s back.

flomo · on Nov 13, 2022

Replying to say that I just noticed this today, and decided it was a negative tendency. I was reading a history article on The Athletic and was wondering about some things. Only after mulling a bit, I noticed they were links to some good youtube videos.

julienreszka · on Nov 12, 2022

What?

pbreit · on Nov 12, 2022

Also the primary input to Google PageRank.

thrown_22 · on Nov 12, 2022

Page rank hasn't been used as the google algorithm for over 15 years now.

zozbot234 · on Nov 12, 2022

What's this Hyper-Text stuff all about? Is this supposed to be about some weird decentralized version of Obsidian and other knowledge management apps? It all seems really hacky and clunky, because of this silly one-size-fits-all and decentralization stuff they keep harping about. Though the proposals at the end about providing an automated 'view' over existing databases are intriguing.

quesera · on Nov 12, 2022

As reviewed on USENET at the time:

> No federation. Less readers than alt.swedish.chef.bork.bork.bork. Lame.