Does anyone think it would make sense to create a drastically simpler set of web standards, so that making web browsers would become much simpler?
Such a simpler web spec would be relatively fast moving, not focused on backwards compatibility, but instead on simplicity of implementation. HTML would have to be written correctly (eg. balanced tags), old styling mechanisms would be removed so that layout engines wouldn't have to accommodate them. Everything would be pared down.
I believe this would open the playing field for many people to create browsers, and would breathe life into the now basically empty browser space and the Web in general.
Of course adoption would be a big issue, but that's always a big issue. I wonder why this wouldn't make sense to try, given the current state of affairs. It doesn't make sense to just give up on the Web. Why not re-invent it a little?
> a drastically simpler set of web standards, so that making web browsers would become much simpler
Yes, yes, yes!
> would be relatively fast moving, not focused on backwards compatibility
No, no, no! Constant churn is precisely the problem with the web today, as it is what creates all that complexity and bloat. What you really need is a simple and stable set of standards, ideally something that won't change in decades (somewhat like how ASCII has been) so that any implementors don't have to engage in mindless trendchasing.
In fact, we already have a simpler set of web standards. It's called HTML4 and CSS2. Browsers like Dillo and NetSurf handle them well, and sites like HN and Craigslist are examples of what the resulting format is like.
Unfortunately, HTML4 and CSS2 are severely underspecified, so actually implementing them interoperably without reverse-engineering is impossible. Oh, and some places where they _are_ clearly specified that specification is more or less broken. For example, implementing comment parsing per the letter of the HTML4 spec is extremely not-web-compatible, and I doubt that either Dillo or NetSurf do it...
Now if you know which things to avoid (e.g. never put "--" inside your comments) and don't care about "pixel-perfect" rendering or any sort of interesting layout, HTML4 and CSS2 are not terrible. But if you care about any of that, watch out for dragons.
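To make the "--" hazard concrete, here is a rough sketch in Python. It assumes a simplified model of the SGML rule (every "--" toggles comment state, and ">" only ends the declaration outside a comment) and compares it with the "stop at the first -->" behaviour pages actually rely on. Both function names are made up for illustration, and neither is a full tokenizer.

```python
def html5_style_comment_end(text, start=0):
    """Index just past the first '-->' following an opening '<!--', else None."""
    if not text.startswith("<!--", start):
        return None
    close = text.find("-->", start + 4)
    return None if close == -1 else close + 3


def sgml_style_comment_end(text, start=0):
    """Simplified SGML reading: '--' toggles comment state, '>' ends the
    declaration only while we are outside a comment. Returns None if the
    declaration never terminates."""
    if not text.startswith("<!--", start):
        return None
    inside = False
    i = start + 2  # position of the first "--"
    while i < len(text):
        if text.startswith("--", i):
            inside = not inside
            i += 2
        elif text[i] == ">" and not inside:
            return i + 1
        else:
            i += 1
    return None
```

On a well-formed comment the two agree. On something like `<!-- a -- b --><p>visible?</p>` the stray "--" flips the parity, so the SGML-style reading no longer sees the first "-->" as the end and can swallow the rest of the page.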
And before someone brings up "tables" for "interesting layout": table layout is unspecified. In CSS2, and CSS3 for that matter. Not only is it unspecified, it's not entirely interoperable across browsers even now, after literally decades of reverse-engineering each other. And for extra fun, WebKit/Blink's implementation is definitely not interoperable with the IE (Trident) implementation most table-based layouts targeted... As one example, changing the order of rows in the table can change the column widths in Blink but not in Trident.
Anyway, if one wanted to start with HTML4 and CSS2, one _could_ try to turn them into proper standards that can be interoperably implemented. It would take quite a lot of effort to do that, I suspect. 50 person-years is my initial guess, but there are a _lot_ of unknowns involved and a lot would depend on how much of the HTML5 and CSS-post-2 work that defined things rigorously could be leveraged.
> and don't care about "pixel-perfect" rendering or any sort of interesting layout
A common theme in all these "reinvent the web/browser" discussions is going back to the web as a hyperlinked document library and not an application platform, in which case pixel-perfect rendering is neither necessary nor even a goal.
> For example, implementing comment parsing per the letter of the HTML4 spec is extremely not-web-compatible
HTML5 parsing is completely specified and definitely compatible, even the error cases. Any stream of bytes will turn into a DOM. (Philosophical question: are they even errors anymore, if all implementations will produce the same output?) Perhaps that would be a good starting point.
> A common theme in all these "reinvent the web/browser" discussions is going back to the web as a hyperlinked document library and not an application platform, in which case pixel-perfect rendering is neither necessary nor even a goal.
And how exactly would one even put this genie back in the lamp?
Which Markdown? There are so many to choose from....
In seriousness: Most Markdowns are 1) fairly similar and 2) sufficient for the vast majority of documents. If the goal is a stable, mature, and complete markup language, I'd be inclined to give LaTeX top billing. Markdown can of course generate LaTeX.
(La)TeX is a bad fit because its document model is based on paginated documents with a fixed page size, whereas HTML documents are intended for variable viewport size. LaTeX is to HTML as PDF is to EPUB.
LaTeX can, whether through the old model of dvi, or modern tools such as xelatex and pandoc, directly produce numerous document formats or "endpoints" as I consider them, including HTML, ePub, plain ASCII (or UTF-8) text, or paginated formats including ps, PDF, and djvu. LaTeX is not itself fundamentally print-oriented. The fact that it can and does produce excellent print-formatted output is a feature, not a bug.
What it is, and pointedly in ways that HTML lacks, is capable of intrinsically handling document-centric (not merely "print") elements including footnotes, endnotes, and formulae, all of which still require kludges after over a quarter century on the Web.
Markdown itself does not address several typographic or document conventions, including formulae, but also odd omissions such as underline and coloured text. Whether those get shimmed into Markdown, or an alternate (light- or heavy-weight) markup language is adopted, isn't clear, but those are very annoying lapses.
For the vast majority of documents, this does not matter. Most online content, say news media, use little more than paragraph, italic, and anchor elements. Even bold and list are rarely used. Authoring in Markdown should be almost wholly sufficient, but it's (La)TeX which has sufficient richness of expression to serve as the common underlying document format language.
Late edit: It also occurs to me that another principal angle of attack on HTML alternatives, raised elsewhere in this thread, is that these cannot guarantee pixel-perfect presentation. That results in rather a "damned if you do, damned if you don't" situation: proposed markup alternatives either cannot guarantee layout or over-guarantee layout. These objections rather want for consistency.
And how would one exactly get users to actually go with less featured browsers that only show hyperlinked documents, rather than sticking to the jack-of-all-trades browsers that they are using today?
> HTML5 parsing is completely specified and definitely compatible, even the error cases.
Counter argument: then why do conditional comments behave differently in each browser engine?
I am not talking about trident, but about CSS hacks for presto, gecko, webkit and blink as well.
If every browser would render as specified, we wouldn't have that outcome.
As developers test on webkit/blink primarily, chances are very likely things will not behave the same in other engines, and if blink violates the spec then everybody will also have to violate the spec.
The internet is built on such bad standards that you cannot even rely on HTTP to work correctly. 206 partial content headers behave differently among all web servers and proxies, and even nginx violates the spec there when it comes to multiple ranges, let alone chunked transfer encoding support.
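For reference, this is roughly what resolving a multi-range request involves. It's a minimal sketch that assumes the byte-range rules from the HTTP range-requests spec (inclusive offsets, "N-" meaning "to the end", "-N" meaning "the last N bytes", ends clamped to the resource size). The function name is invented here, and real servers also have to decide about overlapping ranges, ordering, and when to ignore the header altogether, which is where they tend to diverge.

```python
def _is_number(s):
    # Only plain ASCII digits count; an empty string is not a number.
    return s.isascii() and s.isdigit()


def parse_byte_ranges(header, length):
    """Resolve a Range header value against a resource of `length` bytes.

    Returns a list of inclusive (first, last) pairs. An empty list means no
    range was satisfiable; None means the header is not a valid bytes range.
    """
    unit, sep, spec = header.partition("=")
    if not sep or unit.strip().lower() != "bytes":
        return None
    resolved = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        first, dash, last = part.partition("-")
        if not dash:
            return None
        if first == "":
            # Suffix form "-N": the final N bytes of the resource.
            if not _is_number(last):
                return None
            n = int(last)
            if n == 0 or length == 0:
                continue  # unsatisfiable
            resolved.append((max(length - n, 0), length - 1))
        else:
            if not _is_number(first) or (last and not _is_number(last)):
                return None
            start = int(first)
            end = int(last) if last else length - 1
            if last and end < start:
                return None
            if start >= length:
                continue  # unsatisfiable
            resolved.append((start, min(end, length - 1)))
    return resolved
```

A server that returns more than one pair from something like this is expected to answer with a multipart 206 body, and that is the part the comment above says is handled inconsistently.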
Regarding your philosophical question: a markup language that accepts any byte sequence is clearly useless and a travesty of the concept of markup languages as an authoring format.
A little fact I wanted to add here: tables are unspecified when it comes to their display model. All display models have been changed to reflect the flow model (e.g. display: inline-block means display: inline flow-root behind the scenes).
But the funny thing is: they forgot to specify display: table and everything in it.
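To illustrate the "behind the scenes" mapping, here is a small lookup table, assuming the two-value display syntax from CSS Display Level 3 as I understand it. The dictionary and function names are made up for this sketch. Note that "table" does get an outer/inner pair; what the pair does not give you is a definition of the table layout algorithm itself.

```python
# Legacy single-keyword display values and their two-value
# (outer display type, inner display type) equivalents.
LEGACY_TO_TWO_VALUE = {
    "block": "block flow",
    "inline": "inline flow",
    "flow-root": "block flow-root",
    "inline-block": "inline flow-root",
    "list-item": "block flow list-item",
    "flex": "block flex",
    "inline-flex": "inline flex",
    "grid": "block grid",
    "inline-grid": "inline grid",
    "table": "block table",
    "inline-table": "inline table",
}


def expand_display(value):
    """Return the two-value form of a display keyword.

    Keywords with no two-value equivalent (for example "none" or the
    table-internal values such as "table-cell") are returned unchanged.
    """
    return LEGACY_TO_TWO_VALUE.get(value.strip().lower(), value)
```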
If you're interested in all the values that are only buried down somewhere in the specs, I'm building a CSS parser [1] that probably will never be completed.
HTML4 is specified as an SGML vocabulary, so I don't see the problem with parsing it, especially if you leave out the script element, which introduced irregularities. Yes, SGML was seen as complex in 1996, but it's relatively sane compared to the 2020s web stack. Developing a core SGML parser (with mandatory automaton construction and tag inference) can be done in about 0.5 man-years. And developing a CSS 2 renderer should be possible in less than 49.5 years.
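For anyone unfamiliar with tag inference, here is a toy version in Python. It is only a sketch: a real SGML parser derives the rules from the DTD's content models, whereas the CLOSED_BY table below is a hand-written, hypothetical stand-in covering just "p" and "li". It also drops attributes and ignores void elements like "br".

```python
from html.parser import HTMLParser

# Hypothetical stand-in for DTD-derived rules: an open element named by the
# key is implicitly closed when any start tag in its value set appears.
CLOSED_BY = {
    "p": {"p", "ul", "ol", "table", "h1", "h2"},
    "li": {"li"},
}


class TagInferrer(HTMLParser):
    """Re-emit markup with omitted end tags filled in."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of currently open elements
        self.pieces = []  # normalized output fragments

    def _close_top(self):
        self.pieces.append(f"</{self.stack.pop()}>")

    def handle_starttag(self, tag, attrs):
        # Close any open elements that this start tag implicitly ends.
        while self.stack and tag in CLOSED_BY.get(self.stack[-1], ()):
            self._close_top()
        self.stack.append(tag)
        self.pieces.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: ignore it
        # Close everything opened inside the element, then the element itself.
        while self.stack[-1] != tag:
            self._close_top()
        self._close_top()

    def handle_data(self, data):
        self.pieces.append(data)


def infer_tags(markup):
    parser = TagInferrer()
    parser.feed(markup)
    parser.close()  # flushes any buffered trailing text
    while parser.stack:
        parser._close_top()
    return "".join(parser.pieces)
```

Feeding it `<ul><li>a<li>b</ul><p>one<p>two` should give back the same content with every "li" and "p" explicitly closed.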
Which of the few dozen or so ASCIIs do you mean? Probably US-ASCII, right? I.e. a thing so limited that it only supports English and not even most of the other languages written in the Latin script.
On the other hand, if we're talking about plain text, it hasn't been that stable at all for many years, and has only somewhat stabilized now with Unicode + UTF-8 (and UTF-16 for legacy reasons). And even now we still frequently get Unicode updates.
The idea that US-ASCII is enough/usable/acceptable for anything interfacing with users is IMHO a bubble limited to a (small?) part of IT people from certain native English-speaking countries.
> In fact, we already have a simpler set of web standards. It's called HTML4 and CSS2.
CSS2 isn't simple. It's also fundamentally unsuited for web applications, which IMHO was also still true for CSS3 until recently (CSS grid). I mean, think about how frameworks (e.g. Bootstrap) for years did all kinds of tricks to emulate CSS-grid-like features.
Also HTML5 tags like header/footer/article etc. are a must have IMHO and something like custom elements for better composition and reuse are a must have, too.
The problem with the current complexity of the web lies, in my opinion, in the combination of how all kinds of features were bolted on top of a foundation which wasn't designed for the given use cases, and many of these features being over-engineered.
So I believe such an approach needs to fundamentally revamp or replace both the DOM API and CSS.
(But honestly, tables suck big time. I wrote a table-based hobby website back in 2009 or so and it wasn't nice (the experience, not the website, which was quite fine). Sure, basing a GUI on a grid is the best thing to do in many cases, but tables are not grids. Grids are more flexible.)
I’m not defending table-based layout, but in fairness different device sizes were less of an issue back then because almost all browsing was done on a PC or Mac, and thus dynamic layout wasn't even something you needed to consider.
> different device sizes were less of an issue back then because almost all browsing was done on a PC or Mac, and thus dynamic layout wasn't even something you needed to consider.
It saddens me when Web sites assume they are in a full-screen window :(
Impressive that you remembered having two sets of nested parentheses going by the end of your comment and closed them correctly. Perhaps that's part of the problem...
I never really understood what people have against float. I think it works fine for most use cases and is not too difficult.
Now, table layouts were quite a pain because they got very complex very fast. Flexbox and Grid are fine I guess, but I always found them a bit harder to understand than float and did not find much in what they offered that I needed.
This is true, <table> elements are for tabular data. But look at almost any web layout, and you will notice things are in a table layout most of the time. Even flexbox conceptualizes the flow of children as flex-direction: row/column. I think tabular concepts like rows and columns just make sense to humans making websites, and our 2D x/y axis conditioning.
The real issue with using <table> is semantics, breaking DOM flow (which sometimes creates issues for screen readers), and separation of concerns wrt data and style like you mentioned. Also, <table>s are hard to style over, like wtf is display: table-cell? Nobody seems to know.
But the number of times I see a colleague or fellow frontend cretin re-creating a tabular interface with a bunch of <div class="row"> etc... or wondering how to dynamically size the nested columns to fit the largest cell, I remind them: just use a table. Please.
You might notice that Hacker News layout uses <table>.
All I can say is whatever, man. The semantic web and all the RDF tuple goodness we were supposed to get is mostly a dead dream. Make whatever works for your users. Accept that things aren't going to be pure and perfect. If tables get me to where I need to be, then that's what I'll use. Worked 20 years ago, works now, will work 20 years from now too.
Bring back DSSSL. After all, what language has a better ratio of fundamental simplicity to expressive power than Scheme? Much of the emergent complexity of styles is due to CSS selectors being intentionally non-Turing-complete.
Ah, that’s meant as a qualitative figure of speech, not a measurable objective function. Intended to express that Scheme is a simple language that remains apparently simple even when you build tremendously powerful constructs within it and upon it.
In this case by contrast with CSS where no matter how much sophistication one tries to introduce into the styles its capabilities seem to be horizontally asymptotic or very substantially sublinear.
I reach for the word ratio because the concept (of two parameters whose magnitude varies and that would be meaningful in some relation) is a friend, and the word “better” is a writing trick to avoid having to define exactly what those are whilst still expressing the sentiment.
> I reach for the word ratio because the concept (of two parameters whose magnitude varies and that would be meaningful in some relation) is a friend, and the word “better” is a writing trick to avoid having to define exactly what those are whilst still expressing the sentiment.
The use of ratio is entirely appropriate for conveying this, even in non-technical settings (and such use is pretty common). But to my ear, there's a definite implication that the relationship involves the two things typically moving in the same direction (and roughly linearly) such that it's notable when they've moved in opposite directions. (It's still very much qualitative - we're not really going to be able to assign meaningful numbers). Had you spoken of complexity/power, I wouldn't have noticed anything unusual.
Even so, I only really commented because I was enjoying playing with the idea of a simplicity/power ratio being somehow informative.
Wouldn't the first stable release of HTML count as "a drastically simpler set of web standards" together with the fact that any browser should support it?
Everything else is to me a matter of compression. Netflix's success seemed to me 50% good marketing and content and 50% compression and data handling.
It is wrong to think that a webpage has some sort of canonical view on it. And FB, Twitter and even Forbes are all about generating the impression of a group perspective being one thing, as opposed to being views on one, perhaps elusive, thing.
HTML4 and CSS2 are drastically underspecified. Even for sites where that's a sufficient feature set, you need to reverse-engineer browsers to figure out error handling and such.
The challenge with standards is feature creep. Example: amd64
One day Intel goes and implements instructions which support video decoding. And then Microsoft takes advantage of them, and now users have a better experience. Now open source compilers have to implement them and AMD has to implement them and the cost of competing with the giants goes up.
Same thing with browser DRM. Either Firefox implements it, or Netflix tells all of their users that chrome is a requirement.
For one, it would've helped if Firefox didn't drop the ball in the late-00s, when Chrome became the de facto best browser around. Google + Chrome did more bad for the web than Microsoft + IE ever dreamt of doing. Having the very same companies that make browsers vested in certain web features is a big no-no, but that cat's already out of the bag.
> Google + Chrome did more bad for the web than Microsoft + IE ever dreamt of doing.
I'm not sure that's really true. Do you remember the bad old IE days? Half the web was simply broken on non-Microsoft operating systems, because Microsoft refused to follow standards.
They built proprietary extensions to the web, which the rest of the world - and particularly the open source world - had to reverse engineer and spend ridiculous amounts of effort to implement. And despite all that effort, we often never managed to do it - cough _ActiveX_ cough.
The situation back then was much, much worse than today in terms of open standards and cross-browser (let alone cross-OS) compatibility. And that was pretty much Microsoft's fault.
Google and Chrome have lots of problems. But they build on top of Chromium, which is completely open source. It does not have 100% of the features of Chrome, but from a practical perspective that seems to be mostly a non-issue, at least for me.
"They built proprietary extensions to the web, which the rest of the world - and particularly the open source world - had to reverse engineer and spend ridiculous amounts of effort to implement."
One example is XHR, which is probably the single defining feature of the modern web.
Good standards simply codify whatever "proprietary" features we've found are useful.
Microsoft broke the web in more spectacular ways than Google. Even IE6 itself didn’t always work properly.
But Google has done more lasting damage I think. The effects are more subtle, things work. ...The way Google wants. We may never know what was crushed or lost by submitting to Google’s will. So the harm will be less visible.
It's crazy how much dev mindshare Google captured with Chrome and V8 to the point where WebKit and JavaScriptCore are almost forgotten despite Safari making huge progress in JS performance before Chrome was even released.
IIRC, Netflix used Silverlight until DRM in the browser was good enough for their purposes. If it weren’t for web standards, Flash and Silverlight would probably still be in use, and any browser that deprecated plug-ins would get the “this site doesn’t work here” message.
Honestly, and I know this isn't a popular opinion, but I'm okay-ish with DRM (Netflix has to prove, legally speaking, they're at least trying to protect IP). What I'm not OK with is the current Google AMP-page/hiding-URL shenanigans, Chrome limiting/banning ad-blocking plugins, etc. These are all very clearly Google-centric "features."
DRM is just security theater that makes us all culturally poorer. Give me the name of a movie on Netflix or any of the other mainstream streaming services, and I'll find you a high-quality copy of it, often with subtitles in many different languages, for the cost of a usenet subscription (you don't even need to worry about a DMCA notice from your ISP for torrenting). DRM is not stopping anything. The studios and streaming services have made it just convenient enough to pay, with nice UI/UX on top.
The reason people pay is the convenience! The DRM isn't required for that. If they dropped all DRM tomorrow, they would lose far fewer subscriber dollars than they spend on DRM implementations, license servers, key management, etc. They're just greedy, want control, and want to keep it illegal to break out of that control.
Meanwhile, any media "purchases" you make are gone in a blink if the company you bought them from goes out of business or just decides they don't feel like offering the service anymore. The funny thing is that most of the buy-to-"own" (where you don't actually own it) prices are similar to the cost of a DVD or blu-ray disc. Ditto for Kindle books and their paperback counterparts. All the promises of digital distribution giving consumers lower prices were predictable lies.
I despise DRM as much as the next guy, but its existence makes perfect sense. The moment someone explains to a C level at a studio how easily the content could "leak" from a site without it, it's out of the question to not have it. As the parent poster implied, yes it's security theatre to a large extent, but it does block the very easiest forms of content sharing (i.e. just sharing the video manifest or simply saving segments).
Regarding purchases and licensing, those are still stuck in the same business model and pricing as in the analogue days, this is true. But saying overall consumer prices are the same is ridiculous, since all-you-can-eat content subscriptions is a massive transition with no comparable offerings in the analogue past.
If you want to culturally enrich people, there has never been a better time to consume huge amounts of quality stuff for barely any money.
I think feature creep is the #2 issue after adoption, and I think the solution is to adhere to a strong set of priorities, namely: simplicity.
Take your example. In 2020 there are already many ways that video can be decoded simply, efficiently and with excellent quality. We don't need to accept marginal user experience improvements at the cost of simplicity. So we don't accept changes to the standards that come at the cost of implementation complexity.
It's about having a different mantra. Instead of an emphasis on backwards compatibility and bleeding edge user experience, there's an emphasis on a democracy by simplicity.
Right after commenting I realized there's an issue #3, or perhaps #1:
Most popular sites won't adopt the simpler web even if they can afford to do two versions of their site, and even if users like it, because the lighter web will almost certainly be worse for ads and tracking.
If you like CNN's, also check out NPR's and CSM's pages! VoA also has a Gopher mirror and an RSS feed with actual content instead of just a link. I'm still looking for more text only sites like these, but so far these 4 cover general news pretty well.
Maybe the right approach (feature creep beware) would be to bundle this lite web standard tightly with a pay per view API or some other monetary distribution scheme for web content.
Each of the two standards might not receive enough attention on its own, but they complement each other: getting rid of the necessity of ads means going lite is an option for website owners, and the lite web will allow for many different lite web clients (that might finance themselves through in-app ads shown to the user). In combination these two might overcome some threshold and gain traction!
It was never terribly popular, because who wants to make two sites? Really, that plagues all such plans; everyone wants to build the fanciest, most "modern" site possible, and not do it again for a more constrained version. That's why few sites offer a JS-less version. Only AMP has made some headway, with the weight of the Google hegemon behind it.
The obvious target is apps, develop a simplified layout/styling model for properties which can be scoped within a component, and build a layout engine which can calculate the layout within those components independently, and then you could radically improve the performance of web UI.
The limiting factor on making a new browser isn’t specs, it’s existing websites. Very few people want to use a browser that doesn’t work with existing websites, which actually use all the existing complexity.
This is true for my daily website driver, however there are a lot of situations we might want to use browsers or browser like things where this doesn't have to be the case.
One natural place is for packaging apps. I have a lot of modern web apps packaged so they run as if they were separate applications rather than my main browser. A browser that was focussed on doing this well would still be very useful even if it only worked with the most up to date sites.
Lots of game UIs use internal browsers, which again could be another niche.
And, especially for languages with limited UI capability, a good embeddable browser could provide a decent way to build UIs. Again a niche where backwards compatibility is not so important.
I'd be pretty happy to make a split where I use one browser for document consumption and a totally different one for applications, and perhaps yet another for really old school sites. In an ideal world they could launch each other for the appropriate sites.
There are browser engines made for niche uses like you describe, where they are intended to serve custom content only, and not real web content. Doesn't seem like they need a different set of standards to be built, though.
I'm thinking the opposite: lose floats. I definitely want to keep flex and grid.
A team at FB implemented flex to power React Native. I'm thinking that effort was made much easier given that they didn't have to account for floats, etc.
If you're writing a document (not an app) and using floats for their intended purpose, they are useful - and can't easily be replicated with grids or flexbox, I think.
Yeah that's a good point. I've always thought pages should, then, specify if they're a document or an app. If you're a doc, you use all the various semantic tags. If you're an app, just use div/span. Maybe the spec could have something like that, app mode and doc mode.
I would be OK to get rid of all CSS and other styling, and make it up to the user preferences instead how big a <h1> text is, how big a <h2> text is, how big the normal text is, what colours to use, what fonts are in use, etc. Make a more user-oriented specification, designed primarily for the user to control, assuming the user is an expert at it, rather than the author of the document.
> assuming the user is an expert at it, rather than the author of the document
But the user is NOT an expert at styling a website, this is why we have designers to figure out which combination of layout, fonts, and colors work well together.
Users generally don't give a shit about styling as much as UX designers justifying their salaries do. There are probably more people who suffer daily from poor accessibility on the web than there are UX designers in total.
Well, they will make styles that are slow and/or that many users do not like. Instead, let it be subject to the user configuration; the web browser designer can put in suitable default settings.
Some browsers have (had) user stylesheets, which can do exactly that.
Firefox (barely) still has that option but it's disabled by default, Chrome removed it many versions ago, and Edge never had it to begin with. IE11 still has the option.
I think it may be helpful to have some privileged commands which are only allowed in user stylesheets.
(Additional unprivileged commands may also be useful, such as ability to specify colours by index number, and ability to specify the names of user configuration values where other values are expected.)
This is something that I had envisioned [0][1]. I haven't really worked on it lately, but one thing quickly became apparent: were it not for the Google-ification of AMP (e.g. how it's used, tags for ads, etc) they had the right idea. I think you can go a long way with a backwards compatible subset of HTML and CSS only taking the latest/best of both, disallowing JS, and having an explicit goal of ease of implementation.
Exactly, just a subset of only the modern elements of HTML & CSS, with perhaps a simplified layout, could go a long way. For example, if you're building an Electron-type application you don't need the whole gamut of HTML/CSS, but only a fast, well-supported subset, especially a subset that works well on GPUs. I've wondered if Servo could be useful for something like that.
It seems sort of cool, but I don't see how anyone would ever agree on what should and shouldn't be included in the "simplified" standards. Instead, you'd have Andy's custom browser implementing 15% of the web, and also Beth's custom browser that implemented a different 15% and so on. Then Chris tries to build a website. If he uses only the stuff that Andy's browser supports, then it'll work there, but probably not in anybody else's custom browser, so what's the point? Just build to the real full web and let everyone use conventional full browsers. Each custom browser implementer would then be incentivized to implement more and more stuff to try and gain a bit more adoption. They'd either just give up, or try to match the real browsers, fail, and then give up.
It would be lovely if Firefox's whole HTML/CSS/JS engine were compilable into WebAssembly.
A new browser could implement a WebAssembly compiler and use the Firefox rendering engine as a fallback for when its novel rendering engine doesn't support some feature on a website.
Taking it further, the website author could possibly specify which rendering engine it prefers to be rendered with (as a specific version could simply be downloaded on demand from a CDN, like common JS/CSS libraries are).
And pure webassembly apps (ie flutter) could skip the html/css/js bloat altogether.
This won't do, from the myriad issues I'll pick one: you might encounter an unsupported feature seconds into page loading, or even later, after user already entered some data.
That is if you allow dynamic code loading or such dom manipulation to allow it. But for such cases, you should have already started the fallback engine the first time you scanned through the website code.
But fair enough. In the wild you would have to use fallback pretty much always.
Still, a WebAssembly-able Gecko would be handy and would allow for experimenting with the above-mentioned streamlined 'web standard'. A web author could simply signal its compliance with the 'standard' using meta tags, HTTP headers or some other way.
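The up-front scan could look something like the sketch below. It assumes a made-up allow-list (LITE_TAGS) for an imagined lite engine and only inspects tag names in the initial markup; nothing here is taken from a real browser or profile.

```python
from html.parser import HTMLParser

# Hypothetical set of elements an imagined "lite" engine can render.
LITE_TAGS = {
    "html", "head", "title", "meta", "body", "h1", "h2", "h3", "p", "a",
    "ul", "ol", "li", "em", "strong", "img", "br", "pre", "code", "blockquote",
}


class TagCollector(HTMLParser):
    """Record the name of every start tag seen in the markup."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        self.seen.add(tag)


def choose_engine(markup):
    """Return ("lite", []) if every tag is supported, otherwise
    ("fallback", [unsupported tag names, sorted])."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    unsupported = sorted(collector.seen - LITE_TAGS)
    return ("fallback", unsupported) if unsupported else ("lite", [])
```

Because "script" is not on the allow-list, any page that can change itself later goes straight to the fallback, which is pretty much the concession made above: in the wild you would end up on the fallback most of the time.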
>I believe this would open the playing field for many people to create browsers
That's why it's not going to happen. Even if you managed to reset the cycle it will just happen again but this time even faster. EEE/standards corruption is just too powerful, I haven't read a single successful strategy to stop that long term. So even a parallel subset of the web doesn't seem immune to that. Like the other day I was reading about KaiOS and you can guess who's already investing in that platform.
Google (and the subsequent overlords) are cancer and there's no cure.
> Such a simpler web spec would be relatively fast moving, not focused on backwards compatibility, but instead on simplicity of implementation.
To mirror userbinator's comment: why would it be necessary for the standard to be fast-moving, if the intention is to offer a radically simplified subset of the web stack?
If the aim is to make it much easier to implement a browser, stability should be a top priority.
I'm reminded of a recent HN discussion on whether it makes more sense to define a minimal subset of HTML, or to use an entirely different language, like Gopher and Gemini. [0] I see several others in the discussion here have already mentioned these two.
Another feature would be "meta-CSS", which would be only for user stylesheets (and not usable in web pages), and can apply CSS in CSS, for example:
- Apply an animation (or other style) to any CSS styles that specify "text-decoration: blink".
- Specify what colour to use when a CSS rule specifies "background" as the colour name.
- Make all transitions (or animations) twice as fast or twice as slow.
- Prevent certain CSS commands from being used entirely, or change their meaning to a different command.
- Select elements by the CSS rules that the document applies to them (even if those CSS rules are disabled, and even if class names are unpredictable).
- Define exactly how big a "in" or "px" or whatever unit is.
I want to add my personal bugbear, sortable and filterable tables. And Lists of all links on a page. Oh, also expose RSS feeds again. And what about standard form controls that actually could be styled completely with css? Really, the more I think about it, the more come to mind.
While it sounds like it would make browsers more complex, I think it would actually reduce complexity, because the browser would not need even more programming capability and APIs just to enable web developers to create these kind of features themselves in a thousand variations of Javascript that adds bloat to every connection and slows down end user devices.
I agree with these things. I forgot about sortable and filterable tables, but I agree browsers should have them. (I would also like the ability to override the browser's default styles without overriding those of the web page, in addition to the ability to override the styles specified in the web page.) If the browser uses SQLite databases for anything (such as bookmarks and cookies), let the user enter SQL commands to sort/filter HTML tables (and export them too; since the commands are entered by the user rather than the web page author, they are privileged).
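To make the SQL idea concrete, here is a minimal sketch in Python, assuming the browser copies a page's table into an in-memory SQLite table before running the user's query. The names `TableExtractor`, `query_html_table`, and the table name `page` are made up for illustration; no browser exposes anything like this today.

```python
import sqlite3
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <tr> in the document as a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished rows
        self._row = None     # row being built, or None outside <tr>
        self._cell = None    # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop a cell left open by sloppy markup


def query_html_table(html, sql):
    """Load the table into an in-memory SQLite table named `page`, then run `sql`.

    The first row is assumed to hold the column names.
    """
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    # Quote column names so headers with spaces still work.
    columns = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    db = sqlite3.connect(":memory:")
    db.execute(f"CREATE TABLE page ({columns})")
    db.executemany(f"INSERT INTO page VALUES ({placeholders})", body)
    return db.execute(sql).fetchall()
```

All cells are stored as text, so a numeric sort needs an explicit cast, e.g. `SELECT name FROM page ORDER BY CAST(price AS INTEGER)`. Because the query comes from the user and runs against a throwaway database, it cannot touch the page or the browser's own data.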
And, yes, it would reduce complexity in the ways you specified, in addition to improving efficiency and allowing the user more control, and these are good things to have.
Maybe someone will make a web browser program that can do these kinds of things.
I didn't mean to suggest we should revert to old days. My vision would be a significant paring down of modern standards, updating relatively quickly even.
As OSs, browsers are really bad. You don't have access to the underlying computer, and the security model is broken. Just recently Apple announced that WebKit will clear local storage every 7 days (and why? Because the security model is broken). That's not very OS-like.
> We just need a few really good ones.
There is literally only one really good one: Blink. And it's not even that good.
> Does anyone think it would make sense to create a drastically simpler set of web standards, so that making web browsers would become much simpler?
Everyone wants simplicity but nobody agrees which parts are the superfluous ones. As long as you pose the question vaguely enough, people on HN will agree because any engineer knows "simplicity is good". But if you get more concrete about what to remove, watch the pushback:
- Let's remove https, http is much simpler. (The privacy and security people will protest: We wanted simplicity but not like that!)
- Let's remove all accommodations for accessibility - who cares about that stuff anyway? (Well, at least the people who need it do!)
- Let's remove flexbox and grid - table tags and spacer gifs were good enough for everyone when I was young! (The law of conservation of complexity: You move the complexity from the browser implementation to the design implementation. Since there are more web designers than browser developers these days, it is not a good tradeoff.)
- Let's remove colors and fonts and interactivity, the web is only intended for reading science papers! (Yeah, just like the printing press was only intended to print the Bible; that doesn't mean it is wrong to use it for other stuff.)
- Let's remove HTML - people can just download PDFs!
A new web stack would be awesome. Using latest technologies available and suppressing backward compatibility. Replacing everything from HTTP to HTML, CSS and JavaScript.
But 2 big issues:
1. You have to get the specs right from the beginning and for the long term
2. You have to get traction to move the whole web to the new standard
Hackers can do it, starting with a small user base, writing blogs on the new stack and improving it day after day, adding new features. Then more people start to use the new stack and Hackers start to build services on it and more people come because it's faster and better structured than the old web. Mission accomplished.
> Does anyone think it would make sense to create a drastically simpler set of web standards, so that making web browsers would become much simpler?
Yes, I bloody do!
And by the way there is a less radical alternative option: just give up support for all the legacy features, quirks and redundancies - perhaps this might simplify the code significantly already.
Another fact to keep in mind: there already are Gemini and Gopher.
And those Gopher browsers can be really tiny. I think both Jaruzel's Gopher Browser For Windows [1] and Phetch [2] are under a megabyte.
Rather than a new web standard or ignoring "legacy", I'd point out that there are even web browsers for the Commodore 64 and Apple II. You don't need to implement every tag, the point of HTML is to ignore tags you don't understand and it should still render. Pages with correct markup are still readable in ancient browsers that don't understand CSS. If your page isn't readable in Lynx and Links, you didn't code it properly.
You can't support every site obviously, but Links [3] has shown you can go a long way by just supporting a subset of web features. The speed when you're not trying to render pixel perfect layouts is astonishing.
> the point of HTML is to ignore tags you don't understand and it should still render.
IMHO it could help a lot if a browser could let you configure the way it treats a particular unknown tag: just ignore it with its entire content or treat it like another kind of tag it knows.
I was thinking this. The canvas tag displays its contents if the canvas API is not supported, while the script tag is ignored. This means the browser still has to know about the tag to be able to not-implement it the right way.
That's a great idea - displaying the contents of a script or a style tag would be a terrible experience. Letting the user configure it would help future-proof the browser too.
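A rough sketch of what such a per-tag setting could look like, using Python's standard `html.parser`. The `KNOWN` set, the `POLICY` table, and its `"skip"` keyword are invented for this example and are not taken from any real browser.

```python
from html.parser import HTMLParser

# Tags this toy renderer "understands"; everything else is unknown.
KNOWN = {"p", "em", "a", "h1"}

# Hypothetical user configuration for unknown tags:
#   "skip"      -> drop the tag together with everything inside it
#   a known tag -> treat the unknown tag as if it were that tag
# Unknown tags with no entry are ignored, but their content is kept.
POLICY = {"script": "skip", "style": "skip", "article": "p"}


class TolerantRenderer(HTMLParser):
    def __init__(self, policy):
        super().__init__()
        self.policy = policy
        self.out = []
        self._skip_tag = None   # tag whose content is currently being dropped
        self._skip_depth = 0    # nesting depth of that tag

    def _resolve(self, tag):
        """Return the tag to emit, "skip", or None (ignore the tag itself)."""
        if tag in KNOWN:
            return tag
        return self.policy.get(tag)

    def handle_starttag(self, tag, attrs):
        if self._skip_tag is not None:
            if tag == self._skip_tag:
                self._skip_depth += 1
            return
        action = self._resolve(tag)
        if action == "skip":
            self._skip_tag, self._skip_depth = tag, 1
        elif action is not None:
            self.out.append(f"<{action}>")

    def handle_endtag(self, tag):
        if self._skip_tag is not None:
            if tag == self._skip_tag:
                self._skip_depth -= 1
                if self._skip_depth == 0:
                    self._skip_tag = None
            return
        action = self._resolve(tag)
        if action is not None and action != "skip":
            self.out.append(f"</{action}>")

    def handle_data(self, data):
        if self._skip_tag is None:
            self.out.append(data)


def render(html, policy=POLICY):
    renderer = TolerantRenderer(policy)
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.out)
```

With this policy, `<article>` is rendered as a paragraph, `<blink>` is ignored while its text is kept, and `<script>` disappears along with its body. A future tag only needs a new entry in the table, not a new browser release.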
> If your page isn't readable in Lynx and Links, you didn't code it properly.
Lynx/Links could really use an update (excuse me if they already have it - they didn't the last time I checked). There is nothing hard nor improper in supporting/using most of the HTML5 semantic tags.
Markdown would need a better specification then. Today we have a number of extended implementations, want even more extensions and most of the implementations won't even handle the trailing double space (soft line break) the way it is meant to.
But indeed I'd love Markdown or something like that (AFAIK AsciiDoc is better) to be everywhere.
I actually evangelize Markdown on a daily basis, encouraging everybody to use Typora instead of MS/Libre Office whenever there is no practical reason to use the latter.
My biggest regret is that Google Drive not only does not seem to natively support plain text editing, it seems to go out of the way to make it harder. I wish it could be more like Dropbox.
> Of course adoption would be a big issue, but that's always a big issue.
Choose an audience that is disenchanted with the modern web, choose a subset of HTML/CSS that reflects their needs, create a prototype that demonstrates the idea, then watch people adopt it.
This is more-or-less what is happening with Gemini. It is a bit different in that they modelled their ideas on Gopher then addressed the shortcomings of Gopher, but there appears to be some adoption now that a specification has been produced: multiple clients and servers have been created, while others are creating content. Since the community shares many common interests, growth will probably continue for a while even if popularity is forever beyond its reach.
Doing something similar with the web will certainly produce a different outcome. It may even exert enough pressure to create a "clean" subset of HTML/CSS for specialized applications that is easier to implement.
I think the adoption issue could be managed by the fact that current browsers are sufficiently monolithic that you could implement a version of the simple standard as a WASM host module. It makes the huge things a little more huge but the lightweight things more lightweight.
I don't think it would ever replace the browser but I can certainly see it finding a niche for things like small communities like single board computer enthusiasts where resources are at a premium.
I have plenty of ideas that I have doodled over the years of how things could be done in the browser space, and I'm fairly sure I'm not unique in that respect. There must be some pretty good ideas out there.
The Web will fork at some point in the near future.
The WWW will become the world wide app server. Focusing heavily on an app like experience.
Then there will be a push for a text only implementation to bring back the good ol' days when people actually want to read something on the internet treating it more like a book.
There's nothing stopping anyone from publishing a primarily text-based site if they want, or an "app" site. The web isn't a zero-sum platform, there's room for everything, and no objective definition of what separates "documents" from "apps" to base such a division on to begin with.
No one wants the web to be forked except for people on HN who wish everything done with it since the 1990s could be sent into quarantine where they can't see it, but this isn't something the public wants, or that anyone is working towards.
The problem with adoption of the text only standard would probably be that the generic browser will support that use case just as well as the simple browser. In general it would be hard to choose between one or the other world.
Rather than that, how about offering a very simple CSS on top of RSS? So that feeds could be personalized a bit, should the clients choose to support this?
More likely, an emergence of curated and policed search/indexes [sic] of sites voluntarily subscribing to a particular web philosophy.
Simultaneously, blacklists of domains and browser extensions to scrub viewed pages of any references to sites not subscribing to particular philosophies.
You lost me there. If you want any kind of adoption, you need to be backwards-compatible as much as possible. Otherwise you're just building a toy for geeks to play with, and you end up only attracting "spec perfectionists" to work on it (that is, people who care so much about the spec/implementation being beautiful and elegant that they never successfully ship something people can use).
> HTML would have to be written correctly (eg. balanced tags)
This is a common misconception, unless you're talking about XHTML (which was mostly a failure adoption-wise). HTML is a variant of SGML, which does not require balanced tags (though you can specify that certain tags must be balanced, of course). Certainly you'd prefer to enforce them in some cases where it makes sense (like <em>), but things like <br> do not need a closing tag (and do not need to be expressed as <br/>).
Anyway, I think the overall issue with the web today is that people want it to be a complete application development platform + document layout system. The goal seems to be to be able to build any kind of application as a web app, and allow them to do anything a native app could do (though hopefully with better security). Not saying this is a good or bad thing, but if that's the goal, complexity is inevitable.
Balanced tags aren't really important, but _some_ sort of format that is understandable and enforced is a good idea. The whole browsers guessing what the page actually meant thing isn't a great game to play.
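For illustration, an "enforced" format could be as small as the check below: a sketch that rejects mismatched tags but still lets void elements such as <br> stand alone, as described above. The `VOID` set is a partial list and `check` is a made-up name; a real profile would have to spell out many more rules.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML (partial list).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class StrictChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []    # currently open non-void elements
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # "<br/>" style syntax: treat it like a plain start tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in VOID:
            self.errors.append(f"void element <{tag}> must not be closed")
        elif not self.stack or self.stack[-1] != tag:
            self.errors.append(f"unexpected </{tag}>")
        else:
            self.stack.pop()


def check(html):
    """Return a list of problems; an empty list means the markup passed."""
    checker = StrictChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [f"unclosed <{tag}>" for tag in checker.stack]
```

A client following this approach would refuse to render (or show the error list) instead of guessing what the author meant.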
"Worse is better" suggests otherwise. HTML conquered numerous existing markup languages (notably SGML). Though it had relatively little content compatibility to worry about in 1990.
There are numerous "HTML page simplifiers" (most based on Readability's engine AFAIU), which might shim behaviour and compatibility for legacy pages.
And content itself is text, not code. Slavish backwards compatibility is not a strict requirement.
> Though it had relatively little content compatibility to worry about in 1990.
I think that's really the key to this that makes the "worse is better" argument miss the mark here. HTML succeeded because anyone could open up a text editor, learn a few simple rules, and have a web page in short order. It's fantastically more complicated now, but unlikely to be supplanted because we have two and a half decades of HTML+CSS+JS out in the wild. People work hard on cross-browser compatibility because it's not going away, not because it's fun.
That's where the simplifying engines come in. Grab the crap content, simplify the DOM to a bog-simple standard document format, and render that to the reader. Readability, Archive.org, Archive.is, Outline.com, beta.trimread.com, etc., are examples of these in various forms. Very nearly always their rendering is preferred to the original.
And all that fragile, brittle content out there will eventually break. The question is when compatibility is lost, and in the name of what.
Keep in mind that I'm specifically targeting text and textually-oriented document content. The modern Web can be considered generally as having four principal modes, three of which I'd treat separately: documents, as described; commerce (probably hived into a dedicated application); media (likewise); and apps (which want a VM engine, e.g., Chromium).
A surprisingly large set of apps, and certainly many significant ones, are principally document-and-discussion engines, for which lack of an intrinsic model within the document markup and client presentation is the raison d'etre of those apps. Either having a paired discussion platform, or integrating discussion into the browser itself, would address much of this.
Other content elements which have become significant online include both advertising and DRM. These have been mistakes.
> Of course adoption would be a big issue, but that's always a big issue. I wonder why this wouldn't make sense to try, given the current state of affairs.
A huge issue. Nobody is going to use a browser that doesn’t work with 99% of websites.
Real life. Most of what we do online is something we can do offline locally as well, but we've moved to an online version because of the short-term conveniences. But as the long-term consequences begin to show, there's nothing preventing us returning to a primarily offline world.
I think the current pandemic is showing the opposite.
Right now I'm able to attend conferences I couldn't go to before the pandemic, because they have moved online, and some of the best ones are in far away countries.
Realistically, that can't happen in a primarily offline world. I'll miss them when they go back to offline, because I won't be able to attend any more.
But even before, many things I enjoy, as well as many opportunities, are not happening in any one location on the planet. They aren't local, and still won't be wherever I move.
Even reading & commenting on HN is not replicable offline. I've tried it: I've run real-life communities, places for people to meet and talk and make things together. As interesting as these are, the range of perspectives is narrow compared with the interestingness of an international community, even a niche-interest community like HN.
I think you may be right about "long-term consequences", but I don't think we'll find we can change most of what we do online to offline locally. Instead I think we'll find we just have to stop doing what we do online, and do something else instead. Hopefully something we enjoy, rather than something that feels forced upon us.
definitely worth doing. would be a massive undertaking. I'm not webdev enough to comment intelligently on _how_ to do it, but I think the general _what_ to do is something like... a system that is simple and internally coherent, designed specifically to a) court devs to build on it instead, and b) enable all the optimizations that have eluded browser vendors so far because the existing standards are so absurdly complicated. "parallel layout engine" and "a tab doesn't use multiple gigs of ram" are good for starters
then you make a browser that is much faster for sites written in the new thing and build in blink for fallback. also make something in the same niche as electron, but only using the new thing. win devs over and try to cultivate another "this site best viewed in" phenomenon. gradually demote the existing paradigm to second-class status
the important thing is you probably need a well-heeled patron but you don't need to win over the existing browser vendors. (tbh I'm surprised facebook hasn't tried this yet, they'd benefit immensely from it even aside from being able to stick it to google)
make something better and people will gradually switch. reaching non-technical types isn't as hard as it's made out to be, there was a point where every early adopter geek type was going out of their way to install chrome (and firefox before that!) on their parents' and friends' computers for them. and if people switch, other browsers will have to follow
of course it's also likely that all the problems that necessitate a switch will come back even worse after. google made a js engine that was 1000x faster so people made sites that were 10000x slower. google sandboxed tabs so a bad site wouldn't crash the whole browser, and now complicated sites crash constantly because there's less consequences to it. but hey you have to imagine sisyphus happy after all
" [...] which explores the space inbetween gopher and the web, striving to address (perceived) limitations of one while avoiding the (undeniable) pitfalls of the other."
I had another idea. It is its own file format (independent of the transport protocol; HTTP works just as well, or you could use DVDs just as well, too), which is a Hamster archive containing several lumps. There is its own document format, which lacks support for styles and a lot of other stuff, but does include some commands (e.g. footnotes, data tables, emphasis, headings, hyperlinks, lists, fix-pitch), and there may also be lumps containing executable code (which is optional, as are the document lumps). The executable code is sandboxed and can do no I/O at all (including random numbers and date/time) without an extension. There are standard extensions, and the user is required to be able to configure the extensions, to enable/disable them, substitute their own implementation, or add a proxy to them. If there is network communications, a PROTOCOL.DOC lump (which is meant to describe the protocol in use, but may be blank) is mandatory, in order that the user can reimplement the protocol by themself. Extensions must be open source and fully documented (in order to be listed in the main documentation, and listed in the installation menu of the main distribution). (Some standard extensions would include the document view, a command-line interface, a terminal-based text interface, date/time, random number generation, network communication, and files.) Documents may also be stand-alone. Extensions are identified by a sequence of UUIDs.
I wonder if it would be possible to distill the "rendering essence" of HTML+CSS, i.e. have a pre-processor that transforms a lot of redundancies / complexities out to just a hierarchy of spans+divs with style attributes.
For a modern browser, the two hierarchies would need to be dynamically linked, but specifying the "view hierarchy" in terms of a (very limited) HTML/CSS subset should yield the advantage that the correctness of the transformation step could still be inspected with a browser?
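A toy version of that pre-processor might look like the following. It only lowers a handful of semantic tags to divs and spans with inline styles; the `LOWERING` table is a made-up stand-in for a default stylesheet, and the sketch ignores author CSS, the cascade, and attributes entirely.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical mapping: semantic tag -> (generic element, inline style).
LOWERING = {
    "h1": ("div", "font-size:2em;font-weight:bold"),
    "p": ("div", "margin:1em 0"),
    "em": ("span", "font-style:italic"),
    "strong": ("span", "font-weight:bold"),
}


class Lowerer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in LOWERING:
            element, style = LOWERING[tag]
            self.out.append(f'<{element} style="{style}">')
        # Tags outside the table are dropped; their text is still emitted.

    def handle_endtag(self, tag):
        if tag in LOWERING:
            self.out.append(f"</{LOWERING[tag][0]}>")

    def handle_data(self, data):
        # The parser hands over decoded text, so re-escape it for output.
        self.out.append(escape(data, quote=False))


def lower(html):
    lowerer = Lowerer()
    lowerer.feed(html)
    lowerer.close()
    return "".join(lowerer.out)
```

Since the output is itself a (very limited) subset of HTML+CSS, it can be opened in an ordinary browser and compared with the original page, which is the inspection step suggested above.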
This was the idea behind XSL (and DSSSL before that). You transform semantic markup into a pure presentation format. By moving the transformation to the server you can have all the complexity of selectors, rules, inheritance, cascade etc resolved on the server and have the browser just receive low-level rendering instructions.
Of course you can't have any form of dynamic HTML and accessibility would go out the window.
I think there the presentation format typically would be a different language (such as xsl-fo)... I am suggesting a transformation to a subset of HTML+CSS
I mean, you might be able to get away with implementing a WASM VM and a small subset of JavaScript APIs, which might be significantly simpler than implementing all of JavaScript itself.
I talked about something like this in my comment. One solution I was thinking is that even if it's a parallel engine don't you still save on resources if some of the most popular sites use the simpler/faster layout because the memory/cpu overhead is significantly lighter on the new layout? And the bigger/popular website has a motivation to update if it's actually faster and more responsive.
I was thinking about breaking it down to just make a fun sandbox VM with some APIs for network, local storage and interacting with the user. No document format or anything, you get a screen to draw on and get events. And then I thought: "hmmm, everybody's going to be disappointed that the VM isn't for their pet language", so I came up with the idea of just using QEMU. Literally just give every site a bare machine it can load any image on. Make a virtual IO device for the system services (like exposing the path and query part of the URL, clipboard and linking to other machines). UI, storage and network get normal VirtIO devices.
Let's keep HTTP for metadata and cache control (don't want to download big images unnecessarily), with a bunch of headers for negotiating preferred CPU architecture and other hardware stuff.
It's different enough from the web that it might actually work, for some value of "working".
I think it's a worthwhile endeavour. But the key is to look at how systems like the web stagnate and to design a system that can avoid the same fate.
Every project that grows large suffers the same inevitable descent into complexity. Things become so complex that it becomes hard to try out new ideas, which leads to stagnation. Eventually a better and leaner successor upsets the incumbent and the cycle restarts.
The key to creating great ecosystems is to speed up that cycle. Design a system that encourages rapid growth and failure.
The 'web' should become a minimal hardware abstraction layer that offers safe access to graphics, audio, filesystem, and basic networking. Everything else could be built on top of that.
If someone has a new great idea for HTML or CSS just build it on top of the abstractions and hope others become interested. If someone wants to build a new browser they only have to implement the more manageable core.
> Such a simpler web spec would be relatively fast moving, not focused on backwards compatibility, but instead on simplicity of implementation. HTML would have to be written correctly (eg. balanced tags), old styling mechanisms would be removed so that layout engines wouldn't have to accommodate them. Everything would be pared down.
No need for new standards, you just implement recent standards properly and don't pay attention to the real world (as in "what people thought was HTML at which time").
It might be more worthwhile to clean up some existing rendering engine, factoring out kludges (for the aforementioned "real world") into code that can be disabled at compile time so we are left with a FOSS "pure specs" implementation of current specs. I personally would be interested in seeing what breaks, betting on "not much".
> No need for new standards, you just implement recent standards properly and don't pay attention to the real world
Unfortunately for your purposes, the specifications are defined in a layered way, and you can't just implement the recent ones without everything underneath. And then HTML5 codifies a lot of "real world messiness": early browsers did all sorts of strange things, and then layered on even more strange things to try and be compatible with each other. HTML5 threw away the approach of specifying the way things would ideally work, and instead focused on specifying the way things actually do work. Which means to implement HTML5 fully you really need to do quite a lot of work.
#1 80/20 is more than sufficient. If your browser legibly renders Facebook, Bootstrap, and maybe a dozen others, call it good.
#2 Fidelity is overrated. With adblockers and reader view, who cares about pixel perfect? Twitter, Reddit, and most other popular websites already look terrible. A better web browser doesn't help.
#3 Sites that care about that extra polish should use bespoke layout managers.
For a while, I had a thing about design grids and ensuring text baselines were properly aligned. Spent way too much time wrestling with layout managers.
Finally gave up and rolled my own. Less code, easy to debug, got exactly what I wanted.
Always had a notion to "port" my design grid based layout manager to the web, but I just don't care any more. I consume most of my news via RSS. Assume my target audience would do the same. So any future content I publish will be as stupid simple as possible.
> Twitter, Reddit, and most other popular websites already look terrible.
There's terrible, and then there's terrible. If you try to implement things in a much simpler way, you will get many completely unreadable websites. Images will cover text, some text will be off the screen, it will be a garbled mess.
Many times I have suggested this idea on HN. Always gets shot down. Perhaps the problem is that such a move toward simplicity is perceived as benefitting users more than web developers.
There are certainly folks at hosting providers and similar service companies who advocate using different browsers for different purposes, e.g., for security reasons. For example, it makes little sense to use the same program to browse random sites on the web as you do to log in to your bank's website. However, there are more reasons than just "security" (namely performance, IMO). If "security" is the only reason one would use a different program, then people just point to "sandboxing" and use one browser for everything.
As for paring down HTML, isn't that sort of what Firefox "Reader" mode or AMP does? If you try viewing some AMP URLs in the Links text-only browser, they look particularly good, and the news site "paywalls" do not work. I have been using a text-only browser and other, smaller programs to perform text retrieval from the web for many years and they work very well, much better than the gigantic omnibus everything-in-one programs supplied by the ad tech corporations.
Thus, the responses that claim "It would never work" make little sense to me because in my case it has already worked for decades. I doubt I am the only user who values speed and simplicity.
The problem is that such a browser wouldn't be any use to users in the short-term, because most of the web simply wouldn't "work" in it.
So, I don't think it's developers that are the problem. As a web dev myself, I'd much rather have less, mostly unnecessary, complexity to deal with, but I can't see a rational path towards that.
I had a very similar idea and started to prototype a layout engine, tokenizer and parser. I was able to render things about 1000x faster in the basic case (rendering styled text, boxes and images). The problem I can't crack is mass adoption. If you have ideas on that, call me. :)
There is no need to aim for mass adoption. Make it a standard for whoever wants a simple html environment. It could be situated between gopher and the full html spec.
People use command line based browsers. A limited browser is usable, just not for full web apps.
To start it, offer a website that checks websites for their compliance. At the same time, let webmasters register their site so that you can offer a directory of available content.
If you want to monetize your project, offer a search engine with ads for all sites that passed the test.
The icing on the cake would be a proxy service that transcodes complex websites into the simple standard by analysing the site with a headless browser.
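The compliance check could start out very small. Below is a sketch, assuming the "simple standard" is nothing more than an allowlist of tags and attributes; the lists in `ALLOWED_TAGS` and `ALLOWED_ATTRS` are placeholders rather than a proposal for what the standard should contain.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical "simple web" profile: the only tags and attributes allowed.
ALLOWED_TAGS = {
    "html", "head", "title", "body",
    "h1", "h2", "p", "a", "ul", "li", "em", "img",
}
ALLOWED_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}


class ComplianceChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.violations = Counter()

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            self.violations[f"<{tag}>"] += 1
            return
        for name, _value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                self.violations[f"<{tag} {name}>"] += 1


def compliance_report(html):
    """Map each disallowed tag or attribute to how often it occurs."""
    checker = ComplianceChecker()
    checker.feed(html)
    checker.close()
    return dict(checker.violations)
```

An empty report means the page passes and could be listed in the directory; a non-empty one tells the webmaster exactly what to remove.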
I'd backdoor it into an app framework. Don't have to call it native, just a framework for making apps that happens to have a way to serve them to a user over TCP.
Yep. This was one of the possible directions but it also has an initial startup problem. Right now only large app makers are making these types of frameworks because they don't make any money. (Facebook, Microsoft, Google, Twitter, Apple)
I didn't get super far, but far enough to see that an alternative to HTML/CSS was very fast and certainly viable. I also haven't been able to figure out if it's a cool software project or a business. My sense is that it's a cool tech but Netscape & Mozilla never had a strong macro-business when compared to Google or Microsoft.
This would be great. I also think there is the issue of HTTP being a good protocol and the web being a good distribution platform, and as a result browsers have to be complex because they are everything to everyone. It'd be cool to have a really optimized browser/game engine with WebGPU, WebXR, Gamepad API, audio and Wasm, and no or minimal HTML support as an application platform, for example. Or a browser that has a unified fixed UI for streaming video. I understand these aren't perfect examples and can be nitpicked and might not work in practice, but I strongly agree with this idea in a general sense.
I cannot imagine it will ever happen but I'd like to see the spec precisely defined in terms of core "axiomatic" functionality (JS, layout engine, core CSS rules), and peripheral functionality built on the core functionality.
The core functionality would have a precise (as possible) and extensive definition with an agreed test suite encoding the expected behaviour (as much as possible).
This would allow development of a shared implementation of the peripheral layer, while browser innovation could continue on the core functionality (JS and rendering performance, battery usage etc), and on innovations in the UI.
"create a drastically simpler set of web standards, so that making web browsers would become much simpler"
That was the goal of web standards with XHTML circa 2000-2008. It ended up having the opposite effect of being "relatively fast moving" and really slowed development down.
At some point they realized that figuring out balancing tags wasn't really all that hard for browsers to implement. For old styling mechanisms, browsers can just warn against using them.
I don't really like <marquee>, and believe it should be a user setting whether to display it statically or to scroll it manually; the user should also control the blink rate for the <blink> command (including zero if they do not want it to blink).
If one does a simple version in my opinion it shouldn't move fast, but be extremely reluctant to move in order to enable creating archivable documents.
I’d love to have an Internet that’s a bunch of markdown files that link to a bunch of other markdown files (or a format that simple). No JavaScript, minimal CSS, and support for various image types.
Only problem is how to deal with navigation to other parts of a website.
Something like this would be hostile to advertisers and bloat. Ideally it only has essays, papers, and other stuff that makes you smart.
This subset of web standards was proposed long ago: WML (WAP Markup Language). Do you remember WAP browsers?
Now we have AMP (Accelerated Mobile Pages). Why not build a web browser focused especially on this? Actually I made one: AMP Browser (https://ampbrowser.com)
I can imagine this being used for a desktop app with limited web access: pages using only html and css will be correctly displayed; pages requiring js would be degraded. This way you can safely extend desktop app with access to web resources.
This is not about the browsers, but about the content the server delivers. When you use one of those limited browsers today, you will encounter a lot of broken pages.
But if there were a place on the internet where every page stuck to the same limited feature set, the new browsers could focus on those features, and their users would know a place where they would not encounter broken pages. In addition, users of traditional browsers (probably the majority of users) would still be able to visit that place too.
If Google had not focused on making it possible for publishers to earn money from their AMP pages, why would any of them have been willing to put in the effort to rewrite their pages?
With any sort of grand proposal like this, you need to think about all the different people in the ecosystem and why they're going to be interested in moving over to your system. I don't think AMP has really succeeded, but without having publishers on board it would have gone absolutely nowhere.
(Disclosure: I work on ads at Google, speaking only for myself)
The movers-and-shakers of web-tech are the large corporations (esp the two that control the browsers), and what incentive do they have to simplify web tech, thereby lowering the bar for competitors?
i'm approaching this issue from the other end: using a subset of html (and careful progressive enhancement) to build a site which works in every browser since the earliest days of the web.
i am pretty confident that, although i haven't tested it, it would work in kosmonaut (if you download and save the files and submit content with something like curl, but use kosmonaut to display it).
i bet it would also work with that apple se / raspberry pi combo also in today's top page.
Yes! I like the original idea of HTML, where the client chooses the styling, not the website. That way the web could be completely consistent, just pure information.
I think so? But I can't decide whether this hypothetical simpler web should strictly be document-oriented. I don't think the issue is that the web is now half application, half document oriented. It can still be both?
I think it should be an application programming language and only source code and assets should be transmitted. Crawlable content should be provided by a function in the language, if the author of the document/app decides so.
A good browser/server/IDE would both be easier and more powerful than any of the messy mixes of languages and document formats we use right now.
IMO, it makes more sense to allow a web site to specify its rendering engine in one of the HTTP response headers. The rendering engine would be WASM, and the actual site could be HTML or whatever.
As long as the rendering engines are distributed via CDNs, it would be extremely fast.
One of the good things to come out of Servo was the modular design, which allows other projects to reuse components, such as the CSS parser and the HTML parser. I’m glad that has proven to be worthwhile. I just wonder if there’s enough people in the community to keep those crates up to date.
I just had an interesting train of thought. People have assumed for some time now that writing a browser engine from scratch is intractable because of how large and complex web standards have become.
But what if you didn't have to implement all of them?
Now, some people would suggest we jettison JavaScript and/or CSS entirely. Or at least all of the additions made to them over the past decade. I think the idea that a browser like this would gain any traction outside of hardcore enthusiast circles is pure delusion.
Instead, what if features were prioritized based on how much of the web uses them? I'd bet that 90% of the web only uses 50% of the web standards out there. Some things like Flexbox are used on practically every new site that gets built these days. But there are dozens of obscure CSS properties that most web developers probably don't even know about, much less use. And JavaScript APIs? There's a host of bespoke progressive-web-app APIs (USB access, anyone?) that hardly anybody uses.
Also, a huge part of the web's baggage is maintaining backwards-compatibility with the entire history of content. This is well and good, but not an ideal that an indie browser could afford to uphold. Frames, image area tags, etc. There's probably a long tail of features - many of them deprecated - that could be jettisoned without having much impact on the average user's experience. Even things like "float", which aren't deprecated, may have been instrumental at one point but are no longer very important.
Prioritizing the standards that matter and de-prioritizing the ones that don't could dramatically cut down on the effort necessary for an MVP.
Taking this further: how do we know which features to prioritize? Most front-end devs probably have a rough idea, but what if we got empirical with it? What if we automatically tested the top 10,000 websites or something and made note of which CSS properties they used, which JavaScript APIs they called out to, and ranked them by frequency (and by popularity of the site?). We could chart a clear, direct path toward "what does it take for a browser to be useful in 2020?"
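To make the "get empirical" idea concrete, here is a rough sketch of the ranking step only. Everything in it is an assumption for illustration: the site names and stylesheets are made-up stand-ins for crawled data, and the regex is a crude property matcher that ignores comments, nested at-rules and pseudo-class edge cases. A real survey would need a proper CSS parser and actual page loads.

```python
import re
from collections import Counter

# Hypothetical sample data: site name -> raw stylesheet text.
# A real survey would fetch and render pages; these strings are stand-ins.
SAMPLE_STYLESHEETS = {
    "site-a.example": "body { margin: 0; display: flex; } .nav { float: left; color: red; }",
    "site-b.example": ".grid { display: grid; margin: 0 auto; } p { color: #333; }",
    "site-c.example": "h1 { color: blue; font-size: larger; margin: 1em; }",
}

# Crude matcher: a property name directly after '{' or ';'.
PROPERTY_RE = re.compile(r"[{;]\s*([a-zA-Z-]+)\s*:")


def properties_used(css: str) -> set[str]:
    """Return the set of property names declared in a stylesheet."""
    return {name.lower() for name in PROPERTY_RE.findall(css)}


def rank_properties(stylesheets: dict[str, str]) -> list[tuple[str, int]]:
    """Rank properties by how many sites declare them at least once."""
    counts = Counter()
    for css in stylesheets.values():
        counts.update(properties_used(css))
    # Sort by descending site count, then alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_properties(SAMPLE_STYLESHEETS)
for name, site_count in ranking:
    print(f"{name}: used by {site_count} site(s)")
```

Counting sites rather than raw occurrences keeps one huge stylesheet from dominating the ranking; weighting by site popularity would be a further step.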
It's possible this browser could even include existing JavaScript polyfills (https://developer.mozilla.org/en-US/docs/Glossary/Polyfill) to help bridge the gap for things that it hasn't yet implemented. Leaning on work that's already been done by the open-source community.
The complexity in supporting CSS does not come (primarily) from its individual properties, no matter how obscure. It comes from things like margin collapse, from correctly determining the stacking context, and about a gazillion other things like that.
These aren't things where you can just scan the CSS of the top websites to find out if they're being used. These are things where you'd have to do visual comparisons to the output of at least two other browser engines to determine if you end up with the same result.
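A minimal sketch of what such a visual comparison could look like, assuming each engine's output has already been captured as a grid of RGB pixels. The bitmap representation, the tolerance and the 1% threshold are all assumptions for illustration; real comparisons would also have to deal with font rasterization and anti-aliasing differences.

```python
# Hypothetical rendered output: each engine yields a grid of RGB pixel tuples.
Pixel = tuple[int, int, int]
Bitmap = list[list[Pixel]]


def mismatch_ratio(ours: Bitmap, reference: Bitmap, tolerance: int = 0) -> float:
    """Fraction of pixels with any channel differing by more than `tolerance`."""
    if len(ours) != len(reference) or any(
        len(a) != len(b) for a, b in zip(ours, reference)
    ):
        raise ValueError("bitmaps must have identical dimensions")
    total = 0
    mismatched = 0
    for row_ours, row_ref in zip(ours, reference):
        for p, q in zip(row_ours, row_ref):
            total += 1
            if any(abs(c1 - c2) > tolerance for c1, c2 in zip(p, q)):
                mismatched += 1
    return mismatched / total if total else 0.0


def agrees_with_references(ours, references, max_ratio=0.01, tolerance=2):
    """True if our rendering is close to every reference engine's rendering."""
    return all(
        mismatch_ratio(ours, ref, tolerance) <= max_ratio for ref in references
    )


WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
ours = [[WHITE, WHITE], [WHITE, WHITE]]
other_engine = [[WHITE, WHITE], [WHITE, BLACK]]
print(mismatch_ratio(ours, other_engine))  # one of four pixels differs
```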
Building a browser engine from scratch is imho more doable now than it's ever been before (excepting EME), due to the insane effort by the WHATWG standards to truly describe what is actually happening in browsers (rather than coming up with some theoretically pure description of what is envisioned to happen), and the similar level of detail on the CSS side to the myriad interactions between properties, along with the huge set of testcases for all of that.
Yes, there's _a lot_ - but compared to how loosely specified it all was in the past, when the instruction for building a browser engine was: "reverse engineer the bugs the dominant browser engine of today made in reverse engineering the bugs of the dominant browser that came before, and emulate that to your best ability", anyone starting from scratch nowadays has a way better chance at succeeding.
My favourite example of the thousand-yard-stare horror of web specs is Manish's "Font-size: An Unexpectedly Complex CSS Property". It's awful and hilarious and just keeps getting worse and worse and worse.
Yeah so this is exactly the kind of thing I'm talking about:
-----------------
The syntax of the property is pretty straightforward. You can specify it as:
- A length (12px, 15pt, 13em, 4in, 8rem)
- A percentage (50%)
- A compound of the above, via a calc (calc(12px + 4em + 20%))
- An absolute keyword (medium, small, large, x-large, etc)
- A relative keyword (larger, smaller)
The first three are common amongst quite a few length-related properties. Nothing abnormal in the syntax.
The next two are interesting.
-----------------
I've been doing front-end web dev for nearly ten years, and I've never even heard of those last two, much less used them. That's the kind of thing a new browser could defer support for until after the MVP, while barely detracting from the average user's experience.
Though this does remind me that i18n is a thing, and how gnarly of a problem it must be for a piece of software so concerned with text flow/layout, and ideally it's not something that a theoretical upstart browser would punt on.
If people feel the need to use text inside their <canvas> elements, I've done some (not very rigorous) research on how browsers interpret font size instructions in their CanvasRenderingContext2D environments:
- Absolute size keywords ('xx-small', 'x-small', 'small', 'medium', 'large', 'x-large', 'xx-large', 'xxx-large') may-or-may-not work, and the resulting size may-or-may-not have a relationship to the <canvas> element's surrounding environment.
- Relative size keywords ('larger', 'smaller') can be hit-and-miss too.
- Absolute length values, defined with px, pt, in, cm, mm, pc, will usually work as expected.
- Viewport lengths (vw, vh, vmax, vmin) will often work; note that these lengths are set on creation and don't automatically resize when the viewport dimensions change.
- For lengths defined by the font itself, rem will use the root element's font size for its reference; %, em, ch can be less helpful. Again these won't automatically resize in a responsive environment.
- Of the rest, Q is not supported by Safari browsers, while cap, ic, lh, rlh, vb, vi are not supported by any browser. Avoid!
> I've been doing front-end web dev for nearly ten years, and I've never even heard of those last two, much less used them.
Huh, that's interesting to hear you say that. I hardly do any web development at all, but I was using both of those things for personal/toy web sites over a decade ago.
Just goes to show you that even rank amateurs can end up exposed to things that professionals haven't seen, for whatever reason.
While I think you're right that the last two are very uncommon, they are not a large contributor to the complexity of handling font sizing. Once you've done all the rest, I suspect you could add them with <5% more work.
Leaving out rare features to build a browser more quickly only makes sense if those features let you remove a lot of complexity from your implementation.
> These aren't things where you can just scan the CSS of the top websites to find out if they're being used.
I guess I was drawing a distinction between a certain subset of functionality that's definitely being used constantly everywhere, like the core box model, vs new features that have gotten layered-on over time and specifically designed not to interfere with or change what came before. For example, the CSS Grid standard has zero effect on any part of page layout unless it is explicitly invoked with "display: grid". These hard barriers were drawn to maintain backwards-compatibility, but they could be leveraged to carve out pieces of functionality to not support, or at least defer support for.
> Building a browser engine from scratch is imho more doable now than it's ever been before
I agree. And I would actually add Rust as a factor for that. Don't forget, Rust was literally purpose-built for building a web browser. With its focus on memory safety and safe concurrency, I'd bet it will act as a very real force-multiplier when it comes to a project like this. Devs will spend that much less time chasing down race conditions and memory errors, while at the same time getting something highly parallel and performant.
“due to the insane effort by the WHATWG standards to truly describe what is actually happening in browsers”
I wonder whether one could reuse testcases from other browsers. Might be easier than having humans translate those WHATWG descriptions, written for humans, into a form that computers running tests can use.
And nitpick: it isn’t easier than ever. It was a lot easier for Tim Berners-Lee and the first few other browser writers, before there was a lot of agreement on how a browser was supposed to behave. Certainly, before JavaScript and css, SVG, XML, etc. The scope was a lot smaller.
Hmm, you might be onto something here. Someone created a React layout generator using GPT-3 that translates a natural language description into an actual layout. Maybe some ML models can be developed to render a layout from HTML content. No need to render it perfectly; even an 80% approximation would already be impressive enough.
I've been wondering for a while if we could create an extremely simplified rendering engine that would only accept modern markup and css:
- forget everything about quirksmode and all kinds of workarounds that browsers have today.
- only accept Javascript from a vetted repository that contains things like autocomplete and ajax reload.
- and here comes the smart part: embed it next to an ordinary engine, ideally in Firefox, add a meta tag or content type or something that will get the browser to try to render it in this engine.
1. The idea is to get it to run extremely fast, and
2. get a few websites to optimize for it (Wikipedia?)
3. once people recognize that certain pages load much faster in that browser, they'll flock to it
4. more sites will start optimizing
5. since Javascript is extremely limited we get back to a saner content web
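The "embed it next to an ordinary engine" step could be sketched roughly as below. The media type `text/x-simple-html` and the engine names are invented for illustration; nothing like them is standardized, and a meta tag would work just as well as the opt-in signal.

```python
# Hypothetical media type that opts a page into the simplified engine.
SIMPLE_MEDIA_TYPE = "text/x-simple-html"


def media_type(headers: dict[str, str]) -> str:
    """Extract the bare media type from a Content-Type header, if any."""
    for name, value in headers.items():
        if name.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8".
            return value.split(";", 1)[0].strip().lower()
    return ""


def choose_engine(headers: dict[str, str]) -> str:
    """Return "simple" for opted-in pages and "full" for everything else."""
    return "simple" if media_type(headers) == SIMPLE_MEDIA_TYPE else "full"


print(choose_engine({"Content-Type": "text/x-simple-html; charset=utf-8"}))
print(choose_engine({"Content-Type": "text/html"}))
```

Falling back to the full engine by default means existing pages keep working, and only pages that explicitly opt in take the fast path.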
A better use case for this engine would be Electron replacement.
Much easier to get adoption if you can demonstrate performance benefits and electron apps are more performance sensitive than webpages in general (who really cares about Wikipedia rendering speed - it renders fast enough)
You may be interested in a project that's doing almost exactly that: a new content type that sits alongside the "classic" web, with a heavy focus on performance and only JavaScript from a vetted repository allowed.
What downsides does AMP have that your proposal would not?
Another way to think about this is, people were already thinking along these lines and tried to build something, and that thing is AMP. If you want to build something that avoids the failings you see in AMP, you're going to need to think hard about how your plan is different.
Those are mostly not the case anymore for AMP; the only one that looks correct to me is that AMP documents have to include the AMP runtime. If you wanted to make a pure HTML+CSS page, with no JavaScript, I can't think of any technical reason why the AMP specification couldn't be extended to consider that valid AMP. I think the main question is whether there are many sites that would be interested in serving that way?
* AMP is not specific to mobile, though it did start that way; there are sites that serve all of their pages in AMP format to both mobile and desktop users.
(Disclosure: I work for Google, speaking only for myself)
> If you wanted to make a pure HTML+CSS page, with no JavaScript, I can't think of any technical reason why the AMP specification couldn't be extended to consider that valid AMP.
> Meaning that right now it isn't that way, right?
Correct, the current spec doesn't allow that. But, as I said, I think that is something that could reasonably easily change if many sites wanted to publish with vanilla CSS and HTML.
> For some reason everyone still seems to associate amp with Google, and the only times I can remember finding AMP pages are when I search with Google
> But what if you didn't have to implement all of them?
People will very much tend to prefer using browsers that are fully capable. If your browser only works on 90% of sites, that's not good enough to keep users, it's exhausting to keep switching back and forth.
To escape the modern web, you have to offer something that the modern web can't which people are willing to go out of their way to get.
Some ideas: privacy, anonymity, no tracking, no DRM, global identity (pubkey based). I don't know if any of these would actually be sufficient to drive a new market, but I think creating a new market is the only way.
The way Firefox initially solved this was with a user-facing version of the Strangler pattern: IE Tab. You could simply tell FF that a given URL needed to be rendered by IE and that would be the case inside your otherwise strictly FF browser. It worked perfectly. That pattern could be reused in such a scenario I guess.
That isn't how it was. When Phoenix was initially released, it was running the Netscape rendering engine with everything else stripped away. There was no IE Tab. Throughout the process of becoming Firebird, and then Firefox, IE Tab was never part of the core browser or default installation. I know: I ran all of these browsers as they came out.
Even though many sites were built specifically for IE, it was very rarely bad enough that you couldn't manage with Firefox. Compatibility hadn't gotten that bad.
(If you were going to make a dramatically simplified browser today, however, I do think this would be an excellent route to go.)
IE Tab was an extension, never a native feature. It was one of the many innovations allowed by the open extensions API, allowing anybody to make major features without formal approval.
Pubkey based is good, but a global identity is a recipe for disaster. Ideally, you derive a unique identity per domain, while still managing only a single (master) key pair at the user-level.
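A minimal sketch of the per-domain derivation, assuming the user holds one master secret and that HMAC-SHA256 is an acceptable derivation function. Both are assumptions for illustration; the derived seed would still have to be fed into key generation for whatever signature scheme the identity system actually uses.

```python
import hashlib
import hmac


def derive_domain_seed(master_secret: bytes, domain: str) -> bytes:
    """Derive a 32-byte, domain-specific seed from one master secret.

    HMAC-SHA256 serves as the derivation function here (an assumption).
    """
    # Normalize so "Example.com" and "example.com" map to the same identity.
    normalized = domain.strip().lower().encode("utf-8")
    return hmac.new(master_secret, normalized, hashlib.sha256).digest()


master = b"hypothetical master secret"  # placeholder, not a real key
news_seed = derive_domain_seed(master, "news.example")
shop_seed = derive_domain_seed(master, "shop.example")
print(news_seed != shop_seed)
```

Because each site only ever sees its own derived identity, two sites cannot link the same user by comparing identifiers, while the user still backs up a single secret.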
> Some ideas: privacy, anonymity, no tracking, no DRM, global identity (pubkey based).
These are strongly in conflict with each other: if you have a global identity, what keeps that from being used for tracking? This is the controversy around advertising IDs in mobile apps.
I'd pay money to see how a bunch of mainstream users would react if they only had https://lite.cnn.com/en or https://text.npr.org/ and similar variants for online services (granted, a few pics allowed for shopping)
ps: 10$ more on a 'this lowered my medical bills' outcome
I enjoy the lightweight web too, but I want to stress that that's not what I'm advocating for here. You can't simply expect the population at large (or perhaps more importantly, the corporate community at large) to seismically shift the way they're doing things for the sake of web idealism. You have to meet them where they are. What I'm talking about is finding a pragmatic way to do that.
Usually when a feature - especially a CSS feature - is missing, a site degrades gracefully. This varies between features of course, which could be another factor in prioritizing them: how dramatically will the absence of this break something?
But if a CSS property is invoked that the browser doesn't know about, it simply ignores it and moves on. The same goes for HTML tags and attributes. This is less true for JavaScript features because accessing a field of an object or attempting to call a function that doesn't exist will throw an exception and cease execution. Though on sites whose JS is mostly peripheral to the content, this can still sometimes result in a mostly-functional site. You could also play a game where you stub out enough of the APIs to prevent exceptions being thrown, without actually fully implementing the features. i.e. make an API function callable, and just not do anything.
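Both behaviours can be modelled in a few lines. This is only a sketch: the supported-property set is made up, the declaration parser is deliberately naive, and the stub is written in Python to stand in for a script-facing object that a real engine would expose to JavaScript.

```python
# Properties this hypothetical engine implements; everything else is skipped.
SUPPORTED_PROPERTIES = {"color", "margin", "display"}


def parse_declarations(block: str) -> dict[str, str]:
    """Parse "name: value; ..." and silently drop unsupported properties."""
    style = {}
    for declaration in block.split(";"):
        name, sep, value = declaration.partition(":")
        name = name.strip().lower()
        if not sep or name not in SUPPORTED_PROPERTIES:
            continue  # unknown or malformed: ignore it and move on
        style[name] = value.strip()
    return style


class NoOpStub:
    """Stand-in for an unimplemented script API object.

    Any attribute lookup yields a callable that accepts any arguments and
    returns None, so page scripts that call it do not raise.
    """

    def __init__(self, api_name: str):
        self.api_name = api_name
        self.calls = []  # record usage so developers can see what is missing

    def __getattr__(self, method: str):
        def do_nothing(*args, **kwargs):
            self.calls.append(method)
            return None

        return do_nothing


print(parse_declarations("color: red; gap: 4px; margin: 0"))
usb = NoOpStub("usb")
usb.requestDevice(filters=[])
print(usb.calls)
```

Recording the calls doubles as telemetry for deciding what to implement next. One caveat: scripts often use the return value, so a real stub would have to return something of the expected shape (a rejected promise, for example) rather than nothing.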
The idea isn't to write off parts of the web entirely, but to be smart about how things are prioritized and aim for graceful degradation of the overall experience, as a way of dramatically lowering the bar for what it takes to make a browser that could be reasonably used day-to-day.
Initially I imagine anyone who downloaded this browser would know exactly what they were getting into. I'd download something like this if it was noticeably faster and leaner than other browsers, as long as I could do some kind of quick switch to another, like I do with bangs for Google on Duck Duck Go.
The quick-switch idea is interesting. I don't really know how it would work here (embedded Chromium?), but I think the analogy to DuckDuckGo is the right one. DuckDuckGo started as a service for privacy-enthusiasts, and as privacy awareness has become more mainstream we've seen mainstream adoption of it despite its shortcomings compared to Google's results. It a) offered something Google couldn't, and b) became "good enough" on the other axes, and that was enough for regular people to adopt it.
I presume it would work in a similar fashion to Zoom on Mac (I haven't used Windows in a while), where Firefox prompts you to open a Zoom meeting link in another application (i.e. Zoom), which you can ask FF to remember so the link is automatically opened with that app next time.
Hmm. I think the destination app has to support that directly though, doesn't it? http: links should automatically go to your default browser, of course, but that would require that this new browser is not your default browser. Maybe you could hack around it in some way; I'm not a native desktop app developer so I don't really know just how much leeway one of those has within the host system.
This could be coupled with a search engine that indexes the pages which such a browser has been tested to display correctly.
So you start off with a small index that includes Richard Stallman's web site and a few thousand others, then add many more as things like flexbox or whatever get added.
That feeds in nicely to the development process. The implementation of each new feature has the potential to open up thousands more sites to the index. In fact you could list bounties not in dollars but in the number of new sites a given feature can bring to the index.
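A sketch of how the "bounty in sites" number could be computed, assuming you already have a per-site list of required features. The site names and feature labels below are invented, and producing that data reliably is of course the hard part.

```python
from collections import Counter

# Hypothetical data: features each site needs in order to render correctly.
SITE_REQUIREMENTS = {
    "stallman.example": {"html"},
    "blog.example": {"html", "css-basic"},
    "shop.example": {"html", "css-basic", "flexbox"},
    "news.example": {"html", "css-basic", "flexbox"},
    "app.example": {"html", "css-basic", "flexbox", "js"},
}


def indexable_sites(requirements, supported):
    """Sites whose every required feature is already supported."""
    return sorted(
        site for site, needs in requirements.items() if needs <= supported
    )


def bounties(requirements, supported):
    """For each missing feature, count the sites it alone would unlock."""
    unlocked = Counter()
    for needs in requirements.values():
        missing = needs - supported
        if len(missing) == 1:
            unlocked[next(iter(missing))] += 1
    return dict(unlocked)


supported = {"html", "css-basic"}
print(indexable_sites(SITE_REQUIREMENTS, supported))
print(bounties(SITE_REQUIREMENTS, supported))
```

Sites that are missing two or more features do not count toward any single bounty here, which understates the value of features that only pay off in combination.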
That way users look at the search engine to first discover the sites that work in the browser. You could go retro like the hand-crafted directory Yahoo used to use, or automate the process. Either way, it's a big improvement in UX. Compare: "While the supported sites do load fast and look great, their search doesn't cover enough sites for my browsing needs." To: "Wouldn't even load Reddit this thing looks hopelessly broken."
I bet that it wouldn't take too much effort to get such a project to a place where you get a sizable index with the core feature being the lack of inclusion of sites with dark patterns. Like, an hour spent in this browser is filled with 80% critical reading whereas you'd just be infinitely scrolling through memes 80% of the time in Chrome...
You still need a fallback solution for cases where the website doesn't work with your version of the standard. Otherwise, users would be very disappointed, and they would switch browsers very quickly.
I like the idea, but the migration path needs some serious thinking.
The idea is to make a "best effort" solution. Once it's good enough, it could be tolerable for regular users. The next step would be to offer something new, that Chrome doesn't offer, that's compelling enough for the average person to stick with it despite the occasional hiccup. I don't know what that would be. It could be a genuine privacy guarantee (due to the lack of profit-motive), it could be less bloat and therefore longer battery life on their devices. Or it could be a UX-level rethinking of how a web browser is structured.
Though for semi-enthusiast users (HN readers), who want things to mostly just work but are willing to put up with some minor discomfort for the sake of idealism, that second part may not be needed.
The story could be different with a dedicated foundation of enthusiasts, backing a fully open (read: not-for-profit) project. It could also not be. We don't know until we try.
Even if most websites only rely on 50% of the web spec out there, it's likely that many of them rely on a slightly different 50%. That would mean the "common" area needed for a good average experience is significantly larger than 50%.
I think this is the trap that non-Chromium MS Edge fell into. I don't doubt that they supported a large subset of websites very well, but the average experience was brought down a lot when it randomly broke or massively slowed down on random websites.
The MS Edge case also suggests that it's not enough to just support a standard, it must also be reasonably optimized, because websites can be intolerably slow otherwise. I think many websites, especially web apps, assume that the user's browser is quite fast and can chew through a lot of load.
> I'd bet that 90% of the web only uses 50% of the web standards out there.
If implementing 100% of web standards is intractably large and complex, then:
Implementing 50% of web standards is intractably large and complex too.
Unfortunately, you can't turn a humungous problem into a friendly little one by dividing by 2.
I would think in terms of dividing by 10 or more. How much of the web only uses 10% of the web standards? My guess is most of it, so we have a chance.
> Taking this further: how do we know which features to prioritize? Most front-end devs probably have a rough idea, but what if we got empirical with it? What if we automatically tested the top 10,000 websites or something and made note of which CSS properties they used, which JavaScript APIs they called out to, and ranked them by frequency (and by popularity of the site?). We could chart a clear, direct path toward "what does it take for a browser to be useful in 2020?"
This would be great data. Something a bit like caniuse, but showing feature usage rather than browser usage.
Getting this data and maintaining it is a full-time job for a team. You can't just fetch the front page of sites to get this info. You need to login and use functionality. Perhaps existing browser telemetry would do a better job, letting users generate the data during normal activity.
However you do it, the cost of this idea is mounting up fast :-)
> I'd bet that 90% of the web only uses 50% of the web standards out there
The problem is the same one faced by people trying to build MS Office competitors back in the 90s and 00s: while that's likely absolutely true, they don't all use the same 50% of those standards/features, so you end up having to implement all of them, or your users will blame you when their favorite random niche website doesn't work, and ditch you for Chrome.
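A toy model makes the "not the same 50%" effect visible. The numbers are entirely made up: ten features, seven sites, each site using the same three core features plus two features from a long tail. It is a constructed example, not measured data.

```python
from collections import Counter


def build_sites(core=3, tail=7):
    """Every site uses the same core features plus two adjacent tail features."""
    sites = []
    for i in range(tail):
        tail_pair = {core + i, core + (i + 1) % tail}
        sites.append(set(range(core)) | tail_pair)
    return sites


def coverage(sites, implemented):
    """Fraction of sites whose required features are all implemented."""
    working = sum(1 for needs in sites if needs <= implemented)
    return working / len(sites)


def most_popular(sites, k):
    """The k features used by the most sites (ties broken by feature number)."""
    counts = Counter(f for needs in sites for f in needs)
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return set(ranked[:k])


sites = build_sites()
half = most_popular(sites, 5)
print(f"each site uses {len(sites[0])} of 10 features")
print(f"sites fully working with the top 5 features: {coverage(sites, half):.0%}")
```

In this model every site uses exactly half of the features, yet implementing the most popular half leaves only one site in seven fully working, and full coverage needs all ten.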
I had a similar thought: Why not build a browser-generator? Some nice presets and checkboxes for options and a single shiny "Build" button.
This can't be too difficult, am I right? It would give users the explicit responsibility for broken pieces and also some nice performance gains.
I think there's a great case for a browser which defaults to Reader Mode. Reader Mode is always a better experience when it works. And then build up from there.
I'm writing my web publishing platform to work on HTML4 and CSS2 standard so I can brag about how it's still performant on Mac OS9.
I just need background images, absolute positioning, <area> + image-map, and form submission to work as expected. No JS.
Classic Macs aside, I would be thrilled to find a no frills rendering engine that I could package up like an Electron app serving from localhost, but without all the performance expectations of electron apps. I'm imagining like what Ionic does for apps -- just wrap everything in an OS provided web viewer -- what's out there like that for desktop?
Interesting, but should be clearer about what this entails.
What are the native browser components currently using?
Windows: some old IE, EdgeHTML+Chakra, Blink+V8?
Linux/macOS: WebKit?
Ionic uses capacitor for native deployment. Capacitor is a thin cli wrapper and an api abstraction around each platform. In the desktop case, it’s a wrapper around Electron. For mobile, it’s Cordova.
Lots of good comments with good points, but none about security so far. There is no lack of alternative browsers, really. Personally, I find projects that take a new approach highly interesting, more interesting than those that clone what Chrome and Firefox do. I used surf, uzbl and luakit extensively, but what prevents me from adopting them as daily drivers is always the nagging concern about security.
As unlikely as it may seem, I can well imagine that a dedicated team produces an alternative browser that - feature-wise and functionally - is good enough for daily use for most people. For the life of me, I cannot imagine they will come up with a browser that is as secure as Chrome.
Let's face it: the amount of work and money that has been put into Chrome's security is amazing. As much as I love Rust and how it helps us write more secure software, it only gets us so far when it comes to the multiple threats a web agent implementation has to face.
A pure Rust browser is immensely more secure than any browser written in C++, at least against memory safety bugs. There is still the source of logic bugs, stuff like the same origin policy, but the worst that can happen is an XSS attack instead of RCE or similar. The browser would be ideal to access the 99% of websites you don't log in to (provided that the browser can actually render them correctly), and eliminate the main danger from them. For websites you log in to, you can still use Chrome.
Something like 70% of all CVEs in C++ applications (including browsers--the actual type of application doesn't seem to matter much) are memory safety issues. Yet the myth persists that memory safety isn't an important bug class for stuff written in modern C++. I think the converse is true: most C++ programmers tend to significantly underestimate the number of remaining bugs in their code that are memory safety bugs.
Bugs related to the JIT are normally counted separately, AFAIK. The 70% figure tends to hold even in systems with no JIT. However, it would not surprise me if about 70% of JIT CVEs are memory safety bugs. The trend for unsafe Rust so far seems to be very similar BTW (about 70% of unsafe Rust CVEs are memory unsafety--contrasted with virtually no non-unsafe Rust CVEs that are memory unsafety, and all that were are due to compiler bugs).
The overall trends tell me that in the absence of a proof assistant, however carefully you scour your code for bugs, you will miss some. And 70% of the ones you miss will be memory unsafety unless you are using a system that explicitly prevents this.
There have been a few other studies besides those two pointing to the 70% figure. It seems to be a curiously persistent figure, and I agree that it's not just about C++.
No, that is a different statistic from the one I’m talking about. The one I’m talking about was a survey of security bugs in Firefox, including the private ones. This one (and the Microsoft one) show an even higher number!
I'd imagine the great majority of security work happens in the JS engine, because that's what executes foreign, turing-complete code from every site you visit (natively via JIT, even). So one option would be to simply use V8, and only build the other subsystems from scratch. Performant (and complete) JS interpretation is probably going to be the hardest thing to implement anyway, before you even get to the security concerns.
I do not think this is really much of a concern in practice. Nobody would bother to attack a browser that isn't popular (unless it is used by someone who is a target themselves, but the attack wouldn't be for the browser but for the target and the chance of others being affected will be very very low) and by the time the browser becomes popular it will also have attracted a developer base and pairs of eyes large enough to have those bugs fixed.
Remember the claims about Mac OS X security back in the early 2000s? Mac OS X wasn't secure because it had no security issues, it was secure because nobody bothered to attack it. As it became more popular (and it had to become very popular compared to what it used to be, which took several years by itself), it also attracted people attacking it.
That would be the same story with a new browser. Or anything new and obscure for that matter.
Right, I feel like any new desktop browser that is aiming for wide adoption needs to be (for example) multi-process and privilege-separated from the start. Rust certainly makes some bugs impossible that this sort of sandboxing prevents, but not everything.
Designing these sorts of security features up front isn't fun, and makes it take a bit more time before you get to your first page render, so there's a lot of slogging to do before you get there. I can understand how someone might lose motivation that way.
Hacker News: implementing an incomplete subset of Web standards to focus on things that matter to everyone, like performance and privacy, is a great idea. Also Hacker News: fuck Safari.
Safari isn't that useful for people not using a Mac or iPhone/iPad, which is most people.
Also: don't paint us all with such a broad brush. I'm pretty indifferent toward Safari. "Fuck Safari" would imply I care enough to have a strong opinion about it, which I don't.
These kinds of comments always puzzle me. They basically come down to: "Some people on this website have one opinion, and other people on this website have another. People on this website are inconsistent."
Safari page loading is fast but from my experience the actual app is slow (on the newest MBP no less). It's the only browser where I can type something into the URL bar and press enter quickly, expecting it will take me to the top history result (e.g. type 'tw' and press enter to go to twitter.com) but actually beat out the loading of the history / bookmarks results and have Safari simply google the string 'tw' instead. This happens all the time for me in Safari and never happened once in Chrome.
If only Safari had something comparable to uMatrix, uBlock and the YouTube enhancer suite. But apparently the APIs that would’ve been useful for the first two are a bit crippled in the latest Safari.
Hopefully this can be fixed some day. I’d gladly switch from Firefox.
Tangential to performance, but last I used it (2018ish) Safari had by far the best power efficiency, which is important when running on battery power (iOS and MacBooks are Apple's most popular devices). It was like a 2-3X difference, doubling the battery life. I thought that was really interesting, how much they optimized for that, and how Chrome seemingly did not at all.
From what I understand (speaking casually to someone on the Google Chrome team) Safari is able to integrate into operating system level APIs that are not available for other applications, impacting especially power efficiency.
Since Chrome on iOS is mostly Safari (as far as the heavy lifting goes), I expect Chrome on iOS is easier on battery than it is on Android.
My meaning was that Safari is heavily optimized for power usage because most people using it are on mobile devices, and the difference is dramatic, at least on macOS where you can compare it.
I wonder how Chrome's traffic compares across mobile, laptop and desktop (I don't think laptops can be detected in browser stats). They certainly seem to focus on maximum performance above all else.
Although afaik Safari is the only one which has TCO (tail call optimization) for JavaScript? (Correct me if I am wrong!) So while I do not like Safari, it seems to have done that right.
Could something like this be used as the renderer for desktop applications? Instead of running a full fledged browser like electron, you basically just write all the logic in rust, and render the UI with css+html.
That was very early in the project, but moreover, I would expect that simply means that Servo can allow someone else to own DOM nodes; not that it depends on a JS implementation specifically, no?
A browser engine which supports a subset of the features of Chrome, Firefox etc and can be used as a lightweight and fast alternative to Nodejs/Electron for cross-platform desktop app development could be a really useful product.
There are already a lot of lightweight alternatives to Electron: electrino, neutralino, Quark, Deskgap, WebWindow, litehtml, tomsik.cz/graffiti, yue, nodegui, etc. No need for yet another one.
You only ditch compatibility for existing content. You still keep some level of compatibility with developer knowledge and teaching material. Most websites and electron apps don't need a WebSQL implementation, yet all electron apps ship with one, bloating the downloads for all users.
I don't think WebSQL is the best example. Chromium uses SQLite for other things (bookmarks, history, etc.) and WebSQL is mostly a JS API for interfacing with SQLite so I'm guessing the overhead isn't huge (and SQLite is pretty small to begin with).
Good point, and while the average Electron app probably doesn't need history or bookmarks, there are probably use cases it would still need WebSQL for. Another example would be the ffmpeg copy it ships to play back videos of various formats, even though the electron app only plays back a few hardcoded animations that all are in a single format.
Or maybe even have those sorts of features broken out as modules. You could have a flag at compile time that says you’re using this JavaScript library or that CSS layout module, and not ship anything else.
Isn't part of the draw of Electron that you can essentially stuff your (presumably already existing) web app in a box that runs as a standalone app[0]? If that's the case, removing compatibility would mean losing most of the draw.
Some level of compatibility will be good to have so that a lot of the Node modules can be used as is. However a lot of old standards etc can be thrown out.
These thoughts may be naive, since the whole complexity of CSS, the DOM, box models, etc. is beyond me, especially how the complexity actually multiplies. But I have always found the complexity of the modern web not just unsustainable or risky but frustratingly wasteful.
Browsers are a universal medium both for content and for app UI, and that's great. But they are kind of the most wasteful creation ever. The most obscene apps of yesteryear would do a lot more with a lot less. An example of that is Gmail taking 600MB of memory (according to Chrome), far more than a desktop client that didn't depend on the server doing all the heavy lifting would have taken. It would have all my mail locally, and would actually be much faster at changing pages (if done correctly).
But I digress. I think the complexity comes from (apart from over-engineering syndrome) trying to deny that there are two very distinct use cases of the web: one is nicely formatted semi-static content, and the other is fairly dynamic stateful applications. I think designing a separate set of features for each (even modes in browsers that a page needs to declare) could simplify things a lot.
The second thought I had was a brand new layout model/engine that does away with all the cruft and defines some powerful primitives that work in a more well-defined way and don't take a Google to implement properly. I was thinking of two options for how to get that accepted.
1. A page can declare itself as using the new primitives (there could be two sets of primitives for doc/app, or this simpler one might handle both?), and the engine is small enough that it's worth having two in the browser. As more pages, especially well-maintained and popular ones, move to the faster version over time, overall browser efficiency increases even if you can't remove the old engine.
2. Could there be a translation layer that takes an old-layout page and rewrites the CSS/layout into the new primitives? It would perhaps be slower than the current engine, and might have to retranslate a lot on changes to content/viewport/scroll, but it might allow one to drop the old layout engine a lot sooner.
Complete bonkers? I am almost certain it is, but I don't know enough to tell, and I have been meaning to ask a crowd who can answer.
The complexity of the web is not just unsustainable—as of this past week and the demise of Firefox it has officially become "unsustained".
I agree it must be challenged, and I think the way to go is to create a simpler web spec with a mantra that is totally oriented around simplicity. The web seems to have died this week, and I'm not sure how else it can be reborn.
They stopped developing Servo and their dev tools. How does that not signal the end of Firefox as a non-Blink, non-Webkit browser?
Edit: I see you work at Mozilla. IMO Firefox is the most important software project in the world (not hyperbole). Are you saying it will somehow remain an independent web rendering/execution platform given this past week?
Given the current Cold War state of the Browser Wars I very much support any attempt to come up with something new that focus on speed and privacy, lots of good luck!
EDIT: I know this may be premature to ask, but does it make sense to calculate an Acid test result these days? I have no idea, hence the question.
A browser won't fully pass all of the Acid tests if it follows a few modern changes but it can be a good way of seeing general improvements take shape and show how close you're getting to the fiddly compliance bits. It obviously won't cover all of the newer stuff but it still takes quite a bit to cover everything in the tests.
awesomekling (who's on HN, shoutouts if you see this Andreas you're awesome!) used Acid 2 to help push the development of the SerenityOS browser in some coding streams.
Indeed, the Acid tests are in fact immensely useful for bringing up new HTML and CSS implementations!
There's still a lot of work to do on the CSS box model before the SerenityOS browser engine can render Acid1/Acid2 fully.
Then we have Acid3 which will require a lot more work on the JS engine and DOM APIs. But it's all so much fun that it doesn't matter how much work it takes. :^)
There's no actively maintained port right now, but jcs@ ported it to OpenBSD a while ago. His branch[1] is a couple months behind now, but it wouldn't be terribly difficult to get it working again.
To those pointing out that dillo, netsurf, and phoenix didn't get anywhere... You're not wrong. But, on the other hand, the last thing anyone thought we needed was "yet another" search engine in 1997 when google dot com was registered.
I wonder how Rust will influence the outcome of this project? Will the Rust paradigm prevent memory leaks or somehow improve on the architecture of the big hitter browsers like Firefox and Chrome?
It's super exciting to see how Rust will fare with building new iterations of existing tech.
One thing that interests me about using Rust for a project like this is modularity. The Rust toolchain makes it so easy to work with separate crates that it feels very natural to split your work at logical points. C++ doesn’t have that kind of baked-in toolchain, and just the act of downloading and compiling the Chromium code is a pretty intimidating prospect. If I could easily download and test, say, a CSS parsing module, that could be really useful outside of regular browser contexts.
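To make the "standalone CSS parsing module" idea a bit more concrete, here is a rough sketch of the kind of tiny API such a crate might expose. It is not taken from this project or from any existing crate: `parse_declarations` is a made-up name, and it only handles the body of a single declaration block under the simplifying assumptions noted in the comments.

```rust
/// Hypothetical helper (not from any real crate): parses the body of a CSS
/// declaration block such as "color: red; margin: 0 auto" into
/// (property, value) pairs.
///
/// Simplifying assumptions: no comments, no values containing ';',
/// and no special handling of `!important`.
pub fn parse_declarations(input: &str) -> Vec<(String, String)> {
    input
        .split(';')
        .filter_map(|decl| {
            // Split at the first ':' only, so values may contain colons.
            let (name, value) = decl.split_once(':')?;
            let (name, value) = (name.trim(), value.trim());
            if name.is_empty() || value.is_empty() {
                return None; // Skip malformed or empty declarations.
            }
            // Property names are case-insensitive; values are kept as written.
            Some((name.to_ascii_lowercase(), value.to_string()))
        })
        .collect()
}
```

Something this small could be unit-tested on its own and reused by, say, a linter or a static site generator without pulling in the rest of a browser engine, which is roughly the appeal being described.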
> I wonder how Rust will influence the outcome of this project?
Currently the project fails to build with the following error:
> error[E0554]: `#![feature]` may not be used on the stable release channel
This might not be exactly due to Rust, but only a personal choice made by the project's authors. However, lack of stability does not bode well for the project's longevity and adoption rate.
From what I understand, that might just be a problem with the README. You might have to set your compiler to enable experimental feature flags (i.e. switch to the nightly release channel).
I think the point was that if the project uses `#![feature]` on the experimental channel rather than the stable channel, memory stability wasn't as important to the project as it was to him.
That’s quite the stretch. Rust’s unstable branch isn’t a total Wild West, it’s just for new features that aren’t ready for stable yet. If you’re not expecting version 1.0 for quite some time it isn’t a problem to use an unstable channel.
One possible use case for even a minimal html/css renderer is for UI layouts outside of a browser. There's no need to bundle a JS engine to run some code if your team is already comfortable using another language.
For layouts outside of a browser, a specialized XML format other than HTML and CSS running in a VM would be preferable. Everyone complains about how web tech is inadequate for UI design - it's a hack, not a solution.
Those people complain because they never built anything meaty on other stacks like Android or iOS, which have all sorts of warts and offer nothing as good as Elm or React, much less an open source solution.
Being the most popular thing doesn't necessarily make it the best thing.
We should be looking at "native web apps" as a stepping stone, a proof of concept for the architecture and design patterns of the web, but not necessarily the platform itself. At the end of the day, it's just an XML document, stylesheet and scripting language running in a VM. There's no reason it has to use a browser engine and HTML.
>It could spur a minimalist information driven trend.
it might if anybody would actually use it, but that seems unlikely if it doesn't have CSS or JS support. Chrome and Firefox are already really damn fast if you give them a simple site without much css or js. people just need to actually make sites like that.
> people just need to actually make sites like that.
This is the problem I see with initiatives like Gopher and Gemini.
The people causing the problems with the web are not the people who will listen to these initiatives.
They are banks, advertisers, FAANG, everyone who is fine making money off the standard Chrome-IE-Edge crowd and barely even cares about Firefox support.
Both Gemini support and a sane subset of HTTP / HTML require the same level of dedication that I could bring, but no commercial site will. Well, to be blunt, minimal HTTP / HTML is a lot easier. I can keep my same hyper backend, Firefox client, Nginx for TLS termination, curl and libcurl and pycurl, lots of tools that will only work on HTTP.
Plus, minimal HTTP 1.1 is not that hard to implement, so I think they're mostly attacking the wrong part of the stack while also cutting out useful performance features like QUIC or pipelining or caching.
Back in the days of Netscape 4 (I think), this is what I did. I’m not sure if it was CSS or just some custom browser applied styling defaults, but I found it difficult to read pages in all different styles so I had the browser make them all the same.
There are ‘readability’ plugins and services for browsers that attempt to provide this service for current sites. These days it’s a bit more complicated than just overriding a few styles to make a page ‘standardised’.
The plugins / services work for maybe 99% of pages I look at. Unfortunately I don’t know how to view the whole web through such a lens, without having to activate for each page.
CSS and JS are not in opposition to "information driven". Javascript is not used merely for content-free flashy pages, and it isn't going away anytime soon. If and when it does, it'll be because something better replaced it, not because a million web developers woke up one morning and said "why don't we stop using most of the capabilities of modern browsers".
If, hypothetically, browsers stopped offering powerful scripting capabilities, the app makers of the world would not suddenly say "I guess we'll make static webpages", they'll say "here's how to install our all-powerful unsandboxed application". Powerful scripting on the web means more applications running in safe sandboxes.
The way things are now, browsers enable websites to be very user-hostile. Of course developers won't opt for static over dynamic unless they suspect users might want that, which could happen in at least some niches.
Most webpages, even simple ones that do not strictly need to be designed to require js, will entirely fail to render (blank page) without js. The situation has become a lot worse in the last 2 years or so.
It’s a requirement for a general purpose, modern browser, unfortunately, even one without a bunch of bells and whistles.
This is the case even for many municipal or government sites, to say nothing of business products/services/vendors. You can’t use the web as a private citizen for normal things like banking or civic participation without js.
If it is purely for speed, then there is no need for a new browser. Simply disabling JS makes websites crazy fast. I run a documentation site for my framework. It is already pretty lean, but disabling JS makes it super fast, even though my server is cheap. Unfortunately, most modern websites display a blank page without JS, so what we really need is a change in developer attitude, not a new browser (again, assuming that you are only concerned about speed).
JS is too useful to ditch entirely. I think it would be interesting to design a language that deterministically uses compute resources, and then limit web pages to a certain amount of them.
Couple that with the idea I've been toying with for a while about a new html standard (html6?, core HTML?) that only accepts loading JS from a common repository of utility code (think useful stuff like autocomplete, partial page reload etc) and it could improve the web a lot if we got sites to use it.
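For what it's worth, the "deterministic compute budget" idea can be sketched with a toy interpreter that charges one unit of "fuel" per instruction and stops when the budget runs out. Everything here (`Op`, `Outcome`, `run`, and the cost of one unit per instruction) is made up for illustration; it is not how any existing browser or JS engine meters scripts.

```rust
/// Instructions for a toy stack machine (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Push(i64),
    Add,
    /// Jump to the given instruction index if the top of the stack is non-zero.
    JumpIfNonZero(usize),
    Halt,
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// The program ended; carries the top of the stack, if any.
    Finished(Option<i64>),
    /// The page used up its compute budget.
    OutOfFuel,
    /// The program was malformed (e.g. stack underflow).
    Fault,
}

/// Runs `program`, charging exactly one unit of fuel per executed instruction.
/// The same program with the same budget always stops at the same point.
pub fn run(program: &[Op], mut fuel: u64) -> Outcome {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize;
    while pc < program.len() {
        if fuel == 0 {
            return Outcome::OutOfFuel;
        }
        fuel -= 1;
        match program[pc] {
            Op::Push(v) => stack.push(v),
            Op::Add => {
                let (Some(b), Some(a)) = (stack.pop(), stack.pop()) else {
                    return Outcome::Fault;
                };
                stack.push(a.wrapping_add(b));
            }
            Op::JumpIfNonZero(target) => match stack.last() {
                Some(&v) if v != 0 => {
                    pc = target;
                    continue; // Skip the normal pc increment below.
                }
                Some(_) => {}
                None => return Outcome::Fault,
            },
            Op::Halt => break,
        }
        pc += 1;
    }
    Outcome::Finished(stack.last().copied())
}
```

With this, a program like `[Push(1), JumpIfNonZero(1)]` (an infinite loop) simply returns `OutOfFuel` once the budget is spent instead of hanging the tab. The hard part in practice would be agreeing on costs for real operations like DOM access, which this sketch ignores.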
In a better world, JS used to implement convenient web page features like rendering LaTeX or syntax highlighting would be implemented inside the web browser, while JS used to implement web apps would not exist and web apps would just be apps.
I like my apps to run in a sandbox, and the web is the best sandbox we have. My standard reaction to "would you like to download our app" is "no, stay in your box".
We have. They run on the web, in tabs. Or, with PWAs, they look a lot more like native applications, and still run in a safe sandbox. (There are also Android and iOS apps, which are less ideal and less portable.) Why reinvent it in a less portable, less sandboxed, historically insecure manner? People have tried, and the result never ends up as useful, functional, or secure as the existing web sandbox.
When I browse the web, I know the browser puts me in control, and keeps applications contained. The only kind of app that I know will have comparable sandboxing is a PWA. Anything with a comparable amount of control will look like a web browser, and we already have the web.
If you want the world to change, you have to offer something better.
There is great technology out there for app sandboxing. Recent Windows versions let you instantly spin up a virtual machine to run unknown apps in - I'd say that's safer than a browser and there is no reason why it can't be as convenient.
You speak of the web as an app platform - I'm not opposed to some platform like that existing (and they do exist, just look at your OS), I just think that we made a mistake when we turned the browser into one. Now we mix together hypertext and code and have so much weird legacy to maintain, not to mention the performance issues.
I think the point you are missing is that we should have 2 distinct things: applications and web pages. I should not have to run untrusted code to read a blog post, a news page or the latest PR release. I run with JS off by default and there are basic web pages that don't work without JS, for example https://www.bbc.com/news: not sure why, but without JS the layout gets messed up, though the content loads.
Sure, if you have a nice application that is interactive, fine, use a PWA, Electron or whatever you want, but for showing plain text and images a subset of HTML and CSS is enough.
Is this related to Servo, and if so how? Servo is the Mozilla team's creation of a browser engine in Rust. The team recently got laid off by Mozilla but the development is apparently continuing as an Open Source project.
What does it take to hook up something like this with nodejs? This just renders the dom while you manipulate dom with js. Would this be a lot smaller and more customizable than electron?
Do you ever have websites that use the full width length of your page? I’ve never seen that except for horizontal scroll webpages or if you are on a tiny screen.
Can this be embedded into other apps or is this "just another browser"? The page says it's a browser engine, but it mentions using opengl and glut, so I'm not sure. Seeing as it only has a `src/main.rs` I assume it's an app, not a lib?
How does one get started with building a rendering engine? I would love to get some guides on this. I agree that there are inherent problems with CSS, given how complex it is now.
I’m sure this is fun and all, but the only things which will fix the web are:
1) a replicable business model that doesn’t rely on advertising/surveillance revenue
2) search you can’t game with keywords or SEO
Until those are solved, we’re all trapped doing Google and Facebook’s work for them. No amount of changes to the languages or browsers which parse them will have any noticeable effect on the majority of the web content or its users.
The main problem is the mess that is called html/css. You can't "just" build a simple browser with limited resources anymore. And by limited i mean thousands of man-years of work backed by a multi million organization. I don't know if it is a calculated tactic from google to rapidly expand and drive the web forward (i still hope the intention, at least at the beginning, was in good faith) so that nobody starting today from scratch could ever catch up. But i guess everybody that started to build a simple hobby web browser knows how unachievable this task is. If we ever want competition in this field again and don't want to hand over the web to google, we need to start from scratch or at least start to massively deprecate stuff. And by the grave of Alan Turing start by making a website fail to render if the html is not correct.
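As a rough illustration of what "fail to render if the html is not correct" could mean in its simplest form, here is a sketch of a strict tag-balance check. It is nowhere near a real parser and is not taken from any browser: `check_balanced` is a made-up name, and the comments list what it deliberately ignores.

```rust
/// Hypothetical strict-mode check: returns Err at the first mismatched tag.
///
/// Simplifying assumptions: attribute values never contain '>', there are no
/// comments, CDATA sections or <script> bodies, and void elements such as
/// <br> must be written self-closing (<br/>).
pub fn check_balanced(html: &str) -> Result<(), String> {
    let mut stack: Vec<String> = Vec::new();
    let mut rest = html;
    while let Some(open) = rest.find('<') {
        let after = &rest[open + 1..];
        let close = after
            .find('>')
            .ok_or_else(|| "unterminated tag".to_string())?;
        let tag = after[..close].trim();
        rest = &after[close + 1..];

        if let Some(name) = tag.strip_prefix('/') {
            // Closing tag: must match the most recently opened element.
            let name = name.trim().to_ascii_lowercase();
            match stack.pop() {
                Some(top) if top == name => {}
                Some(top) => return Err(format!("expected </{top}>, found </{name}>")),
                None => return Err(format!("unexpected </{name}>")),
            }
        } else if tag.ends_with('/') || tag.starts_with('!') {
            // Self-closing element or doctype: nothing to balance.
        } else {
            // Opening tag: the element name is everything before the first space.
            let name = tag.split_whitespace().next().unwrap_or("");
            if name.is_empty() {
                return Err("empty tag name".to_string());
            }
            stack.push(name.to_ascii_lowercase());
        }
    }
    match stack.pop() {
        Some(top) => Err(format!("unclosed <{top}>")),
        None => Ok(()),
    }
}
```

A strict browser could run a check like this up front and show an error page on `Err` instead of guessing what the author meant, which is a large part of what today's error-recovering parsers spend their complexity on.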
>> And by limited i mean thousands of man-years of work backed by a multi million organization.
It's so sad. We're so trapped. Fortunately, Google (or any other) can't completely alienate a huge part of the web. So even though they control the browser, they don't control its content...
HN wants the rest of the world to be as angry and frustrated at the modern web as they are, but the rest of the world, for the most part does not care.
That's the harsh fact as it stands. No amount of GitHub stars, blog-posts, retweets or news coverage would help it further in its goal. I mean, the author is the only committer, which is already a high risk of failure.
The way this will work will be similar to any project with the complexity and funds like the Linux Kernel Project: Recurring sponsorship funds in around $1M+ per month with 1,000 core developers and 10,000 external active developers at other companies also using this Rust browser.
This can also be turned into a Rust consultancy offering their services and expertise around security and Rust which also funds the development of this "Rust Browser" marketed as "more secure than Chrome".
That whole idea sounds almost unrealistic if starting from scratch, but it has worked for the Linux Kernel Project and Red Hat. It sounds more possible if it were spun out of an existing large company. But would it be open-source? I don't think so.
> Only a very limited subset of CSS is currently supported
Here's hoping that never changes, and that they never add JS support. If a browser like this released a spec for which HTML and CSS it supported, I would gladly make my static sites compatible. There's no reason sites like Wikipedia couldn't do the same.
I already build my personal sites to work on IE5 and up. If a 486 can do a competent job of drawing a page with some text and a few images, what possible gains do I get by piling more complexity on top?
I'm actually concerned that CSS or any other control over styling that is foisted upon the client is where the slippery slope begins. Once you add styling, the photoshop driven developers arrive and demand pixel perfection, then the never ending demands of more bloat and complexity follow.
I used to look at Gopher (and more recently Gemini) as too stunted to be useful, but perhaps they are right to nip these things in the bud.
I'd say something about just wanting PDFs with text reflow and hyperlinks, but apparently the PDF standard includes a JavaScript library for some inexplicable reason.