I find it incredible that people still think HTML, a spec first published in 199...

grumbel · on Sept 6, 2022

There is nothing wrong with HTML evolving, the problem is that it has evolved in the completely wrong direction. HTML should be a document language, yet it has barely improved in that in the last 20 years and is still missing really basic features (e.g. long-form document support). Instead it can talk to my Bluetooth devices and my camera. Something went very wrong in the evolution of HTML.

It's nice that we have a portable app platform now, but I really wish that would be a separate thing from the document centric Web, which slowly but surely is getting killed and replaced by a bunch of always-online apps.

onion2k · on Sept 6, 2022

> HTML should be a document language, yet it has barely improved in that in the last 20 years and is still missing really basic features (e.g. long-form document support).

I'm not sure what features you mean. Most things I can think of for long form content are right there in HTML.

> Instead it can talk to my Bluetooth devices and my camera. Something went very wrong in the evolution of HTML.

That's not HTML though. That's not even really browsers and JavaScript. It's just Google and Chrome, and even then it's mainly there so it's available in ChromeOS more than in Chrome-the-browser. They just happen to share the same 'engine'. A random webpage is not hax0ring your Bluetooth devices. HTML and the browser APIs are very tightly coupled, so sometimes it's hard to see where one stops and the other begins, but Web Bluetooth has nothing to do with HTML. You only need look at the compatibility chart to see that - https://developer.mozilla.org/en-US/docs/Web/API/Web_Bluetoo...

grumbel · on Sept 6, 2022

> I'm not sure what features you mean. Most things I can think of for long form content are right there in HTML.

If you publish a book as plain HTML scrolling becomes impossible, as any tiny movement will catapult you numerous pages forward. You can't bookmark a scroll position either and neither can you link to it. If you split the document into multiple HTML files, you completely break the ability to search across the whole document. Performance also breaks down with long-form documents.

There is of course .epub, which fixes some of those shortcomings, but epub is not part of the Web, not supported by any browser and requires a separate app. So it really doesn't fix the fundamental problems either.

There is also 'link rel=next/prev' that in theory would allow working around some of those issues as well, but that hasn't been supported in any browser as far as I know.
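
For illustration, here is a minimal sketch of how a client could use 'link rel=next' to treat a split-up book as one searchable document. This is not something any browser is claimed to do; the in-memory `pages` mapping and the names `find_next` and `search_book` are hypothetical stand-ins for real fetched files and real tooling.

```python
from html.parser import HTMLParser


class NextLinkFinder(HTMLParser):
    """Records the href of the first <link rel="next"> in a page."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.next_href is not None:
            return
        attr = dict(attrs)
        # rel is a space-separated list of link types, e.g. rel="next prefetch"
        rels = (attr.get("rel") or "").lower().split()
        if "next" in rels and attr.get("href"):
            self.next_href = attr["href"]


def find_next(html_text):
    """Return the rel=next target of a page, or None if it has none."""
    finder = NextLinkFinder()
    finder.feed(html_text)
    return finder.next_href


def search_book(pages, start, needle):
    """Follow the rel=next chain from `start` and list the pages containing `needle`.

    `pages` maps a file name to its HTML source (an assumption for this sketch).
    The match runs over raw HTML, so it can also hit markup, not just visible text.
    """
    hits, seen, current = [], set(), start
    while current in pages and current not in seen:  # `seen` guards against loops
        seen.add(current)
        if needle.lower() in pages[current].lower():
            hits.append(current)
        current = find_next(pages[current])
    return hits
```

Called as `search_book(pages, "ch1.html", "whale")`, this walks chapter by chapter until a page has no next link, which is roughly the "search across the whole document" behaviour that splitting a book into files takes away.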

If HTML were any good at handling documents we wouldn't still be using PDF all the time.

acdha · on Sept 6, 2022

> If you publish a book as plain HTML scrolling becomes impossible, as any tiny movement will catapult you numerous pages forward.

This isn't true on any popular browser or platform. Single-page scrolling using the keyboard, mouse, or touchscreen is built-in for all common browsers and scrolling even extremely large documents has been fine for at least a decade. Unless you load it up with tons of JavaScript, even documents running to thousands of pages will be at least as snappy as a PDF.

> You can't bookmark a scroll position either and neither can you link it.

You can't, and also wouldn't want to since that's inherently unstable (e.g. I resize the window slightly and all of the links break). What you can do is what people have been doing since around 1993 and put anchors on logical sections, paragraphs, etc. so you have a linkable stable anchor which doesn't depend on the window or font size.
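
As a rough sketch of the "anchors on logical sections" approach, the snippet below derives an id from each heading so that a link like `book.html#chapter-1-the-beginning` keeps working regardless of window or font size. The slug scheme and the names `slugify`, `assign_anchor_ids`, and `render_heading` are illustrative assumptions, not a convention any particular site is known to use.

```python
import html
import re


def slugify(title: str) -> str:
    """Turn a heading into a lowercase, hyphen-separated id."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"  # fall back when the title has no ASCII letters/digits


def assign_anchor_ids(titles: list[str]) -> list[str]:
    """Give every heading an id, adding a numeric suffix until it is unique."""
    used = set()
    ids = []
    for title in titles:
        base = slugify(title)
        candidate, n = base, 1
        while candidate in used:
            n += 1
            candidate = f"{base}-{n}"
        used.add(candidate)
        ids.append(candidate)
    return ids


def render_heading(title: str, anchor_id: str) -> str:
    """Emit a heading whose id can be targeted by a #fragment link."""
    return f'<h2 id="{anchor_id}">{html.escape(title)}</h2>'
```

Ids built this way are only as stable as the headings themselves: renaming a chapter changes its link unless the old id is kept by hand.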

account42 · on Sept 7, 2022

> You can't, and also wouldn't want to since that's inherently unstable (e.g. I resize the window slightly and all of the links break). What you can do is what people have been doing since around 1993 and put anchors on logical sections, paragraphs, etc. so you have a linkable stable anchor which doesn't depend on the window or font size.

Except there is no way to expose such anchors to the user without cluttering up the display. Giving users the option to create a bookmark to the nearest id-ed element + offset (perhaps with extensions to let the website specify which elements to consider) is an example of document support that is missing from browsers.

Chrome does have an extension to bookmark text fragments: https://support.google.com/chrome/answer/10256233?hl=en&co=G... Something like this should be standardized and also include a nearby ID to help find the location when the text changes.
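
To make the idea concrete, here is a small sketch that builds such a link using the `#:~:text=` text-fragment syntax, optionally with an element id in front of the directive. The helper names `encode_fragment_text` and `text_fragment_url` are hypothetical, and how a given browser behaves when the text is no longer found is not something this sketch asserts.

```python
from urllib.parse import quote


def encode_fragment_text(text: str) -> str:
    """Percent-encode text for use inside a text directive."""
    # quote() leaves "-" untouched, but "-" is syntax inside a text directive
    # (it marks prefix- and -suffix), so it is escaped by hand here.
    return quote(text, safe="").replace("-", "%2D")


def text_fragment_url(page_url: str, text: str, element_id: str = "") -> str:
    """Build page#[element_id]:~:text=... for the given page and passage."""
    base = page_url.split("#", 1)[0]  # drop any fragment already on the URL
    return f"{base}#{element_id}:~:text={encode_fragment_text(text)}"
```

For example, `text_fragment_url("https://example.com/book.html", "tiny movement", "chapter-3")` produces a URL that names both the nearby id and the quoted passage.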

Most pages also don't change that significantly/often, and if they do they can just disappear entirely, so anchor links aren't that stable either. Most browsers already remember your scroll position when you refresh the page because, even if it sometimes gets you to a different part because the preceding content changed, it is still useful most of the time.

acdha · on Sept 7, 2022

> > You can't, and also wouldn't want to since that's inherently unstable (e.g. I resize the window slightly and all of the links break). What you can do is what people have been doing since around 1993 and put anchors on logical sections, paragraphs, etc. so you have a linkable stable anchor which doesn't depend on the window or font size.

> Except there is no way to expose such anchors to the user without cluttering up the display.

1. Using unobtrusive anchors isn't that bad as far as clutter goes; if you have a discreet octothorpe or paragraph symbol consistently at the start or end of a paragraph it's not the end of the world, especially if they're styled to be in the margin outside of normal text flow.

2. Since approximately late 1995, people have used JavaScript to only display those symbols on hover/focus. This works well and is pretty common around the web.

I'm not saying that there isn't room for improvement such as Chrome's extension, but when there's 3 decades of common usage for something you say can't be done it suggests that the problem isn't that it's impossible but that there isn't enough social pressure to make publishers _want_ to publish books as HTML. Examples of massive documents are easy enough to find (e.g. W3C specs) so I'd prefer to spend time thinking about why other people aren't (e.g. DRM) and what you can do about the real problems.

onion2k · on Sept 6, 2022

Those are all flaws in browsers, not HTML. In the case of bookmarks that's not even a part of HTML.

There's nothing stopping browser vendors solving the problems you're talking about but then you'll probably worry that browsers are becoming operating systems or something.

grumbel · on Sept 7, 2022

A bookmark is just a link and links are part of HTML. If you only work around those issues at the client level you aren't fixing anything, as the Web will still be crippled just the same. You need better ways to link content on the Web itself.

onion2k · on Sept 7, 2022

> A bookmark is just a link and links are part of HTML.

Bookmarks are entirely part of the browser UI. They're an entry in a database of URLs that the browser presents to the user, and when the user clicks on one it tells the browser to navigate to that address. There is absolutely no HTML involved.

JohnFen · on Sept 6, 2022

I'm not arguing that the trend is a bad one, but it does come with downsides that are significant for some. For instance, I certainly don't want every random website I might visit to have access to these abilities. It just strikes me as an unnecessary security risk.

acdha · on Sept 6, 2022

What exactly is the security risk? The browser intermediates all access and you already trust the browser to handle many other sensitive operations on your behalf. There's no way for a site to activate this without your approval or covertly without a full zero-day (probably two since most OSes also have recording indicators now).

Contrast this with where we were in the bad old days: sites used things like plugins or Flash/Silverlight which had massive attack surfaces and, for years, inadequate privacy controls or sandboxing. Part of why this is in the browser now is that the browser developers realized that things were never going to get better if they left it to the disinterested developers at companies like Adobe, and now that's just a “can you believe we used to think this was normal?” historical trivia point.

JohnFen · on Sept 6, 2022

The security risk is that advanced functionality is available to all websites. Even if the browser itself is actually a perfect sandbox (and I don't think that's a claim anyone would make), it's still a security problem because a lot of mischief can be done within those parameters, such as tracking, fingerprinting, and other forms of spying.

> Contrast this with where we were in the bad old days

In the "bad old days", you could decide not to install plugins, choose which ones to install, etc. You could customize the attack surface you're willing to present. That ability is seriously constrained now.

Understand that my complaint is about web browsers allowing websites to do these things. In effect, it's allowing any random website to have the power of a natively installed application. This is a bad thing in my view because with native applications, I could decide which ones were and were not acceptable to me. That's extremely difficult now that the browser gives that power to every website.

acdha · on Sept 6, 2022

> The security risk is that advanced functionality is available to all websites. Even if the browser itself is actually a perfect sandbox (and I don't think that's a claim anyone would make), it's still a security problem because a lot of mischief can be done within those parameters, such as tracking, fingerprinting, and other forms of spying.

No, it can't. It's an HTML attribute which enables some mobile browser UI around the standard <input type=file> behaviour. The only thing it allows for tracking is whether your browser supports that attribute, which narrows it down to about 60% of the browsers in the world:

https://caniuse.com/html-media-capture

The only way you're breaking that is if you have the kind of exploit which could break just about anything and in that case the argument would be more along the lines of “we should remove JavaScript entirely” since that's been a source of orders of magnitude more security problems.

> In the "bad old days", you could decide not to install plugins, choose which ones to install, etc. You could customize the attack surface you're willing to present. That ability is seriously constrained now.

Your choice was to enable a massive operating system-scale level of functionality in Flash/Silverlight or not be able to use many popular sites. Since those plugins were managed separately from the browser they did not follow the same secure development practices or sandboxing which the browsers were using, and many users either did not update them or did so on a schedule far slower than the update schedule Chrome or Firefox kept.

> In effect, it's allowing any random website to have the power of a natively installed application.

… with complete user control. That's exactly what you say you want in the next sentence and it's far better from a security perspective because trusting a browser's sandbox is a lot easier to evaluate than having to individually review every native application you install. Installing native applications should be seen as a relatively rare activity because they expose you to more risk and are harder to evaluate.

JohnFen · on Sept 7, 2022

> No, it can't. It's an HTML attribute which enables some mobile browser UI around the standard <input type=file> behaviour.

I wasn't talking about this particular ability. I was talking about the risk of browsers being effectively operating systems in their own right. That puts me in an untenable position -- I have to determine the safety of every web site, which is a thing that is effectively impossible to do.

> with complete user control.

Not even close. I have to install and use several addons in order to muster a somewhat reasonable amount of control. And even then, the control I have is far from "complete".

> it's far better from a security perspective because trusting a browser's sandbox is a lot easier to evaluate than having to individually review every native application you install.

It's certainly easier to "trust the sandbox", except that the sandbox allows far too much nefarious activity in order to be able to trust it.

Modern web browsers are inherently problematic because of the combination of presenting powerful abilities to websites, and that it's impossible to know if a website is using those abilities responsibly (and most -- especially large commercial websites -- aren't, as near as I can tell).

acdha · on Sept 7, 2022

> > with complete user control.

> Not even close. I have to install and use several addons in order to muster a somewhat reasonable amount of control. And even then, the control I have is far from "complete".

You have per-site authorization for each different feature, and those can be one-time, ongoing, one prompt per day, etc. Now, yes, I'm sure you can come up with some policy like “I want to only allow camera access when I'm holding down the left meta key” but it's important to remember that the alternative is people downloading and running native code. Do you really think that's going to be _better_ than a modern browser?

> Modern web browsers are inherently problematic because of the combination of presenting powerful abilities to websites, and that it's impossible to know if a website is using those abilities responsibly (and most -- especially large commercial websites -- aren't, as near as I can tell).

The browsers fully disclose when something is being used. You can't tell what sites do with the data but that would be the case in any alternative model and I trust the browser UI a lot more than a prompt which that company implemented in their own app.

dymk · on Sept 6, 2022

It's a shim on top of a file picker dialog. Instead of taking a picture and picking the file from its saved location, it skips a step and lets you directly supply the file from the OS's camera app.

unixbane · on Sept 7, 2022

>Why wouldn't we want to be able to [run interactive apps] web with our devices?

Because the web is for hypertext, which means static content. Images, text, video and audio clips, tables, code snippets, etc. Of course even that subset it does unimaginably bad, okay? Interactive web apps fucking suck, so that takes care of that 50%. For the other 50%, which is document viewing, why in the hell would I want pages to be able to move stuff around and create their own custom UIs and color schemes for every document I view? That use case is for magazines, a content-free medium. It's actually hilarious how people think hosting their library documentation on some stupid website like readthedocs.io, or publishing scientific journals behind IP blocklists (aka misconfigured bullshit from some charlatan sysadmin) is "progress".

onion2k · on Sept 7, 2022

> Because the web is for hypertext, which means static content.

That's what hypertext meant originally but it's evolved and moved on. The spec is literally called "HTML The Living Standard". You don't get to claim it's fixed and can't change. It just isn't.

hardnose · on Sept 7, 2022

The distinction is between growth and creep. Growth is when your spec matures to serve the original purpose even better than it used to. Creep is when your spec decides to achieve other, new purposes.

HTML has been a victim of ridiculous amounts of creep, to the detriment of growth. Creating static web content isn't really easier or cleaner now than it was in the 90s, but the protocol DOES now permit the remote end to interrogate your computer for almost as much info as the freakin' SysInternals suite.