Hacker News
I'm betting on HTML (catskull.net)
968 points by catskull on Aug 2, 2023 | 446 comments



HTML is the solution to walled-garden lock-in? What? Those walled gardens already use HTML, including some of the semantic elements mentioned (plus ARIA semantic attributes, which are much more sophisticated).

> ChatGPT-like interfaces are likely the future of human data access.

And the whole point of artificial intelligence systems is that they don't require specialized "machine-readable" annotations in order to process input. ChatGPT (and its future offspring) can navigate regular websites the same way humans do. They don't need us to hold their hand. They know when a sequence of paragraphs constitutes a "list", without it having to be explicitly marked as such, etc.

What the author appears to be describing is simply an API mediated through HTML semantic elements. But if you have an API, you don't need a Large Language Model for automatic data access – a good old Python script using Beautiful Soup will do just fine. And it has the added benefit that it runs entirely locally.
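For instance, a minimal sketch using only the standard library (Beautiful Soup gives you a friendlier API for the same job; the HTML here is a made-up example):

```python
# Pull list items out of semantic HTML with stdlib html.parser alone.
from html.parser import HTMLParser

class ListExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_li = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_li = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False

    def handle_data(self, data):
        if self.in_li:
            # Accumulate text inside the current <li>.
            self.items[-1] += data

parser = ListExtractor()
parser.feed("<ul><li>alpha</li><li>beta</li></ul>")
print(parser.items)  # -> ['alpha', 'beta']
```

No network, no model, runs entirely locally.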


This seems reductive to me.

There's "HTML" and then there's the kind of website where the final DOM isn't known until the user has already been attempting to read it for 10 seconds. There is a substantial difference in the % of browser capabilities that need to be exercised between the extremes of use.

Complexity of implementation is what ultimately separates the good from the bad. Any tool can be operated skillfully or poorly. An apprentice with a circular saw and a fully charged battery can do a hell of a lot of damage. A master may elect to use no tool at all and simply bang on the side of the thing (i.e., push back on the business).

The latest websites I have built are some of my most compatible ever. I don't use web sockets anymore. I don't depend on JavaScript to have piecemeal conversation with the server. You can actually use ~80% of the product with JavaScript entirely disabled. How are the engineering choices demonstrated here not exactly the solution for walled garden lock-in?


Lots of websites already present a machine friendly site to google-bot and some other spiders.

I don't see why they can't offer the same to other bots and only serve the Javascript-heavy pages to humans.


The idea of maintaining two entirely different versions of the same site gives me flashbacks to the days when sites had a separate "m." codebase for mobile. There's a reason we found better solutions there: building and maintaining the same site twice is almost never worth it.


This is why you wouldn't build "two entirely different versions of the same site", you'd build one version and toggle the <script> tag on and off.


That may not work for a lot of sites that depend on client rendering and don't server-render the full page content

It's really easy for this to break and go unnoticed for a while as well. You could run tests against the static version, but I wouldn't be surprised at all to see them "temporarily" disabled because a new feature needs to go live and something is breaking in the static tests.


> That may not work for a lot of sites that depend on client rendering

Products that depend on client-side rendering don't deserve to have regression-free experiences. You are literally doing layout with javascript and wondering why things get funny on edge case clients.

The web is fucked until it becomes truly popular to build vanilla, SSR applications again. I feel like we are almost at the end of the tunnel of client-side hell, but perhaps some aggressive final pushes could help shift the narrative.

The server is fast. Stop doing your layout on the client. Use media queries to address the wide range of viewport dimensions. You can have a responsive website, installable as a PWA on home screen of any mobile device and also as a 4k detailed desktop layout with 0 lines of JS required. All you have to do is stop outsourcing your independence to framework vendors and pick up the MDN bible.


This sounds regressive and the battle has been over for almost a decade now. Server-side rendering is silly and client-side rendering should not be used for the entire page. Even for complex web apps, the majority of the HTML is static.

The best web experiences are static HTML with client-side rendering only used for the dynamic sections of the page. It's not even a choice to do it any other way anymore if you care about a11y and SEO.


> This sounds regressive and the battle has been over for almost a decade now

This sounds like what google would like for everyone on HN to believe.

Using "a11y" and "SEO" to push bad technology abstractions is tantamount to petty bullying in my view.

Genuinely, I don't understand the position that SSR somehow makes accessibility worse. Can you walk me through how adding more javascript on top somehow solves the problem of making a website compatible with a screen reader?


> Can you walk me through how adding more javascript on top somehow solves the problem of making a website compatible with a screen reader?

I didn't say to add javascript to make the page more accessible. I said that a static HTML page is most accessible and should be strongly preferred over any dynamic content regardless of how it's rendered. Screen readers can misannounce dynamic elements and leave the user confused about the state of the page.

But when dynamic elements do need to be reannounced due to an event, refreshing the page would be a terrible experience since the screen reader loses focus and starts back from the top of the page. Aria alerts also require javascript. It makes perfect sense that if you're pushing out an aria alert with js already that all that rendering logic should also go on the client side.

https://developer.mozilla.org/en-US/docs/Web/Accessibility/A...

As for SEO, I'm specifically talking about good metadata in the head tag, a static page that's comprehensible, and a sitemap. Static pages are better than SSR for this because SSR doesn't always respond with the same page for the same URL.


> It's not even a choice to do it any other way anymore if you care about a11y and SEO.

Can you explain this?


Sure; I guess it depends if we're talking about a web site or a web app.


The point is that a lot can be done without the need for JavaScript, including in pages served to humans.


I don't see why they can't offer the same to humans, at least as an option.

There are a few sites I impersonate Googlebot to, they're much more usable that way.


> ChatGPT (and its future offspring) can navigate regular websites the same way humans do

That seems to be, at the least, a not-yet-true claim, but let's ignore that for now. It would be interesting if LLMs actually could do this. As I understand it, LLMs are trained on texts and source code, among other things. But lacking... let's name it a reasoning apparatus, can they really look at a DOM tree and tell what it is/does? It's not text, and it's barely the kind of well-structured source code they were trained on (it's the result of either bundling or componentization). This is almost on par with "LLMs will look at any .exe and be able to integrate with it immediately".


> “LLMs will look at any .exe and be able to integrate with it immediately”.

The LLM will look at any .exe and determine if it halts.


> The LLM will look at any .exe and determine if it halts.

At that point you'd need an AGI that can figure out something we can't.

Edit: but yeah, sarcasm. Nowadays I can't even tell sometimes.


I am 100% sure that the parent poster was being sarcastic about the hype around LLMs and how many people think they can solve everything, even if said thing is impossible, e.g. the halting problem.

Do note that the halting problem is fundamentally true; no AGI will find some new way around it, unless our mathematics are flawed to the core.


There are some nuances here. While the halting problem for a general Turing machine is undecidable, with a fairly easy-to-understand proof as well, the computers we run today are not general Turing machines. They are of a weaker class called linear bounded automata, and for the programs they can run, the halting problem is in fact decidable, on a theoretical level, due to their finite nature.

So we will probably never practically solve the halting problem for LBAs, but the quest for an AGI that could solve it is not just daydreaming; it's rooted in the theory.


It’s fun to think about. What about the Collatz conjecture? If we run it for arbitrary n, on a computer that represents numbers up to n_max, we could know if it will halt within n_max steps. Since only n_max numbers are representable, we could track all visited numbers to detect any cycle that might occur. If on the other hand any iteration would exceed n_max, then the program would halt by crashing.

  -- One hailstone step.
  hailstone :: Integer -> Integer
  hailstone n
    | n `mod` 2 == 0 = n `div` 2
    | otherwise      = 3 * n + 1

  -- Iterate until we hit 1 (no cycle detection or overflow handling here).
  collatz :: Integer -> Bool
  collatz n
    | hailstone n == 1 = True
    | otherwise        = collatz (hailstone n)
Edit: however, you would need enough disk space to store the cycle-detection index.
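A Python sketch of the same idea, with the bound and the cycle-detection index made explicit (n_max standing in for the machine's largest representable number; the function name is my own):

```python
def collatz_halts(n, n_max):
    """Return True if the hailstone sequence from n reaches 1,
    False if it cycles or overflows the "machine" bound n_max."""
    visited = set()  # the cycle-detection index mentioned above
    while n != 1:
        if n in visited:
            return False   # revisited a number: provable cycle
        if n > n_max:
            return False   # would "crash" on a bounded machine
        visited.add(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return True

print(collatz_halts(27, 10**6))  # -> True (the sequence peaks at 9232)
print(collatz_halts(27, 1000))   # -> False (overflows the bound)
```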


We have actually done something like that up to the 32bit n_max for sure (but maybe even 64bit?), without any number that would contradict the conjecture.

But yeah, it is not even trivial to say whether it has bounded memory use, so with an arbitrary-precision int type it may not be an LBA but a full Turing machine?


I guess I was thinking about indexing the visited numbers from a graph-traversal-algorithm-interview perspective instead of a CS theory one.

We can actually do it in O(1) space complexity in exchange for higher time complexity.

For i in range 1..n, compute n_i, the ith number in the hailstone iteration, and then continue the hailstone iteration n_(i+1)..n_max to see if n_i is equal to any of them. This takes us to quadratic time complexity, O(n * n_max), or constant complexity depending on how you look at it, but it only requires storing a single n_i at a time for cycle detection.

But then again, if you actually loop for n_max iterations without halting by reaching 1 or crashing, then you had to reuse a number somewhere, so the explicit cycle detection isn’t really important.
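That quadratic-time, constant-space check can be sketched like so (a Python illustration of the scheme described above; names and structure are mine):

```python
def hailstone(n):
    return n // 2 if n % 2 == 0 else 3 * n + 1

def reaches_one_bounded(n, n_max):
    """O(1)-space variant: for each n_i in the iteration, re-run the
    rest of the iteration to see if n_i repeats (a cycle), instead of
    storing all visited numbers in a set."""
    x = n
    for _ in range(n_max):          # at most n_max distinct values fit
        if x == 1:
            return True
        if x > n_max:
            return False            # overflow on a bounded machine
        # Cycle check: does x reappear later, within the bound?
        y = hailstone(x)
        for _ in range(n_max):
            if y == x:
                return False        # cycle detected, no index needed
            if y == 1 or y > n_max:
                break
            y = hailstone(y)
        x = hailstone(x)
    return False

print(reaches_one_bounded(27, 10**5))  # -> True
print(reaches_one_bounded(27, 100))    # -> False (exceeds the bound)
```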


I have to admit my knowledge of complexity theory doesn't extend too far, but isn't the solution for LBAs just... brute forcing? Also, is it even decidable a priori whether a program is an LBA vs. requiring a tape that is not just a linear function of its input?


The worst case solution is just brute forcing, yes.

An LBA is effectively "just" a Turing machine that has a finite tape.

A typical current computer is an LBA only if you disallow all IO of any sort, or bound that IO and include it as part of the system you analyse (thereby fixing the values that will be provided as IO). That is of course a very unusual situation, and so the constraint does not really make the halting problem more tractable in situations we usually care about.


The naive way of deciding if it halts is executing it and keeping track of all states between steps. If the machine halts, you're done; if it repeats a state, it's looping.

If you can represent a program with an LBA, it is by definition a context-sensitive language and decidable. You could also show it by making an equivalent grammar for the language, one that accepts the same input as the program. This grammar must be constructed in a certain way, and then you know it is a context-sensitive language.
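The naive procedure, sketched for a toy machine (the step function and state encoding here are purely my own illustration):

```python
def decides_halting(step, state):
    """Naively decide halting for a finite-state machine: run it; if a
    state repeats before halting, it loops forever. `step` returns the
    next state, or None when the machine halts."""
    seen = set()
    while state is not None:
        if state in seen:
            return False   # state repeated: provably loops
        seen.add(state)
        state = step(state)
    return True            # reached the halt state

# Toy machine: counts down to 0, then halts.
print(decides_halting(lambda s: s - 1 if s > 0 else None, 9))  # -> True
# Toy machine: ping-pongs between two states forever.
print(decides_halting(lambda s: 1 - s, 0))                     # -> False
```

This is exhaustive state tracking, hence the exponential blowup mentioned elsewhere in the thread.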


A machine we have today, in isolation with no form of IO, is not a general Turing machine. But almost all of them do have IO, and so if you consider the full system (which includes all sources of input) rather than the machine in isolation, the halting problem applies. E.g. code to the effect of "while gets() {}" will either halt or not halt depending on the input, but that isn't decidable without knowing or constraining the input.

We can certainly look for AGI that can do better at deciding the halting of decidable programs, but even for current computers the general halting problem is undecidable without adding artificial constraints.


IO is just state and can be represented on the tape. You can construct a machine for every conceivable input at every time during the execution and reason about them.


"while (gets()) {}" is undecidable because the "tape" is conceptually infinite.

So, no, you can not construct a machine for every conceivable input without imposing an artificial constraint on the input size.

Put another way: For every tape you construct and analyse, there is a tape one segment longer that might contain a symbol that can alter the outcome.


You can, theoretically, keep constructing machines for every input you need. Yes you are bounded, but you can say "this machine will halt for every input it can receive in the next 100 years", because you need to quantify the input in blocks of time.


The halting problem is absolutely solvable in some cases, it's just the general case in which it is unsolvable. If LLMs were able to decode web pages and executables in a high proportion of the interesting cases, that would already be extremely useful. Humans can't do this in the general case either, but we still hire them to do jobs like this.


> If LLMs were able to decode web pages and executables in a high proportion of the interesting cases

Why would it be extremely useful?


The halting problem is one of the problems that arise from precise mathematical definitions and their outskirts. LLMs are all about making sense of unstructured text, i.e. they lie on the opposite end of the spectrum, where math has no direct purchase. So while the poster was sarcastic, they also missed the mark.


The halting problem is analogous to Gödel's incompleteness theorems, and these are fundamental truths about any such system. The matrix multiplication that deep learning does is hardly immune to that.


What. There are no proofs of convergence for many of the most popular NN optimization algos. IIRC Adam is known not to converge in some cases.

The bitter lesson is that Messy AI is better able to cope with Messy World Problems than Neat AI (by light-years at this point), not that it can hack Neat Problems.


They are fundamental truths to systems based on certain assumptions we are taking for granted, e.g. binary logic (and not e.g. quantum logic), Turing-like computing model etc. Not that deep learning has anything to do with those, but it excels in human-like properties where simple formula-derived math descriptions fail all the time.


Is a NN a computable function? Yes, as we calculate them on Turing machines. Then it is prone to every limitation of computable functions.

Humans are also limited by the Turing model’s limits, we can only ever determine computable functions as well. With all due respect, it is stupid to assign more capabilities to ML than what we know is fundamentally the limit..


While that's true, halting problem is completely useless in the real world (nobody designs user-facing apps while thinking about whether the program halts), whereas picking data from speech is a much more useful one that was long unreachable for "clean" closed-formula math.


Ok, and it is completely irrelevant. Besides, guess what enables the training of those neural networks? I’m fairly sure gradient descent has a bit to do with mathematics’ closed-formulas.

Of course ML has use cases where traditional tools are less fit, my gripe is the hype-based anti intellectual nonsense that often surrounds it. They are no magic tools, the fundamental limits these giants of math/CS discovered still apply to them and we can save ourselves from a lot of pain if we don’t bother solving unsolvable problems.


Yeah, hype is driven by business and marketing, who want to sell more of a new thing by pointing out what was not possible before, using all kinds of silly arguments. Still, there is some noticeable progress here (compared to e.g. crypto, which outside logistics and large-scale fraud didn't bring much, despite being based on super solid number-theory concepts).


I remember the hype around XGBoost and Kaggle contests asking to solve problems in prime number theory.


I think KronisLV's point, presented ironically, is that not even an AI superior to human reasoning, which can figure out things we cannot, can decide the undecidable.

Being undecidable is a hard limit, not something that requires better algorithms. Another phrasing of it could be: No finite program can decide it. (And what even is an infinite program? Not something we can run on any current computer.)

If an AI is itself a finite-sized program, something we run on computers, it cannot possibly solve the halting problem.

And _any_ non-trivial property of programs is undecidable, so an AI "integrating with any .exe file" isn't really meaningful. It's just words.


> Another phrasing of it could be: No finite program can decide it. (And what even is an infinite program? Not something we can run on any current computer.)

The complexity of the program trying to decide is not really the issue. An "infinite program" could add an infinite number of extra rules to try to cut down on the time taken to determine if a decidable program halts, but the issue with the halting problem is that there is an infinite set of undecidable programs where the size of your detector will make no difference to your ability to decide them.

E.g. "while next_symbol() == some_arbitrary_symbol {}" is undecidable unless you add constraints on the length or contents of the input.

Even if you have an infinite-sized program, you can't decide whether or not the unconstrained version of that halts, because deciding it is equivalent to being able to determine whether an infinite tape contains a given symbol, and no matter how long you scan the tape, the symbol can always be the next one after the last symbol you scanned.

> And _any_ non-trivial property of programs is undecidable

I don't think I agree with this without adding the qualifier "in general". There are a whole lot of useful properties we can decide, but often the properties will have constraints. E.g. for a whole lot of programs where we can't decide whether or not they will halt, we can still decide whether or not they will halt assuming certain properties of their inputs. E.g. we can decide the property of my pseudo-code above that it will halt IF "some_arbitrary_symbol" is in the input. We can also decide the property that whether or not it will halt in general is undecidable, and that is itself useful to know, because for many programs knowing what makes them undecidable is useful in order to suggest e.g. adding timeouts, or ensuring there are ways to bail early from certain actions without restarting the machine or killing the program.

For a whole lot of problems we also do not really care whether or not a given property is decidable. We care whether they're decidable often enough within certain time constraints, and that's a very different ballgame.


It has meaning. By looking at a table in e.g. getopt() call and an output of --help I can often infer modes of a program that I probably need, to do my job when someone asks me to. Whether this getopt() gets called or if a program does what it claims in usage() is not my concern. I was commanded to guess the usage and I do it without burying into the halting problem. And so does that hypothetical AI.

Decidable or undecidable is about maths, not practice.


I believe it was positive sarcasm in the GP.


.exe's (for the usual architectures) run on finite-state machines, not Turing machines. So, indeed, that is possible.

If the .exe connects to a service that runs on something more powerful than finite-state machine, though, I don't know.


That's still pointless pedantry, and it's even false.

A Turing machine only makes use of at most n cells of its tape after n steps, so running it for a finite number of steps is possible even in finite memory. Especially since modern computers can do arbitrary side effects, having access to the whole universe as tape, which is still finite, but so is time.

There is no way to differentiate between a magical Turing machine with infinite tape and a “fake” one that has n-sized memory under any program that takes n steps, so for all practical purposes they are identical.


Why are they identical? A program may use more steps than memory.


A Turing machine can either go left or right on its tape (some versions have a stay step as well; it doesn't matter). If all your program does is step right forever, then it will use the maximum amount of memory, but only ever a finite amount, equal to the number of steps taken.

So if you don't have infinite time (you don't), and you have big enough memory for the particular use case that the OOM killer doesn't get involved, then you have a Turing machine for all complexity-theoretical and practical purposes. Especially since RAM is not the only analogue of the Turing machine's tape: your computer has much more state than just its memory, and if it has network access, you can basically make use of a cloud vendor's whole army of servers as storage, just as an example.


The reason for undecidability of halting is that a program's state may grow without bound. If a program's memory usage is bounded, its halting can be determined. So the computer is not like a Turing machine since its programs don't use Turing machine's key feature - infinite tape.

Although the naive way of deciding halting requires exponential time in program's memory bound, AGIs will speed that up for many programs by using clever math.


i suspect this is because they can divide by zero


"ChatGPT (and its future offspring) can navigate regular websites the same way humans do"

I don't know how much deeper it goes, but it does have some context:

  Prompt: Given this html, what does an end user see?
  <div style="display:none">hello</div><div>you</div>

  ChatGPT:
  An end user would see only the text "you" on the webpage.

  The first <div> element has the inline style display:none, which means it is set to be hidden, and its content "hello" will not be visible to the user. On the other hand, the second <div> element has no specific display style, so it will be visible, and its content "you" will be displayed on the webpage.


> can they really look at a DOM tree and tell what it is/does

Yes, if you encode the DOM as a list of options for ChatGPT to choose from. In fact I developed a proof of concept of this for a client. https://jarvys.ai/ although they seem to have pivoted from automating just the browser to automating all software.


Well if the DOM is all unstructured divs with no semantic information, can a human even tell what it means without applying the structural styling on the page?

A good example would be a misguided approach to making a bunch of aligned labels with values. Someone told this poor developer that <table> is bad, so they figure, hey, let's use CSS to lay it out. They make a dictionary of the key/value pairs, iterate all the keys into the first div, and then output all the values into the second div.

div - label 1 - label 2

div - value 1 - value 2

If there's 100 key/values it's going to be hard for a human to figure out which value is for the 76th item, and LLMs have proven to be very bad at indexing problems like that so I wouldn't expect it to be a better story there.

(Not saying this wouldn't work in some cases, just couldn't be a general solution given the crap out there)


> if you encode the DOM as a list of options for ChatGPT to choose from

Not sure if I understand this, does it mean you have to pre-cook DOM in a specific way? If yes, then isn’t the answer to my question “no”, like “no, it can’t take any site and use it as is”?


You have to give GPT an objective, like "find an apartment in Florida" and then say something like "given the following options, which one would you interact with to get closer to your objective."

So if you assume that you start on google.com, then your options are like 1.) Input with name "search", placeholder "search anything", value "" 2.) Button with label "I'm feeling lucky" 3.) Button with label "search"

Obviously, doing just one of these doesn't achieve the objective - it just needs to pick which one it thinks has the most "value" for completing the objective. If you repeat that enough times, then it can actually do what your overall goal of the session was.

I'm just giving a simplistic answer, and if you implemented only what I've written, then it's going to get stuck in a loop more often than not. But that's the gist of how you could encode the DOM into something that GPT can interpret and make decisions/take actions based on.
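A rough sketch of that encoding step (the element dicts and prompt wording are my own invention, not any product's actual format; the LLM call itself is omitted):

```python
# Encode interactive DOM elements as a numbered option list for an LLM.
def describe(el):
    if el["tag"] == "input":
        return f'Input named "{el.get("name", "")}", placeholder "{el.get("placeholder", "")}"'
    return f'{el["tag"].capitalize()} with label "{el.get("label", "")}"'

def build_prompt(objective, elements):
    options = "\n".join(f"{i + 1}.) {describe(el)}"
                        for i, el in enumerate(elements))
    return (f"Objective: {objective}\n"
            f"Given the following options, which one would you interact "
            f"with to get closer to your objective?\n{options}")

# The google.com example from above, as illustrative element dicts.
elements = [
    {"tag": "input", "name": "search", "placeholder": "search anything"},
    {"tag": "button", "label": "I'm feeling lucky"},
    {"tag": "button", "label": "search"},
]
print(build_prompt("find an apartment in Florida", elements))
```

The model's reply (an option number) would then be mapped back to a click or keystroke, and the loop repeats.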


Remember HATEOAS? I have a feeling LLMs would excel at navigating proper REST (not "RESTful") APIs - HATEOAS is, in principle, just what you did here: providing a list of possible/useful next steps along with the response.

In fact, the problem of HATEOAS is exactly what LLMs seem to be good at - inferring the interface at runtime, from dynamically received metadata. This should even be easy to try in practice today - HATEOAS can be trivially mapped to the "function calling" feature of OpenAI's GPT-3.5/GPT-4 APIs.
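The mapping is almost mechanical; a sketch, with a HAL-style response body I invented, and the tool schema shape taken from OpenAI's function-calling docs (verify the exact fields there):

```python
# Turn HAL-style HATEOAS links from an API response into tool
# definitions in the shape OpenAI's function-calling API expects.
def links_to_tools(links):
    tools = []
    for rel, link in links.items():
        if rel == "self":
            continue  # "self" is not a useful next step
        tools.append({
            "type": "function",
            "function": {
                "name": rel.replace("-", "_"),
                "description": f'Follow the "{rel}" link: {link["href"]}',
                "parameters": {"type": "object", "properties": {}},
            },
        })
    return tools

# Invented example response for an order resource.
response = {
    "_links": {
        "self": {"href": "/orders/123"},
        "cancel": {"href": "/orders/123/cancel"},
        "payment": {"href": "/orders/123/payment"},
    }
}
tools = links_to_tools(response["_links"])
print([t["function"]["name"] for t in tools])  # -> ['cancel', 'payment']
```

Each response would regenerate the tool list, which is exactly the "inferring the interface at runtime" property.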


Got it, thanks!


There's already companies working on this: https://axiom.ai/

I would guess that it's just a matter of converting the DOM accessibility tree into text descriptions, e.g. "There is a button that says 'Start'" And then converting text like "Click the Start button" back into an actual action on the page.
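A toy sketch of that round trip (the node shape and the instruction matching are purely illustrative):

```python
# Accessibility-tree nodes -> text descriptions, and an LLM-style
# instruction -> back to a node to act on.
nodes = [
    {"role": "button", "name": "Start"},
    {"role": "link", "name": "Help"},
]

def to_text(node):
    return f"There is a {node['role']} that says '{node['name']}'"

def from_instruction(instruction, nodes):
    # Naive match: the node whose accessible name appears in the text.
    for node in nodes:
        if node["name"].lower() in instruction.lower():
            return node
    return None

print([to_text(n) for n in nodes])
print(from_instruction("Click the Start button", nodes))
```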


An LLM would find it easier to browse through a browser, like us. That implies an image model is linked in.


I got the sense that the article was advocating for using good semantic machine readable content over HTTP instead of (or at least in addition to) JS-only div soup so that automated agents like the new wave of LLMs can easily pull out the important details without spinning up a headless browser and rendering the page first.

I have the same interest but for the purpose of crawling and upstart search engines. If indexing every page required running the page in a headless browser first, the barrier to entry for new search engines is a lot higher.


Valid point. Ironically the main benefit of semantic markup is now an abstraction to help the human developers effect styling and control.


Isn't the main benefit of semantic markup still accessibility?


That’s one of the main benefits, but if the machine can make sense of the content, it can still present it however is clearest for the user.


But explicit, human-verified metadata is always going to beat inferred, fuzzily-extracted data, surely?


Maybe in some cases, but that requires time and expertise. It may not be worth it in many cases.


> ChatGPT (and its future offspring) can navigate regular websites the same way humans do. They don't need us to hold their hand.

It can?! It does that?


ChatGPT had a Browser Plugin (via the Plus subscription), but last I checked it had been removed (possibly because it cannibalized Bing, or just due to the reliability issues).


In its current form, not as far as I know. Even the OpenAI tutorial on this matter scrapes first and uses only normal text.

https://github.com/openai/openai-cookbook/blob/main/apps/web...


> They know when a sequence of paragraphs constitutes a "list", without it having to be explicitly marked as such, etc.

that's the thing; they don't know - they guess


Humans do the same, it’s usually obvious from the context


i guess there is something to be said in favor of humans and computers meeting half way


AI in general could "see" a website like we see it. But LLMs specialise in text. They will find it much easier to figure out the correct semantic interpretation of tag soup than deriving the same information from the rendered output like we do.

Also, the big question is cost. I think semi structured text will forever be far cheaper for an AI to process than a completely unstructured data stream representing visual information.


> And the whole point of artificial intelligence systems is that they don't require specialized "machine-readable" annotations in order to process input.

I don't require a cup to hold my drink. That doesn't make cups useless or undesirable.

Point being: if a machine can make sense of the veritable clusterfuck that is the "pile of infinitely-nested divs" status quo, then surely it'd have a much easier time making sense of pages that actually use HTML properly. If you're an AI trying to figure out how far along something is, which is gonna be a more obvious indicator?

    <div id="reactElement420" class="wangularClass69"><div class="25-long red" visibility="hidden"/><div class="50-long yellow" visibility="hidden"/><div class="75-long green"/></div>

    <progress max="100" value="69">69%</progress>
The first example is hyperbolic, sure, but only slightly.


You don't need a good old Python script using Beautiful Soup; you can use an LLM, with the added benefit of not dealing with that cruft.


An LLM is much slower and requires more resources at runtime than Beautiful Soup (if you are hosting yourself), and could fail to retrieve data. It's a worse solution unless your HTML parser/scraper is broken by website changes.

This is only the case for now, however. I expect this to change in favour of LLMs as time goes on and they become easier to deploy and better at their job.


Completely agree with you. The magic of LLMs is being able to chuck it an unstructured mess of data and have it parse and interpret this data in some semblance of the way that a hooman would.

LLMs will only get smarter, and we can only guess what comes after the transformer model anyway. LLMs at the moment have the very obvious problem of appearing smart but not quite getting all the way there; I think this is pretty much down to them simply trying to predict the next token at their core. Perhaps we'll see something smarter when we figure out that an LLM should only be used for output of semantic natural language and not ideation/conceptualisation (which should be handled by a separate model that deals in abstract concepts).


But if they first need a browser to chew through the usual tens of megabytes of javascript so that they can finally get the 100KB DOM to parse through, they'll be way less efficient (and often run into the token limits) than if the page has some nice semantic HTML with some optional JS for interactivity (that the AI doesn't need).


Article's first sentence: "With the advent of large language model-based artificial intelligence, semantic HTML is more important now than ever."

I think the sentence "With the advent of large language model-based artificial intelligence, semantic HTML is less important now than ever." is far more defensible. The semantic web has failed and what replaced it was Google spending a crap ton of money writing a variety of heuristics equipped with best-of-breed-at-the-time AI behind it. As AI improves, it improves its ability to extract information from any ol' slop, and if "any ol' slop" is enough, it's all the effort people are going to put out. Eventually in principle both the semantic web and that pile of heuristics are entirely replaced by AI.

(Note also my replacement of LLM with the general term AI after my first usage of LLM. LLMs are not the whole of "AI". They are merely a hot branch right now, but they are not the last hot branch. It is not a good idea to project out the next several decades on the assumption that LLMs will be the last word in AI.)


Are you suggesting that AI will solve web accessibility, which is based on semantic HTML and ARIA? Because if not, humans will still be required to ensure that web content is accessible, and in that case semantic HTML remains important.


Actually, that sounds like one of the better startup ideas I've heard around AI. Automated accessibility compliance (or something close to it) would be very useful and definitely something people would pay money for.

I fear LLMs are only about 80% up to the task, though, which is actually a very unpleasant place to be in that curve; sort of the moral equivalent of the uncanny valley. Whatever comes after LLMs though, I bet they could do it, or get very close.


80% sounds way too optimistic to me. The problem is that screen readers (and other assistive technology) have bugs and different behaviors, and some people use older versions of those tools with even more bugs and quirks. The only way to make sure that a website has a high level of accessibility is to perform manual testing in different environments. I don't see how AI can solve this problem. And the people who perform the manual testing need to be experts in semantic HTML and ARIA to be able to identify problems and create reports. That means that semantic HTML remains important.


>80% sounds way too optimistic to me. The problem is that screen readers (and other assistive technology) have bugs and different behaviors, and some people use older versions of those tools with even more bugs and quirks. The only way to make sure that a website has a high level of accessibility is to perform manual testing in different environments.

That's if you want actual accessibility support on a wide range of old and new devices.

But the business idea the parent proposes is automated accessibility for compliance, which is the real thing that could be sold, and has a much lower bar.


My estimate of 80% included 6-12 months of serious development first, and a certain amount of budget for manual intervention for the first several dozen jobs. Certainly just flinging HTML at ChatGPT as it stands today would do nothing useful at all. Providing manual testing could easily be done as part of a higher service plan. Not only is there no rule that a startup using AI has to be just in the form of "throw it at the AI and then steadfastly refuse to do anything else", that's probably a pretty good way of filtering out the ones that will make it from the ones that won't.

Do assistive technologies have more "bugs" and "quirks" and "different behaviors" than natural text? I don't really think so. In fact I'd expect they have qualitatively fewer such things.

Semantic HTML would be important in this case... but it would be important as the output, not the input.

This hypothetical startup could also pivot into developing a better screen reader pretty easily once they'd built this, but there would be a good few years where an AI chewing on the HTML and HTML templates in use by a server would be practical, while you can't expect every assistive-technology user to be running the model locally on a 64GB GPU. Certainly that would factor into my pitch deck, though.

I'd give more credence to the "it has to be perfect to be useful at all" argument you're going with here if it weren't that I'm pretty sure every user of such technology is already encountering a whole bunch of suboptimal behavior on almost every site today.


LLM could transform "bad HTML" into good HTML; add ARIA tags, add image captions, etc.
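To make that concrete, here's a hypothetical before/after of the kind of transform such a tool might apply. The class name, handler, and caption text are all invented for illustration:

```html
<!-- Before: div soup with no semantics -->
<div class="btn" onclick="save()">Save</div>
<img src="chart.png">

<!-- After: what an automated enrichment pass might emit -->
<button type="button" onclick="save()">Save</button>
<img src="chart.png" alt="Bar chart of monthly revenue, January through June">
```

The button fix is mechanical; the alt text is the hard part, since it requires actually understanding the image.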


Unless it’s 100% reliable or near 100% reliable, you’d still need manual testing. Right now, automatic accessibility testing can’t even detect most accessibility issues. So we haven’t even reached the stage where all issues are detected by tools, and probably never will. Fixing all issues automatically is significantly harder than detecting them.


Given how bad accessibility is, it seems like even something imperfect could be a big leap forward for a lot of sites.


>Unless it’s 100% reliable or near 100% reliable, you’d still need manual testing.

Not unless:

(a) it's X% reliable now, and it would be Y% < X% if done via LLMs.

(b) businesses actually care for increased reliability, and not just for passing the accessibility requirements.

Most businesses could not give less f...., and don't do "manual testing" today either. Just add the token required tags. That's true even when they do business with the government (which mandates this even more highly).

LLM-driven accessibility info would be an improvement.


The idea generalizes. Imagine an archiver which applies a transform to a site. Adding semantic markup - or censoring parts that someone finds offensive. If the original author agrees, they might offer an api so the transformation is linked to by the original. Or perhaps the transformer could make an agreement with/fool Google into linking to their version rather than the original. Perhaps because it's "safer".

Oh yes, a great startup idea.


Someone's on it already (but maybe there's room for competition, if https://adrianroselli.com/2020/06/accessibe-will-get-you-sue... is any indication): https://accessibe.com/accesswidget/artificial-intelligence


An LLM-based accessible browser could render 80% of the Web accessible at once, if the tech works.


If semantic HTML is important for accessibility and for letting software parse information out of pages, and AI solves the latter, then semantic HTML is now less important, because some of the use cases that previously needed it no longer do. If you take "less important" as a moral/value statement instead of in terms of total utility provided, and assume that AI will have zero accessibility benefits, it will merely be as important as today, which is still at odds with the original article's assertion that it would become more important. N.B. that assumption seems doubtful, given how e.g. you can now paste a bunch of code into an LLM and ask it questions quite naturally -- something I can easily see adapted to, e.g., better navigating apps using only voice and screen readers.


This has been tried and doesn't work, which doesn't mean it will never work in the future! There are a few companies offering solutions in this space, but they don't work, are often worse than the problems they're trying to solve and are a privacy disaster. The companies peddling them often engage in shady business practices, like falsely claiming that their overlays can protect you from ADA lawsuits[1], while actually suing the people who expose their lies[2]. Most accessibility practitioners and disabled users themselves are warning the public to avoid those tools[3].

[1] https://adrianroselli.com/2020/06/accessibe-will-get-you-sue... [2] https://adrianroselli.com/2023/05/audioeye-is-suing-me.html [3] https://overlayfactsheet.com


AI will solve web accessibility via screen readers that summarize visual content, ignoring ARIA and making it irrelevant. Multimodal GPT-4 can take a screenshot JPEG and answer questions about what's in it (buttons, links, ads, headers, etc). The future of accessibility is rendering the DOM to a JPEG and asking GPT to be your eyes; we'll look back on semantic markup as a failed idea that was never going to work.


I am curious about what post-LLM SEO is going to look like.

> The semantic web has failed and what replaced it was Google spending a crap ton of money writing a variety of heuristics equipped with best-of-breed-at-the-time AI behind it.

Arguably, there were insufficient incentives to fully adopt semantic HTML, if your goal was just to have the most relevant parts of your content indexed well enough to get ranked.

> As AI improves, it improves its ability to extract information from any ol' slop, and if "any ol' slop" is enough, it's all the effort people are going to put out.

If the goalpost shifts from “getting ranked” to “enabling LLMs to maximally extract the nuance and texture of your content”, perhaps there will be greater incentive to use elements like <details> or <progress>. Websites which do so, will have more influence over the outputs of LLMs.

Feels like the difference between being loud enough to be heard vs. being clear enough to be understood.
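For reference, the two elements mentioned above look like this; the content is invented:

```html
<!-- Machine-legible grouping: the summary/detail relationship is explicit -->
<details>
  <summary>Shipping policy</summary>
  <p>Orders placed before 2pm ship the same day.</p>
</details>

<!-- Task progress expressed as data, not as a styled div -->
<progress max="100" value="70">70%</progress>
```

An LLM (or any parser) reading this markup gets the structure for free, instead of having to infer it from class names and layout.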


> The semantic web has failed and what replaced it was Google spending a crap ton of money

Aren't schema.org and Wikidata/Wikipedia still powering most of Google's rich search results?

I heard them announce the new result page with Bard, but I probably didn't see it because of ad-blindness, or it's not yet released in my location; I'll have to look this up...


>Aren't schema.org and Wikidata/Wikipedia still powering most of Google's rich search results?

Were they ever?


Well, schema.org was not referring to an organization or entity, but to its published schemas. I'd argue these were and are driving a lot of rich results, especially for local businesses.


Yes.


AIs are magic to me. I've always thought the human faculty of pattern recognition was pretty unique and hard to replicate. We use it when scanning the slop on websites to do some kind of data extraction. In my brain I was part of the semantic web camp, but you are right: if machines can seemingly make sense of the slop, then why bother?


agree so much. Projects that aim to build a data resource and then let AI use that resource are missing the point. The AI is the data resource.

Some projects claim that knowledge graphs or other data assets can help the AI retrieve 'true' knowledge. Personally, I believe that the better approach is to develop methods that allow AIs to create their own data assets; the weights in their networks are one of those assets.

The question of truth is still a very hard one. How do you tell an AI that some knowledge is more trustworthy than other knowledge? People have this issue too though.


While the issue of "truth" is interesting and important, it is also fairly orthogonal to the task of simply extracting what a given page or bit of content claims. (Perhaps not 100% orthogonal in the absolute limit, but generally so.)

As absolutely hard as I have gone against the semantic web community at times over the past few years, I do not in the slightest hold a failure to "determine truth" against them. I consider them to have been tilting at windmills as it is; criticizing them for failing to conquer that windmill, which humanity has been jousting with since the dawn of recorded history (and probably beyond), would be a degree of cruelty I couldn't entertain. :)


If you're relying on a stochastic process like network weights to encode truth then I have some oil to sell you.


What projects are trying to use knowledge graphs to retrieve truth? I've been playing around with that approach. How do you encode your own "truth" that may be different from another's?


> The semantic web has failed

literally by no metric is this true other than tech bros saying it on HN. The entire internet is powered by websites using semantic markup and clients querying it.


I had heard of almost none of these HTML elements, and that's such a shame, because they could seriously help put the "we need JavaScript for every gosh darn thing" ecosystem to an end (or at least return JS to what it was originally meant to be: a way to add some flair, some interactivity, some whatever, but not necessarily a replacement for all of your markup and a full-DOM manager).

I'm starting to think my dream browser might be something like visurf https://sr.ht/~sircmpwn/visurf/ but with the underlying Netsurf engine updated to support various modern HTML+CSS, such as these elements. I bet you could have a nearly JS-free smolweb through that browser that:

1. is more accessible (in the "doesn't break screenreaders, system theming, keybinds, etc" way)

2. could be made to use way fewer resources than these heavy JS contraptions these elements can replace

3. would still be able to do most things we expect the median web app of today to do (sure, fire up Firefox for WebGL or whatever still, but I could see, say, a Matrix client potentially needing only a smidge of JS (largely for WebSockets and E2EE stuff) over top of very-modern HTML)


In general the issue with these built in components is that you can't theme them. And they stick out like a sore thumb when you get a windows 7 style component in the middle of a modern looking app.

They also have basically no extensibility so when you inevitably need to do something half complex, you have to scrap it and start again with JS. So you may as well have just started with JS which just works, gives you full power, works identical on all systems, etc.

In the end all these extra components just end up as bloat all browsers have to implement but no one uses.


Unfortunately this is a case where we'll have to agree to disagree. Half the time with Electron apps I wish I could disable CSS completely and just use my system theme because it sucks less than whatever the designer came up with for that app (the definition of "sucks less" falls into many axes that vary per application and context; no point in digressing far into that), so what you described would be a feature, not a bug, to me.

(Further, I basically never, ever want an app to theme itself. Ever. If I set a system theme it's because I want the system to look like that. I've gone on tirades here and on other forums for years about finding https://stopthemingmy.app/ and even just the freedom of every electron app to pick its own HIG and UX as absolute heresy, so if nothing else, my opinions are consistent here.)


I couldn't agree more; unfortunately we are a minority :(. Before React was mainstream, you could "fix" websites pretty easily with things like Greasemonkey, but even that is super painful now. CSS modules mangle class names, so every new version (which are pushed multiple times per week/day) will break your naive CSS modifications. You can't naively modify DOM elements, because React will overwrite them almost instantly.

I know you can use substring-matching attribute selectors in your CSS and MutationObservers to update the DOM after React and co are done with it, but it's just so much more painful. Something that used to take maybe 1-2 hours to get a site working/looking how you like is now a part-time job.


This is what really bugged me about “portable” Java apps way back. Java Swing was especially noticeable. It stuck out like a sore thumb compared to the regular system UIs.


Something like Magic User Interface for the Amiga, but across operating systems, and which provides the resulting stylesheet to the browser (with no ability for sites to override it unless I allow it), would be my dream and a marketer's nightmare. Just information and media, displayed how each person prefers them to be displayed (and obviously with a lot of user-made themes for people to browse and try out). With sub-configs like super-compact, whimsical, etc. that users can apply to individual apps and sites.


Unfortunately, the marketing and branding departments would all have aneurysms over this, but I do love and share your dream.


SKINZ all over againz!


JS does not 'just work'. This is why a lot of these custom components have bad touch interaction and no accessibility. Take the datepicker; the native mobile version works great, why annoy users with a custom component?


> why annoy users with a custom component

Because a system you're developing may have specialized modes. There's no "today", "yesterday", "last week" or "q3" or other suitable shortcuts in standard date/period pickers. Another method is to use a text field which parses itself into a date or a period. E.g. "2-5" means (and/or expands into) 2023-08-02..2023-08-05. "May" means 2023-05-{01..31}. And so on.

My users always appreciated these buttons and modes because they were working in accounting and picking dates from that stupid standard picker was an ordeal. “ / / “ pattern is also annoying because you have to be precise with your cursor.

That said, ios picker is great, and it’s unnecessary to replace it. But (1) it’s not the only useful mode of operation, (2) it wasn’t always great on all platforms, and (3) html attributes usually suck at describing what patterns and use cases you want and compatibility among browsers is a minefield. I mean not only dates here, also numbers and ~numeric fields.
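A minimal sketch of the shorthand expansion described above, assuming the caller supplies the current year and month. The grammar here is invented for illustration, not the commenter's actual implementation:

```javascript
// Expand shorthand like "2-5" (a day range in the given month) or a month
// name into an ISO start/end pair. Pure function: year/month are injected
// so the behavior is deterministic and testable.
function expandShorthand(input, year, month) {
  const pad = (n) => String(n).padStart(2, "0");
  const iso = (y, m, d) => `${y}-${pad(m)}-${pad(d)}`;

  // "2-5" -> days 2 through 5 of the given month
  const range = input.match(/^(\d{1,2})-(\d{1,2})$/);
  if (range) {
    return { start: iso(year, month, +range[1]), end: iso(year, month, +range[2]) };
  }

  // "May" -> the whole month in the given year
  const months = ["january", "february", "march", "april", "may", "june",
                  "july", "august", "september", "october", "november", "december"];
  const m = months.indexOf(input.toLowerCase()) + 1;
  if (m > 0) {
    const lastDay = new Date(year, m, 0).getDate(); // day 0 of the next month
    return { start: iso(year, m, 1), end: iso(year, m, lastDay) };
  }

  return null; // unrecognized; fall back to a normal picker
}
```

So `expandShorthand("2-5", 2023, 8)` returns `{ start: "2023-08-02", end: "2023-08-05" }`, matching the example in the comment.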


That's totally on point, but I think the core issue is less about "why the native date picker isn't always appropriate" and more "why do we keep half-assing non-native alternatives?"

The way I see it, so much of the web is a clunky mess precisely because software development today pretends to be engineering while simultaneously being about the bottom line and little else. No doubt, a great date picker could be developed in JavaScript that would serve everyone's needs, be totally accessible, and not be a bowl of <div> soup. So why don't we do it? Why are what should be basic HTML forms on corporate websites difficult to navigate, or in some cases fundamentally broken and requiring workarounds? Nobody is interested, because solving real problems doesn't carry any of the prestige of building another framework. Who wants to build a date picker that is standards compliant when you could write another web framework, bro? Even if a developer is not trying to build the next React, they're probably spending more time on their toolchain than actually coding. It's gotten so bad that seemingly every company I've joined in the last 6 years needs a bunch of people dedicated to maintaining toolchain and CI crap for the rest of the team.

I love programming, but the web needs to get its head out of its own ass. We're acting like our jobs are more important than the value the software delivers, and more effort is being put into making sites impractical for machines to parse (because muh intellectual properteh!) than in making web components that aren't riddled with bugs.


There’s a little more harshness than needed in your comment, but I generally agree with it. Having brought this up before, I’ve usually seen either no or strange reactions to it. It feels like web dev consists of people who only have done their job for an unknown faceless client sitting behind layers of teams and toolchains. Driving to a specific person, listening to their brutal feedback on your system and being asked to maybe fix it right now would be a sobering experience to some of them.


You're absolutely right on the 'why' part, but sadly a lot of custom implementations are annoying in terms of UX and accessibility. My only point here is that building proper custom components is far from easy, it takes a lot of time and effort.


>Take the [HTML] datepicker

* Lets you enter nonexistent dates like 31/2

* Can and often does accidentally place the user in American-style MM/DD format where it should be European-style DD/MM (I have a replicable case now on that page example).

* No ability to force date style by design. So there's no way to fix the above from the server, or to use ISO-style dates. Only way to reliably prevent MM/DD by default is to fix every client configuration - not very likely even in small companies.

* No way to have the datetime dialog open by default.

* Poorish but getting better keyboard support (the pagedown-up keys finally work in most browsers, but once you've opened the dialog you can't enter a new date with the keyboard).

* Timezones must be handled separately, which is just poor design.

(Entire list checked on desktop)


> ...place the user in American-style MM/DD...

> No ability to force date style by design.

There are datetime-local, date, and time input types. And there's a lot of control over what is allowed, with min/max ranges, steps, etc.

The only thing I can imagine to go wrong here, is when a user has their browser set in US-en but when they are not aware of that. Which seems... weird; or at least not a problem a web-dev should solve.

> Lets you enter nonexistent dates like 31/2

This may be an issue in specific browsers/versions/OSes. But enabling the "validation" by setting required and/or some other attributes gives an error for these dates AFAICS. But, in any and all cases: server-side validation is needed anyway. You just cannot trust a value sent by a user, whether that's "validated" with sixty npm packages and their dependencies, or by the browser.
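For reference, the constraint attributes being discussed look like this; the field name and range values are invented:

```html
<!-- Browser-side constraints; server-side validation is still required -->
<label>
  Delivery date
  <input type="date" name="delivery" required
         min="2023-08-01" max="2023-12-31">
</label>
```

With these set, a conforming browser refuses to submit an empty or out-of-range value and reports an error itself, with no JS involved.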


>The only thing I can imagine to go wrong here, is when a user has their browser set in US-en but when they are not aware of that. Which seems... weird;

Legacy Edge used to look at keyboard locale and ignore the actual region settings. I have no idea why chromium uses MM/DD on my machine when Firefox does DD/MM. Back when $COMPANY used HTML date widgets we got a small but constant stream of complaints which we tried to triangulate (that's how I know about the Legacy Edge behaviour), but we never understood most cases.

Autodetection has been broken on a tail edge of cases for a long long time, and nobody in browser space seems to have any interest in fixing this - or worse, allowing the server to set the correct date style. The only practical fix is JS datetime widgets.

>or at least not a problem a web-dev should solve.

I think 'a not-insignificant number of people constantly enter the wrong dates and eventually bother support and write bad reviews, plz fix this' is a good business case and is something web-dev should try to handle.

>> Lets you enter nonexistent dates like 31/2

>server-side validation is needed anyway.

True, but it's a better user experience to disallow this also on client. If we only let the server do validation, why do we even bother with the SPA and the sixty-thousand one-line npm packages?


I normally use en_US, but I want dates formatted as DD-MM-YYYY (or using dots, slashes, etc.) and I want a 24-hour clock.

LC_TIME does not work very well with most apps.

And there is a big difference between just throwing an error if a date-time cannot be parsed because of a nonextent date, and communicating it to the user in a nice way, especially without using JS.


But the webdev has to solve this problem. Users with wrong locales who aren't aware of it are not uncommon. I would also love the US to fix their stupid date format and even fully adopt the metric standard, but sometimes you have to compromise and write code instead.


WRT dates, there's no "metric" standard. Not really. E.g. Belgium commonly uses DD/MM/YYYY whereas the Netherlands uses DD-MM-YYYY. Both use "metric standard" for lengths, weights etc. Same with currencies: "13,37 €" vs "€ 13,37" vs "€13,37", all depending on where in Belgium you are from, vs Dutch in the Netherlands. It's an utter mess.

Which is another reason to let browsers - the user agents - deal with this. There's absolutely no way for a lonely JS dev, or even a community around something like MUI, to get all this right. And they don't. There's always something broken for me with these custom elements. If it's not some US-centric web app enforcing their MM/DD/YYYY format, then it's some "ignorant" dev being unaware that in many European countries the decimal separator is a comma, or that in Thailand the current year is 2566 and that this is not "too far in the future".
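The user agent already has all of this locale knowledge, which is the argument for letting it render the widget. A quick sketch using the standard Intl API (exact output depends on the ICU data the runtime ships with):

```javascript
// Same date, three locales: the browser already knows these rules,
// so a lone dev doesn't have to reimplement them.
const d = new Date(2023, 7, 2); // 2 August 2023 (months are 0-based)

const us = new Intl.DateTimeFormat("en-US").format(d); // "8/2/2023"
const nl = new Intl.DateTimeFormat("nl-NL").format(d); // e.g. "2-8-2023"
const be = new Intl.DateTimeFormat("fr-BE").format(d); // e.g. "02/08/2023"

console.log(us, "|", nl, "|", be);
```

Even neighboring locales disagree on separators and zero-padding, which is exactly the mess described above.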


My bad, seems like the metric system is an old thing:

https://en.m.wikipedia.org/wiki/Metric_system


> the native mobile version [of datepicker] works great

Strong disagree. It does work for simple forms, but definitely has a variety of quirks on different browsers/devices. Blank dates are especially quirky. Try “tabbing” through a date on iPhone or iPad and have some poor UI. datepicker really doesn’t work well for some less common situations (cut n paste, copy, from/to date, restricted date min/max past/future, year pick, month pick, etcetera).


This is my biggest criticism of all these modern HTML pseudo components. It’s a wonderful idea, truly, but if you don’t provide style hooks to customize and theme them they are useless.

A month ago I wanted to use input + datalist to have a searchable drop-down, but there was no way to control where the dropdown would appear when popped open or what width it would have. Eventually I just gave up. Such a shame.
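For anyone who hasn't tried it, this is the combination in question; the option values are invented. The complaint is that the popup it produces can't be positioned or sized from CSS:

```html
<input list="cities" name="city" placeholder="Search a city...">
<datalist id="cities">
  <option value="Atlanta">
  <option value="Athens">
  <option value="Austin">
</datalist>
```

The browser provides the filtering and the dropdown for free, but the rendering of that dropdown is entirely up to the browser.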


100% agree. One big challenge is that we've made browsers and the full list of web specs so complicated that we're likely not going to see any new browser rendering engines competing any time soon.

Ideally we would all take a bit of a step back, throw out old specs that aren't needed, and improve the ones that need it, like styling support for built-ins. Unfortunately we're at the whims of Google and Apple, though, and I can't imagine they would ever be interested in such a potentially large rewrite of their browsers when they function as-is.


<progress>, <dialog>, <details> etc can be themed


<progress> required some vendor-specific prefixes last time I tried theming it using CSS (unless you're using "theme" to mean a host system/window manager/browser-wide theme). There is no common subset (that I am aware of) of CSS properties shared among browsers that can be leveraged to even decently change the <progress> element's appearance. So I'm not sure that it is the best example.

I agree that many of the list _are_ themable enough to warrant investing the effort to wrangle their particular interfaces over reinventing them entirely with <div>s.


Yes, it's still a pain compared to more established elements https://css-tricks.com/html5-progress-element/


I think the idea is rather, that you "extend" by composing primitive elements, and not that you change the primitive elements. Kind of like "composition over inheritance".


Take the HTML <select> component: you can't extend or compose it in any way. A request so common that it goes without saying is being able to search for items. This is impossible to implement with the default component.

If you want to do another common thing, like selecting multiple items with a usable UI (the multiple attribute technically exists, but its stock UX is widely disliked), you're also stuck. It's not worth starting with the HTML one and then extending later, because there is no path to do so. You have to totally scrap the HTML version, so you may as well start with a JS library that does everything you need today and everything you will need tomorrow. Which you can theme to fit in with the rest of the app, rather than it looking like a pimple on a pumpkin that UX and end users spot and complain about instantly.


I think you misunderstand what I mean by "compose".

You can compose most HTML elements including <select> easily:

    <label>
      What do you like?
      <select name="choice">
        <option value="first">First Value</option>
        <option value="second" selected>Second Value</option>
        <option value="third">Third Value</option>
      </select>
    </label>
There you go, you did composition. The logic between those elements, how doing something with one element affects the other element, that is a different matter.

For some elements it might be invalid HTML if one is inside the other. Like a <div> inside a <span> or so.


>And they stick out like a sore thumb when you get a windows 7 style component in the middle of a modern looking app.

Wouldn't that be a feature?

"Modern looking" as far as I'm concerned means "Can't figure out WTF this bloody thing is." and that assumes I even know there is a thing in the first place.


I noticed this helping an elderly neighbor with her banking. "No, you don't click there, but there... you can tell because..." then realized there's literally no way to tell. It's all flat.


> then realized there's literally no way to tell. It's all flat.

I hate that. I'm waiting for that fad to be over. I kind of liked material design, but it's too much of a pain to put into everything. Flat, borderless, and unidentified is so easy to do.

The all flat approach encourages dark patterns. Lists of trackers you can opt out of, scrollable, with no scroll bar and no window border. There are important buttons hidden which, if pressed, do things favorable to the user but unfavorable to the site operator.

Also, the pop-up box with no visible dismiss button, just an "x" which appears if you mouse over it.

A non-web example - someone made the console window in Ubuntu borderless. If you have two console windows overlapping, you can't see the boundary.


There is something nice about the plain old <button> default look and feel. It's probably the kind of feedback it gives: I don't have to roll my eyeballs elsewhere for confirmation that my click actually caused some action to start.


Right.

The flat look is borrowed from phones, from UIs where buttons were few and large, and the concept of "mouse-over" is not meaningful. Now it's everywhere, even for complex interfaces where it's not appropriate.


The process pretty much started with Windows 8, which was released in 2012. I don't think the flat style will go away anytime soon. At most, skeuomorphic elements will be slowly phased in. Material design at least adds shadows, and "neumorphism" adds back some 3d popping out, although I haven't seen it much.


Windows 95 was probably the optimal UI—in the Windows world, at least. Meaningful buttons, a reasonably-contained set of components, proper scrollbars. I'm not sure what the next major revision was after that (98 didn't change that much, IIRC) but I bet it was probably a regression.


> There are important buttons hidden which, if pressed, do things favorable to the user but unfavorable to the site operator.

Could you elaborate a bit? How is the flat look benefitting users over site owners? (Is this regarding lists of trackers?)


I think the idea was that when dealing with UI elements where it can be unclear what even is a UI element, it's easier to shape user behavior. e.g. making it harder for the user to "opt out of all tracking".

Those cookie consent boxes are definitely full of dark patterns. My "favorite" one was one that would take 45+ seconds to save your changes. I sent a complaint to the company that makes the consent box, and they responded "website owner configured it incorrectly, nothing we can do" LOL


I just ran into "datalist" and my first impression was "wow, game changer". The behavior is the same across browsers but the appearance is strictly browser-specific. You can't style it with CSS.

Sometimes the list displays just the text of the data; sometimes, the text and the "value" attribute. So you are not selecting "Atlanta" - you are selecting "234290780 Atlanta" (the ID and the value).

And with the on-click action, you can't just get the ID - you have to get the whole thing and parse the ID out.

It just seems... abandoned.
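The "parse the ID out" workaround ends up looking something like this; the helper name is mine, and the "id then label" value convention is the one from the comment:

```javascript
// Split a datalist value of the form "234290780 Atlanta" back into its parts.
// Fragile by design: this is exactly the kind of workaround being complained
// about, since the browser gives you the concatenated string, not the parts.
function splitDatalistValue(value) {
  const m = value.match(/^(\d+)\s+(.*)$/);
  return m ? { id: m[1], label: m[2] } : null;
}

console.log(splitDatalistValue("234290780 Atlanta")); // { id: "234290780", label: "Atlanta" }
```

Returning null for unmatched input lets the caller distinguish a free-typed label from a picked suggestion.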


That's the problem with most of these things, half assed implementations that just tick the box of "compliant with a standard".

There's a reason why we have all these frameworks built on top of HTML - it's because the browser manufacturers have not done their jobs.


That's an incentive-alignment problem: browser vendors' job is ultimately to make money, not to improve the specs.

Anyone who's interested can get involved in the specs process, though. If anything, it's we web developers who haven't done our job there; it feels to me that good specs are more our concern and responsibility than anyone else's.


This is also why I think frameworks like Flutter that render everything to canvas are going to become more and more popular over time - as more and more WASM standards are implemented in browsers.

HTML will be used only for docs and not for web-apps.


> You can't style it with CSS.

Unfortunately, I suspect that this is 100% intentional. datalists can draw outside of the browser window, which is fantastic, but also probably means that there are security considerations for letting it be styled by users. Imagine malicious ads/websites being able to draw outside the browser window.


You might like this: http://youmightnotneedjs.com/


Hahaha wow I have to say that was the least convincing demo I’ve seen.

- Fills history with massive amounts of entries, and back button doesn’t do anything

- Slider UI looks like crap (ok fine, can be fixed), but uses not just CSS but SCSS (requires a precompiler), and wait, that's not enough, it also needs a hardcoded number of images. It's not reusable in the most basic sense.

- Input validation has a phone number in xxxx-xxx-… format and doesn't fill in dashes automatically. It's also type=number, which opens a numpad on iOS that doesn't have dashes available at all. I can't proceed unless I copy-paste a dash from somewhere else?

- Gave up after that but I’m sure there’s plenty more

Yeah, I'm leaning towards the view that JS isn't the nemesis of accessibility. It's simply not knowing what works and not testing properly. It's funny that frontend folks are seen as lesser beings and then counter-claims like this are passed off as enlightened. On first glance maybe, but it's proof that these regurgitated claims are made with very little insight. Like all tech, you have to know how to hold it right, which takes a little time and humility to get right.

And yes, we still use too much JS. But it’s not the fault of JS or dev practices that we have newsletter popovers, cookie banners and 100 ad delivery and click tracking requests per page load. Indeed JS became extremely bloated for a while but nowadays everything is equally bloated, just look at all the backend snake oil with 1000 cloud microservices and leftpad-like APIs.


I disagree. I've built entire sites that rival the most popular SPAs today with HTML / CSS and a tiny bit of JS, and I've had fewer issues with those than with other sites at my agency built with a JS framework. User reports often come in telling me "Wow, this site is so good! I'm really glad you took on this project".


There's a scroll indicator!

It tells you, by looking at a thin bar, how far down the page you are! What a novel idea!

I wish browsers had this builtin so that we didn't need to implement a bar for showing the user how much of the document is left to scroll.

(Seriously though, why tf did Firefox make the scrollbar autohide? In order to see it, the user has to interact with the page. It's worse in the debugger, where the horizontal scrollbar just won't show until you interact with the keyboard in some way.)


Because firefox devs are... weird. I don't even mean that detrimentally, just observationally.

You can un-hide that bar permanently, but when you do, it always covers the edge of the webpage.

    # scrollbar fixes
    user_pref("widget.non-native-theme.scrollbar.style", 1);
    user_pref("layout.testing.overlay-scrollbars.always-visible", true);

I literally cannot see the benefit in hiding the scrollbar. It sounds like an edge case made primary.


FF devs have to justify their jobs like any other. Someone was promoted for shipping a fancy-looking feature that some other browser, somewhere, probably has as default.

Still better than the Borg, at least you can fix it.


I guess you never used MacOS, where the default system scroll bar behavior is even more interesting.


Yeah, seriously bugs me too. Scrollbars were one of the most powerful, useful UI components out there, and we had to sacrifice them because of mobile, for some reason.


Beware: some of these are effectively hacks possibly messing up the browser history or introducing accessibility issues for screen readers, keyboard users, or otherwise.


Exactly.

Level 0 developer: doesn't know how to do this in JS
Level 1 developer: knows how to do things in JS, even when it sometimes shouldn't be used
Level 2 developer: knows how to do things without JS
Level 3 developer: knows when to use JS and when to use CSS or something else to achieve the goal


I was more the opposite. I avoided learning JS for as long as possible, since it seemed so complicated in the resources I tried to learn from.

Ideally we’d have just a few more semantic elements that are obvious common patterns introduced to the HTML spec (like details & dialog were). We’re pretty close now, but I would like to see better accessible no-JS options for building menus.


Seems like a lot of people misunderstand what these examples are best for. Sure, you wouldn't want to ship these as-is on most sites, but it shows how much can be done without JS. That can/should make it clearer that you can likely get by with much less JS when you do need to reach for it.

It's not about throwing out JS, it's about avoiding 30kb of JS if all you need is a few summary/details elements where only one can be opened at a time. Use the example code here, then write a small one-line script that closes all siblings. Done.
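A sketch of that sibling-closing script - the `toggle` listener is standard, but the selector in the usage note is an assumption to adapt to your markup:

```javascript
// Close every other <details> when one opens, turning a group of
// summary/details elements into an accordion.
function makeAccordion(detailsList) {
  for (const details of detailsList) {
    details.addEventListener("toggle", () => {
      if (!details.open) return; // only react to opening, not closing
      for (const other of detailsList) {
        if (other !== details) other.open = false;
      }
    });
  }
}

// Usage (hypothetical selector):
// makeAccordion(document.querySelectorAll("details.accordion"));
```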


You might not need JS, but you will need a SCSS transpiler.


A good resource for developers targeting the tor browser set to secure mode.


> I had heard of almost none of these HTML elements

I'm not disagreeing with the gist of your post, but come on, these elements have been around for ages. It's definitely on you to become acquainted with the basics before your HTML critique can be taken seriously ;)

The post links to MDN (arguably the most useful short reference) but there is of course also WHATWG's HTML spec or, if that's too voluminous, SGML DTD formal grammars for WHATWG HTML 2021 and 2023 snapshots [1], as well as for the older HTML 5.x series.

[1]: https://sgmljs.net/docs/html200129.html


Most of them don't work properly and/or look terrible by default in all browsers.

So no-one uses them, so lots of people don't know about them.


What do you mean with "they don't work properly"? Could you give an example?


When I use the datalist element with a text input, Chrome shows an arrow on the right and the list drops down when you focus the input. Firefox, however, shows neither until you start typing, at which point it suggests just those items that match your input. So there's no way to see the full list of options.

I think Chrome's behavior is correct here, but the larger point is that precisely because these are native elements, when they don't work there's nothing you can do. So your only option is to reimplement them from scratch.


Datepicker was fundamentally broken in many ways in safari for a long time. And it still doesn't have the functionality that most apps need, so it's pretty pointless.


<abbr> has always been interesting to me. I've been aware of it for as long as I can remember, since it's been in all the tutorials I remember reading as a teen, and other docs as I got older. Unlike some other forgotten elements, it's one that's clearly very useful, and yet I can count on one hand the number of times I've seen it used in the real world (yet plenty of times where someone has reinvented it with JS)

Likewise, image maps. Remember messing about with those when I was young, but Wikipedia is the only place I've seen them used. To be fair, the UX isn't great, and I've often ended up navigating to an article when I was expecting to view the image page instead


Everybody was using imagemaps in the 90s because it was the only way to have multiple links over an image, something we wanted to do because without CSS we could not have a row or a column of buttons with fancy colors and fonts and placed where we wanted them to be. So nav bars were a large image with imagemap anchors placed over the buttons. Then we got CSS, tables (used for layout!), divs with positioning and eventually the features that web developers are using today.



I've used qutebrowser off and on for many, many years. At the end of the day, it's a skin over QtWebEngine, which uses Blink under the hood, and thus contributes to Google's overdominance of the web and the standards that define it, so I try to avoid it, despite it being a better implementation of a Vim layer than, say, Tridactyl for Firefox is (in my opinion).

Beyond that, QtWebEngine is about the polar opposite of the type of engine I described in one key area: resource usage.


Extensibility is the problem here. Either you force everyone to use a limited set of UI controls (won't happen) or you need to allow some way to create custom UI controls, which leads to JS (or some other programmable system).


> I had heard of almost none of these HTML elements, and that's such a shame

I guess that's on you – if you're a web developer, you definitely 100% need to know these elements. They're not new.


I really want to believe in the semantic web, I really want to believe in the ability of the browser to provide me with good default modules and good default styling, but for now I just have to accept this is not the case. The fact that I have to think about labeling an input (why is this not an attribute?), not being sure if I should use the label as a wrapper or as a sibling with the `for=` attribute... and this is just the tip of the iceberg. For each tag, I have to learn the whole history of its development and make an inquiry into what's the right way to do it nowadays.

We could have had nice things.

Ssssshhh, calm down, let go.


(I don't know what tools you use so this isn't a comment directed at you specifically)

If web developers spent a fraction of the time required to learn react, tailwind, etc on learning HTML the web would be in a much better spot.

There are definitely quirks and rough edges, but if every web dev knew how to get the most out of semantic HTML we'd likely have a lot less JS in the browser, fewer accessibility bugs, and more eyes on when specs could use an update to fix or replace some of these quirks.


> on learning HTML

Anecdote: I was recently freelancing at a web agency. They build complex web apps. Lots of senior and experienced web devs there: react, mui, typescript, tailwind, and a large host of backend frameworks under the belt.

But when I built a quick PoC using `<template>` a few lines of JS and some of the elements used in the article (meter, dialog, details) they were flabbergasted. This was a whole team of experienced developers doing web-ui development for their job, 40+ hours a week. And they didn't know, not even realized, that HTML had them covered for loads of use-cases.
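For anyone curious, such a `<template>` PoC really is only a handful of lines - a rough sketch (the IDs, classes and data here are made up, not from the actual PoC):

```html
<template id="row-tmpl">
  <li><span class="name"></span> <meter min="0" max="100"></meter></li>
</template>
<ul id="list"></ul>
<script>
  // Clone the template once per record and fill in the blanks - no framework.
  const tmpl = document.getElementById("row-tmpl");
  const list = document.getElementById("list");
  for (const item of [{ name: "disk", score: 40 }, { name: "cpu", score: 90 }]) {
    const row = tmpl.content.cloneNode(true);
    row.querySelector(".name").textContent = item.name;
    row.querySelector("meter").value = item.score;
    list.appendChild(row);
  }
</script>
```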

Edit: I am no frontend-dev, so I have to look up everything anyway. Which is probably why I come across those "new" things easier.


Broadly, broadly broadly I agree with you and I’m endlessly frustrated with the state of React-based frontend dev.

But that said you really do get a lot more out of the box with these frameworks. Tailwind makes consistent styling far far easier. MUI helps with the same and also (often overlooked) has a lot of accessibility features built in.


I don't think people realize that CSS3 and HTML5 can do everything they need. And also that no one actually enjoys using an SPA.


CSS3 and HTML5 are not meaningful semantic version numbers anymore and haven’t been for over a decade.

Both CSS and HTML are considered “living standards” now and no longer use version numbers: https://html.spec.whatwg.org/multipage/introduction.html#is-...?

Nesting in CSS became broadly supported in Chromium and other evergreen browsers a few months ago. This is a feature that developers have had to use inside of some kind of tool that compiles down to “regular” CSS since 2006. That’s 17 years. 17 YEARS.

And there was no major announcement when it became supported, just a blog post: https://developer.chrome.com/articles/css-nesting/

Even now, it's not supported by older versions of iOS/mobile Safari, which could easily be 15-20% of a large US-based website's traffic.


> Nesting in CSS became broadly supported in Chromium and other evergreen browsers a few months ago.

Firefox hasn’t shipped it yet. https://caniuse.com/css-nesting shows it landing in 117 next month.

> Even now, it’s not supported by older versions of iOS/mobile Safari which could easily be 15-20% of a large US based websites’ traffic.

Yeah, actual global support is probably still below ⅔—caniuse.com is showing global support at 72.89%, and its methodology is hopelessly broken for mobile browsers (treats all mobile Chrome/Android WebView as the latest version, which is wildly wrong), quite apart from excluding browsers that block the StatCounter script, leading to particularly heavy undercounting of Firefox and general undercounting of more conservative or unusual configurations; so the true numbers on newish features are always much worse than it suggests.

For these sorts of features, if all browsers ship around the same time, you’ll normally want to wait for about another two years before you start depending on it. (When shipped out of sync, it depends—you’ll encounter two-year-old Safari more commonly than six-month-old desktop Chrome, for example.)


Thanks, that analysis was worth adding. You're saying 2 years; I've heard other people say they would maybe consider relying on this in 5. It really depends on the software you work on.

So an estimate of around 20 years from the time people started using things like variables and nesting to improve writing CSS to being able to actually write real CSS using those features and actually being able to count on broad browser support.

Then I think about things like JavaScript modules, and how completely fucked up that entire ecosystem is for so many reasons.

And then condescending comments in the parent like "I think they just don't understand what can be done with just HTML/CSS these days, they're so used to complexity" - people who have clearly never worked in front-end development as their day job. It's insulting and reductive.

I am glad that this stuff made it into the spec, and I know that over the next 5 years, things like native CSS variables and nesting will become more common until Sass/PostCSS is scarce, but they will probably still be being used inside of abstractions like SPA frameworks. And all of the JavaScript will unfortunately probably be written in TypeScript.

To me, the holy grail end game of front-end development is the elimination of a build step. And for that to happen before I hit retirement we have to find some way of decreasing the lead time of innovation -> spec implementation -> browser support from 20 years to < 5 years. Obviously, browsers becoming evergreen and Internet Explorer finally dying will help. Already, simpler tooling is becoming more prominent than the dark days of Webpack. But there's still a long, long way to go. This is all assuming Google doesn't finally figure out a way to destroy the open web all together before this happens.


i've played with nesting on firefox and, much like the :has() selector, it seems good enough for the 80% case, so i wish they'd unflag it so we could get the clock started on it being commonplace enough to use confidently in a year or two.


That would require accepting that the complexity merchants sold them a lie.


> complexity merchants

I have never heard of this term before. But I do think it is quite apt. Is this an established concept (can I read more about it somewhere)? Or did you come up with it on the fly?


Someone else coined the term and there was a HN post. I cannot find it though.


Part of the slow adoption of ultra modern HTML tags might be that the divorce from Internet Explorer 6 to 10 finished so recently that no one has bothered to update their knowledge yet. In my corporate environment, IE 11 was removed only last year.


But with just a little JS you can have a full SPA. You don't need a virtual dom to manage some HTML tags. You don't need UI libraries to render a UI. You don't need complex state management hook flow whatevs to manage some state. You don't need routing frameworks to read and write the URL. And so on.

With just HTML and CSS you can get a very long way. And if that last mile is truly a requirement, DOM APIs and a few lines of JS (or TS) have it covered. Only when all that grows wieldy do you need npm, frameworks and complex trees of dependencies.


But it does seem that no one enjoys actually building MPAs, and that force is currently winning.


Using your favorite language, a simple HTTP listener + routing, and whatever HTML templating engines are supported, MPAs are simple and enjoyable to build.

I think product people enjoyed pretending they were facebook for a while and decided the world needed more infinitely scrolling SPAs and forced a lot of people into having to use react and other frontend heavy frameworks to try to wrangle all the (often times brittle) javascript involved to juggle client state. I don't think we're better off as users or developers because of this.


It's rather easy to add infinite scrolling to an MPA too, though. A few lines of JS is all you need.

I'd argue this is simpler and easier than investing in a full blown ui library and state manager or painting yourself into some corner of today's JS framework.
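A sketch of roughly what those few lines could look like, using an IntersectionObserver on a sentinel element to append the next page of server-rendered HTML (the endpoint and markup are hypothetical):

```html
<div id="sentinel"></div>
<script>
  // When the sentinel nears the viewport, fetch the next server-rendered
  // page fragment and insert it above the sentinel.
  const sentinel = document.getElementById("sentinel");
  let page = 1;
  new IntersectionObserver(async ([entry]) => {
    if (!entry.isIntersecting) return;
    const html = await (await fetch(`/items?page=${++page}`)).text();
    sentinel.insertAdjacentHTML("beforebegin", html);
  }).observe(sentinel);
</script>
```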


"Covered" is stretching it. Most raw HTML elements look terrible and wouldn't pass muster with pretty much anyone.

IMHO, this is a big miss with browsers. Sites look awful without styling and you have to be pretty good with CSS to even make them look passable. Way easier to reach for a framework with prebuilt components


This anecdote was a "pixel perfect" HTML version of some figma design. I did some CSS tricks to style the `<details>` and was lucky the designer was lazy and never specified the styles of all the dialogs around date/time pickers (they weren't that important anyway).

You can style a lot of these native elements. And where you cannot, I'd argue that's actually good. I've worked with designers who insisted that everything looked and felt the way they had designed it. But I've also worked with designers who, when shown how the date picker looked on iOS, macOS and even GNOME, were incredibly happy that finally there was a design that just followed what the users were used to.

Point being: it will vary. But I'm certain we need all these JS UI-frameworks like MUI far less than we use them. I'm certain plain-old HTML, CSS and a little JS suffices far more often than it's currently used.


>You can style a lot of these native elements. And where you cannot, I'd argue that's actually good. I've worked with designers who insisted that everything looked and felt the way they had designed it. But I've also worked with designers who, when shown how the date picker looked on iOS, macOS and even GNOME, were incredibly happy that finally there was a design that just followed what the users were used to.

The native browser date picker is very limited. You can't do basic things like disable weekends or select a range, making it unsuitable for a wide swath of use cases.

>Point being: it will vary. But I'm certain we need all these JS UI-frameworks like MUI far less than we use them. I'm certain plain-old HTML, CSS and a little JS suffices far more often than it's currently used.

These UI frameworks are just plain old HTML, CSS, and a little JS. All conveniently built for you, to easily build a site that looks pretty good and covers most UX needs.

>This anecdote was a "pixel perfect" HTML version of some figma design. I did some CSS tricks to style the `<details>` and was lucky the designer was lazy and never specified the styles of all the dialogs around date/time pickers (they weren't that important anyway).

If you had used MUI you wouldn't have had to do CSS tricks and if it's using a small fraction of MUI then treeshaking will result in a negligible amount of JS/CSS sent over the wire. So no real performance gain, harder for the next engineer, and no great path forward if the UI needs to be snazzier. It's just worse all around than picking one of the well known frameworks.


> The native browser date picker is very limited. You can't do basic things like disable weekends or select a range making it unsuitable for a wide swath of usecases.

I think you'd have to reevaluate your users and your use case then. As someone who, like berkes, builds sites almost entirely with HTML / CSS, I find it's often the case that the developer is RIGHT about what the user needs. After speaking to many clients about the limitations of native HTML elements, I've successfully convinced users to change negative browsing patterns.


Why should I write my own HTML and CSS over using what Bootstrap gives me? I can't think of a single reason to believe that my HTML/CSS will look or perform better than Bootstrap's.

Especially since you can selectively import only the components you use. You're essentially betting that you can implement, like for like, what Bootstrap gives you better than they can. Which just seems like a terrible bet regardless of how good that particular developer may be.


Look terrible? Didn't we have this debate a couple of decades ago about separation of semantics and presentation? Sorry, I should have let it slide.


I think what that person meant was if browser default styles made semantic HTML look more beautiful, it would probably reduce the incentive for lazy devs to make div soup.

Like imagine if every browser preloaded a dozen attractive classless CSS frameworks for users and/or devs to choose from sort of like CSS Zen Garden.

If all browsers had that, I think we'd get less div soup.


Three things I think:

(1) Unstyled HTML looks terrible.

(2) There are relatively few native components, and the ones that exist are limited. Not even at the level of what jQuery UI gave you 15 years ago. No cards, accordions, avatars, and other sorts of basic building blocks.

(3) No real support for common page layouts. Like a Dashboard or Hero marketing page sorts of things.


That was indeed my point:

Good defaults are the best but when you have bad defaults you might as well go for full-fledged well thought-out third party tools


This is one of the things I like best about Remix (https://remix.run) -- it leverages React in a way that "grounds" it in web fundamentals instead of piling on further layers of abstraction. Remix is refreshingly simple, and its docs are chock full of references to MDN as the more authoritative source for web platform APIs.


>senior engineers flabbergasted by basic HTML & JS

These are the people rejecting your job application.


I mean, doesn't it always boil down to supporting all "old/different browsers", hence why all web agencies have to use all those JS libraries that seamlessly abstract it all?


Sometimes.

But do you really need to support a browser that less than half a percent uses? And isn't the fallback that HTML offers out of the box good enough then?

Sure, there are those rare cases where you build an appointment system for a hospital and where laws (rightfully!) dictate that the date picker must be both accessible and working on anything from IE6 to the browser on the first Wii.

But your average 99.99% of the sites?


I think if the web apps we were working on were marketing landing pages, you'd be right.

For any real application that software engineers are hired to work on, "learning HTML" would do very little. Most high-level front-end engineers do know HTML. There's not that much to know.

The web today is basically a universal desktop client. Apps like Figma, Slack, Airtable, and thousands of others are not really websites, they're hosted applications that have a mind boggling amount of interactivity.


Attitudes like that are why so many "real applications" have things like progress bars assembled from divs and JS instead of using a progress element or show/hide toggles assembled from divs and JS instead of using details/summary elements. Turns out there are so many HTML elements for a reason.


Attitudes like what?

How do you build Figma using "semantic HTML"?

Everyone likes the idea of keeping things simple and using web-native constructs. The problem is that web-native constructs can't do the things people need them to do.

Even the progress bar example is a good one. Yes, there is a minimal progress indicator element shipped in the browser. It is completely useless for all but rudimentary cases.


I am using svelte as it seems the best of both worlds: I can pretend I am just writing html and js, with related pieces colocated and a module system that keeps it readable, while also harnessing my html and js/ts knowledge.

And since js has to be minified there's a "compilation" step anyways.

Didn't find a better solution so far. The web is almost nice to write code for.


> I have to think about labeling a input (why is this not an attribute ?)

If input labels were HTML attributes, forms would be much less versatile. It would merge 2 visual elements into one, which would make it hard to adjust the display (think of inputs with right-aligned labels to their left, or checkboxes with labels to their right). And a "label" attribute would mean plain-text content only... Seems awfully restrictive to me.
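For illustration, the two patterns in question - note the second label contains an element, which a plain-text `label` attribute couldn't express:

```html
<!-- Wrapper pattern: no id/for bookkeeping needed. -->
<label>Email <input type="email" name="email"></label>

<!-- Sibling pattern: label placed freely, linked via for/id. -->
<input type="checkbox" id="tos" name="tos">
<label for="tos">I accept the <a href="/terms">terms</a></label>
```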

Granted, HTML has a complex history and several layers, like anything that's been in use for decades. But it hasn't changed much recently, and (thanks, MDN) it's easy to learn enough to identify what is possible with HTML5 and dig deeper later if necessary.


I don't understand the following. Would you be kind enough to elaborate ?

>And a "label" attribute would mean a plain text content.


You can't put html elements in an attribute. It's just plain text.


For what it's worth, I've found that I almost always want to put the label tag after the input tag. That enables me to select and style the label based on the input state via CSS pseudoclasses.
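E.g., a sketch with the label as the input's following sibling:

```css
/* With <input> before <label>, the label can react to the input's state. */
input:focus + label { font-weight: bold; }
input:invalid + label { color: red; }
```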


One more hack created by the lack of the :has selector. And yet browsers won't prioritize it.


safari and chrome have had :has() for a while, but it's behind a flag in firefox. the firefox version is good enough that i wish they'd unflag it already (as in, it's good enough to handle this particular input/label issue for instance, but not really complex selector combinations and edge cases).


> For each tag, I have to learn the whole history of its development and make an inquiry about what's the right way to do it nowadays.

That you have to learn the semantics of the tags is inherent in any semantic tagging system. If you don't like that idea then you fundamentally disagree with the idea of the semantic web.

Of course, with any publishing system -- or any system at all -- you're going to need to learn how to use the primitives it provides to use it effectively, so this isn't really about the semantic web.


I use <details> a lot for debugging Go html templates.

    {{if .DevMode}}
    <details>
    <summary>Data</summary>
    {{.}}
    </details>
    {{end}}

It's nicely unobtrusive when collapsed, doesn't mess up the page completely. Then expands to the full glory when needed.


I abuse the <details> element when I can. It's so neat, I don't know why it's not used more often.


You can use it for all sorts of things on web pages, including collapsible menus and such.

Github markdown lets you use it too. People use it for examples and inline explanations.


Great read and interactive demo.

After a few years of dabbling with Flutter I just came back to the same conclusion: bet on HTML.

Astro / Tailwind / Daisy UI / Alpine.js makes it lovely to build an HTML site with a lot of simple SSR and a little bit of client side reactivity peppered about.

The result is simple, sane HTML files that look and work great in a desktop web browser and in a mobile wrapped web view.

My app is basically static so it caches in a CDN, works offline, and view source makes it easy to debug.


After trying a bunch of fancy frameworks and platforms I ended up doing exactly this - a static site with alpinejs and tailwind. It has been by far the best decision I've ever made. And the best thing is I'm confident any new dev will be able to pick up my 100 line build file and grok it in about 3min. No matter whether they're react, angular, PHP or python folks - it's dead simple and I love it.


Lately I've been doing raw html and css. No build. I just write the files. It's really easy. Yeah there's some duplicate code in the header but grep is not that hard to use.


Now move to the next level and have no build system.


i am building sites with aurelia without a build system. there used to be a quick-start tutorial where you could just download an archive with aurelia ready to run. no serverside tools or build tools necessary at all. it even used to work loading the whole app via file://... back when browsers still trusted that. it no longer works only because of browser restrictions on using file://

the tutorial is gone, but the download still exists referenced here: https://stackoverflow.com/a/39681911

the downside is that i don't know how to update this to newer versions or add extensions.

it might also be possible to create a new version of this using aurelia-cli, but i haven't yet figured out how.


My build system is cat!


Meow!


Exporting to web using Flutter is a nightmare I believe. When I checked last time, rendering happens on a canvas. There is no markup at all


I use Flutter for some web apps, and it works fine. It's good for web apps, not web sites, and compared to other desktop or mobile apps, there is literally no difference between rendering on a canvas and rendering on Qt or SwiftUI. People, in my experience, just get up in arms about the web in a way that does not happen with other technologies.


> compared to other desktop or mobile apps, there is literally no difference between rendering on a canvas and rendering on Qt or SwiftUI.

Except, you know, accessibility.

> just get up in arms about the web in a way that does not happen with other technologies.

And for a good reason.


Do you realize that Flutter has an accessibility system that is often more advanced than most desktop UI frameworks? [0] I notice this often from non-Flutter devs, Flutter has some pretty good a11y.

[0] https://docs.flutter.dev/accessibility-and-localization/acce...


Good to know! For the longest time ever their export to web was just canvas with no attempt at making it accessible.


I don't know if even that was actually true? As far as I know, the moment they introduced canvas as a rendering target they also produced a DOM structure alongside it to capture the accessibility tree, while they waited on a more native solution in the form of AOM (the accessibility object model) to be finalised.


Flutter has a wild accessibility system that also exports DOM elements just for the aria labels.

This all breaks the browser’s in page search function though! Long standing bug with no fix yet.


> rendering happens on a canvas

    flutter run --web-renderer=html


Glad to know that this exists!


Yeah, there are even more fun arguments, like "--enable-impeller" to massively improve the rendering performance!


Can you help me understand the utility of Daisy UI? Seems like it's the good old classes+stylesheet with extra steps (tailwind)


Well, instead of writing this for every button in your web app:

<button class="bg-indigo-600 px-4 py-3 text-center text-sm font-semibold inline-block text-white cursor-pointer uppercase transition duration-200 ease-in-out rounded-md hover:bg-indigo-700 focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-indigo-600 focus-visible:ring-offset-2 active:scale-95"> Button </button>

You can write this:

<button class="btn btn-primary"> Button </button>

The utility is pretty clear to me


so we've just circled back to good ol bootstrap?


That's even valid bootstrap classes


Exactly my point


yup, Astro and "islands architecture" is such a joy to work with.


The problem I see with the semantic web is that no matter how easy it gets, developers refuse to use it properly. I have been looking closely at the <main> tag, since a browser extension of mine uses it, and although it is extremely clear what it should do in the MDN documentation (the documentation itself is a good example usage of <main>, <aside> etc.), very few sites use it properly. Even the fancy professional sites wrap all the page content, including navigation, footers and the like, inside the <main> tag, which should only be for the main content of the page.

If such a simple element can't be used properly, I have no hope for all the others.
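For reference, the structure MDN describes looks roughly like this - only the unique page content goes inside `<main>`:

```html
<body>
  <header>site banner, logo</header>
  <nav>primary navigation</nav>
  <main>
    <h1>Page title</h1>
    <p>Only the content unique to this page.</p>
  </main>
  <aside>related links, sidebar</aside>
  <footer>site-wide footer</footer>
</body>
```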


The same developers who misuse http verbs, no less.

We shouldn't pretend that these things are contracts


> developers refuse to use it properly

literally every developer who has ever used a `<p>` tag for a paragraph has used it properly. A `class="card"`. A `<link>`.

How are so many tech-literate people in this thread bemoaning the "uselessness" or "failure" of the semantic web? It's like if you told me "Man, I wish motor vehicles were successful, it'd sure be nice to travel long distances!"


> simple element can't be used properly, I have no hope for all the others.

The first solution that comes to mind is stricter validation, where the browser would just refuse to render a <footer> properly unless its structure is correct.

But we had that. Anything before HTML4 really. And it sucked even more.

So maybe browser dev-tooling that throws warnings or errors when devs are Doing It Wrong?


Give me an incentive to use it. If it looks exactly the same and behaves exactly the same, "div" is half the characters of "footer" so it wins.

Personally, the argument that search engines will do nice things with semantic HTML didn't convince me back then, and I don't even see it brought up today, because search, like fish, stinks from the head, and we stopped pretending otherwise.

So that leaves accessibility. Is there a way to visualize what screen readers do with a page? I know about the tools that check for missing ARIA roles and whatnot, but that still doesn't catch me using a div when I should have used an aside. And I know I won't try to navigate and use every aspect of things I make by actually using a screen reader. Call me lazy, but that's just not going to happen. But I also don't want to just give up and make it the problem of other people, either. I wanna meet halfway if possible. Any tips or tools or ideas welcome.


Maybe invest in an editor that helps you write `<footer></footer>` after typing `<f` in the correct place, if the amount of characters truly is your reason to avoid semantic markup.


I am not "avoiding" it, I am not going out of my way to implement something that has no effect on anything, other than taking up more characters in files.

If semantic markup is so important to you, give me a reason for why. If the reason is "because I told you so", you just added a reason not to do it.


The browser approach is best effort at rendering, rather than enforcing conventions.

And I doubt that tooling can help with all the complicated ways in which HTML is generated (React, SSR etc). I was actually surprised at how underbaked editor support for CSS is in VSCode (the most widely used frontend editor). If the most modern tooling can't understand that a CSS variable is declared in a different file and autocomplete it for me in a .vue file, I get the impression that tooling is severely lagging behind the frameworks.


I would love to just flip a switch in the browser dev tools and see html validation errors.


Me too. And I'd love to see it go beyond actual errors and suggest improvements like "a footer should typically only be inside a ..." And such.


There is simply no incentive whatsoever for using them correctly. I try to use them correctly, but aside from the time spent deciding what each element should be, there was no difference at all


I find it interesting that many of the comments here seem to view this as an anti JS article. To me a lot of these are going to be immensely useful tools in combination with the TS heavy frameworks we use for modern Enterprise App frontends, exactly because they are HTML native tools that aren’t going to require a bunch of stuff.

Like the meter tag, which I assume will replace every loading module we currently use in React when I get to work today because that is soooooooo much better than what we currently use.
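For anyone who hasn't used them, a rough sketch of those built-in elements (note that <progress> is the one actually meant for loading states; <meter> is for a measurement within a known range):

```html
<!-- indeterminate loading: no value attribute, the browser animates it -->
<progress></progress>

<!-- determinate loading: 70 of 100 units complete -->
<progress value="70" max="100">70%</progress>

<!-- meter: a scalar in a known range, e.g. disk usage -->
<meter value="0.7" min="0" max="1" low="0.3" high="0.8" optimum="0.2">70%</meter>
```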

But maybe I’m just misunderstanding people, or the article in some way. But to me this is very interesting even if your entire frontend is basically all TypeScript like many larger applications are today.


I think (at least with my comment that's one of the kind you alluded to) my hope is that it replaces huge piles of JavaScript code. I have, sadly, no illusions of a JS-free web, but at least we could get rid of huge gobs of it and have more-standard UX that the system can help guide and shape (with the benefits that entails).

It also enables a whole slew of new applications to be made that need way less JS than we used to need. So while, sure, these might be in-place "upgrades" for existing Enterprise TS apps, the hope is that it'll allow shunning those Enterprise Frameworks entirely for net-new app development. Or at least, a boy can dream, right?


I've worked with Enterprise Frontends since the Mainframe CICS systems and I'm not sure why things like React and Angular get such hate for Enterprise apps. I can't think of a single way of doing client-server applications in an enterprise setting that's ever been nicer to work with. To be completely honest, CICS was better than most GUI attempts from Java and C#, and it was a console UI.

I'm not saying JS frameworks can't be overused. I'm not personally afraid of the page refresh, but it's not like it was a joy to work with websites before these enterprise frameworks.

Maybe WebGPU and to some extent WASM (and whatever is going on with that) will change things, but probably not.


Enterprises want stability and front end technology has been everything but stable.


I’m not sure about that. Some of the ASP Web Forms frontends that I built something like 13 years ago are still being happily consumed today.

The only thing enterprise really cares about is money, and IT getting in the way, in any way, costs money. Web frontends have been instrumental in cutting down the hate on enterprise applications in any org that I’ve worked with over the years. You’d see something with a great WPF frontend score a 3/10 and something with a shitty Vue frontend score 9/10 because it just worked, was never slow and “never needed to be updated”.


I don't want to totally neg on web dev. But it does suck donkey balls. I've been doing it for years, from writing raw html to using scripting langs, frameworks and what not. And the amount of time it takes to do not a lot is, I figure, just a colossal brain drain. We just went on holiday, and all I wanted to do was look up places to go eat and drink, or visit for the day, and most of the sites sucked. Or were out of date. Not updated or whatnot. There are still lots of fire-once-and-forget sites. Probably because budgets are tight and people can't afford updates. Or the updates are just technically too difficult for people to grok.

The complexity of sites is paralysing. What could be a few simple pages of texts and images is totally over-engineered for no good reason and is burning a stupid amount of CPU cycles. Probably built on a hacked off-the-shelf CMS that could do with security updates.

CMS and frameworks are being used, because there wasn't a good alternative to something as neat as frames.

A site I'm working on at the moment has quite a pretty design, but pull the CSS and it's just a mess.

I was looking at going to the cinema recently and the local picture house made it practically impossible to just scan the handful of films that were playing that week. I realised you could pretty much shove it all in a spreadsheet and it would read better. Heck, I downloaded the JSON from their API, and it was easier to read.

Most of it is all tiresome lipstick on a pig.

Facebook was a success for a few reasons, one was the easy on-boarding (which uses nefarious privacy trade-offs), the other is that you could actually share photos easily. Also see: Whatsapp and Instagram. Publishing needs to be easy. And despite a simple FTP being easy, there's a weird disconnect in the usability process that makes this tricky for mere mortals. People want to drag and drop, or upload, fire and forget and edit easily. And those wanting to consume data, really just want the bare essentials: The data.


> And those wanting to consume data, really just want the bare essentials: The data.

I don't think they do. The average HN reader, probably; but not the average person.

What they want is a well styled and usable webpage. Unfortunately, there aren't enough effective and talented UX designers (or stakeholders at companies with decent UX intuition).

This leads to the current situation where the average person would be better served by bare essential data, even if it's not their preference, because it's still better than the kludgy, average UX design a company is able to afford.

{bad UX} << {raw data} < {good UX}

The end run around this is what WordPress realized: create professional styles/themes and allow users to purchase and apply them. But that can't solve bad stakeholder taste.

Which brings me back to Facebook, which I would argue succeeded because it standardized and mandated a professional UX.

{data from people} + {professional, standardized UX} = {winning}

You can dislike the original Facebook UX, but I don't think anyone would claim it was amateur level work. Which is what everyone's perception of MySpace/Geocities et al. was.


You hit the nail on the head when you mentioned WhatsApp. In my country in Africa, WhatsApp is so popular that mobile network providers even sell WhatsApp data bundles. If I want to find something, like wire fencing, I start by asking WhatsApp groups I am in: family, neighborhood and even high school friends. Typically I get a few numbers, and then I WhatsApp the service provider and get up-to-date information. The service providers can't afford up-to-date web sites. WhatsApp also has a feature to include a catalogue in your profile. This is for business accounts, which are also free.


That's why I suggest most small businesses use Wix. Does your site tend to look like everybody else's? Yes - and that's not necessarily a bad thing, and since it's so easy to use and update, customers will appreciate having the up-to-date information they need.

What I'd like to see (maybe it exists, I haven't looked in a while) is something I would market as "Wix Widgets." These would be data-connected widgets where there's some customization to wire-in a REST service call and map the data points to the widget display. This would go a long way to handling the needs for internal-facing web sites every company has these days.

We really need simple solutions for the 90%-95% of web pages out there.


It's funny how I hear that web platforms are the best and most consistent cross-platform GUI systems, but then building on that system is awful.

So I'm wondering if maybe this is just the nature of GUI. Where you have to give every single instruction, eat up a bunch of cycles, or it simply won't function as cross-platform.


I don't think it's that funny, given that "best" and "most consistent" are both highly subjective.

I am a huge fan of the HTML/CSS/JS stack—it's far, far easier to knock up simple GUI-based apps using those technologies than, for example, the Visual Basic / Delphi tools that we were all using beforehand. And, remember, apps are not really what HTML was ever intended for, so any kind of 'app' support is a big bonus.


Funny, I grew up making GUI apps on VB (and then VB.NET) using Windows Forms and I've always found it really easy compared to the web stack. You're not fighting against a flow algorithm that was designed for documents, trying to position your widgets in ways NCSA Mosaic wouldn't ever have dreamt of; it just works.


Part of this is the ever-increasing number of tags and elements in HTML files. Instead of being a document markup, it's a page markup, and it's difficult to discern where one stops and the other begins, although the W3C has tried with the addition of the article and section elements. But then I suppose raw HTML code was never meant to be read by anyone other than devs in the first place, so there's that.


There was that small window of niceness, where the html diminished in size. Now it's all pseudo-inline styles painted by class names, with lashings of scripts. It's always surprising when you peek under what is pretty much a bereft page of nothing and find a whole heap of code.

I'm one of those annoying people that just uses reader mode anyway - not that it always works. Because I get fed up of zooming in and out, inflating text size etc etc.


This is why I want a browser where I can use Reader Mode on everything. Not just articles and blogs but also forums, q&a sites, online stores, image galleries, video sites.

Because of this I don't bet on the web: it's really designed to be an app framework now (admittedly the best one), not a simple client that sends and receives data (like XMPP, SIP, IRC, SMTP, IMAP and NNTP).


I do, and I teach it. It's a lot of ridiculous garbage.

I think an aggressive defense of e.g. what "reader mode" does is incredibly important, to the point that I'd support e.g. legislation.


I spent some time recently working on a C++ application for Windows using Qt for the UI and general application framework.

I was surprised at how easy it is to achieve useful things with C++ and Qt in this day and age.

When I used it back in the early 2000s it was a lot more painful, but both C++ and Qt have come a long way.

Honestly, to paraphrase Hayao Miyazaki...the web was a mistake.

It's almost a classic example of worse is better.


> Or the updates are just technically too difficult for people to grok.

Or because they forget to. And almost no one visits their site anyway.

Unfortunately, Facebook and Instagram took that place. Where is it, phone number, what time is it open, some pretty pictures. Done.


ugh I was so excited to see pure HTML modals were a thing with <dialog> only to find out there's no way of triggering them without JavaScript. Using pure HTML you can only dismiss them, not trigger them.

https://github.com/whatwg/html/issues/3567

> dialog elements are a great addition and I'm glad they're getting implemented, but a key part of their functionality relies on JavaScript: to open a <dialog> you need to use JavaScript to set the open attribute.

> It'd be great if there was a way to make a <a> or a <button> elements capable of opening dialogs.

> Precedent already exists for page interactivity baked into HTML - for example <a href="#.."> can already scroll the page, and <details> elements are capable of hiding elements behind interactivity, so I think it stands to reason <dialog> elements could be opened by other page elements.


I don't see the higher purpose of "HTML only" if that means we need to extend HTML with scripting capabilities. In that case I rather just use JavaScript.


Users might prefer, say, that 99% of the time a data selection widget behaves the same whenever accessed by their browser. When devs choose a JS UI library then that's usually what users get. Wouldn't it be good to have all tables share the same features (sort, collapse, column-select, data export), rather than only having rich features if the dev implemented them, duplicating the effort of millions of other devs?

There's a sliding scale here, with html at one end and each page ships a parser and libraries at the other - why have an img tag, just let the page developer implement a JS based interpreter?


You can implement modals without JS, with HTML and CSS.

For an example, click on '?' in the top right corner here: https://aavi.xyz/proj/colors/
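One common flavor of this, sketched from memory (ids are illustrative), is the checkbox hack: a hidden checkbox holds the open/closed state and labels toggle it:

```html
<style>
  #modal-toggle, #modal { display: none; }
  /* show the modal only while the hidden checkbox is checked */
  #modal-toggle:checked ~ #modal { display: block; }
</style>

<input type="checkbox" id="modal-toggle">
<label for="modal-toggle">Open modal</label>

<div id="modal">
  <p>Hello from a JS-free modal.</p>
  <label for="modal-toggle">Close</label>
</div>
```

It won't get you the focus management or accessibility of a real <dialog>, though.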


There are useful tricks to changing CSS properties via checkbox state and pseudo-selectors to implement simple modals (which won’t be accessible), but you can’t open a modal <dialog> element without JS.

Weirdly, you can close them with a form submission, but you can’t open them with one.
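For reference, the close-without-JS side looks like this; it's the open side that still needs script:

```html
<dialog open>
  <p>Are you sure?</p>
  <!-- method="dialog" closes the dialog on submit, no JS required -->
  <form method="dialog">
    <button value="cancel">Cancel</button>
    <button value="confirm">Confirm</button>
  </form>
</dialog>
```

(The submitting button's value ends up in the dialog's returnValue.)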


yah, would be nice if it were baked into html itself, but at least the javascript isn't complicated, and can be put in an `onclick` attribute:

  <button onclick="document.querySelector('#my-dialog').showModal();">open my dialog</button>


You can show or hide it depending on the current hash. Though admittedly that does require CSS.
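A minimal sketch of that hash approach, using the :target pseudo-class (names are illustrative):

```html
<style>
  #help { display: none; }
  /* matches the element whose id equals the current URL hash */
  #help:target { display: block; }
</style>

<a href="#help">Open</a>
<div id="help">
  <p>Some help text.</p>
  <a href="#">Close</a>
</div>
```

One catch: each open/close adds a browser history entry.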


I didn't realize some of these elements existed. Neat! However, I think if we want an open/federated system to win, we need to make it compelling to normal people. That means making it fun and entertaining. I've found that no argument about freedom, privacy, or anything actually important will work.


Might seem elitist, but we’ve seen what ‘normal people’ look like on the internet, and perhaps we do want places where there is a barrier to entry.


The point of walled gardens is that they are very tempting for users to stay inside of. Their power comes from most users not resisting that temptation.

Becoming the equivalent of a digital hermit and living in a hole in the ground outside these walls, which is what I would characterize this as, is not going to result in a lot of people stepping outside those walls. If you've ever watched Life of Brian, you'll have a good mental picture here.

It's a variation of brutalist web development (let's not call it design) that just doesn't really appeal to the masses. It never has. The history of the web is endless attempts to pimp it with Applets, Flash, Shockwave, Silverlight, etc. The latest incarnation of that is web assembly. This basically allows you to use anything native that works elsewhere (desktop, mobile, game consoles, AR/VR, etc.) in a browser as well.

Of course there's a severe risk of this disappearing into more walled gardens. But I don't think sitting in a dusty old hole in the middle of nowhere while shaking your fists at progress does much to change that.


Sometimes, though, that dusty hole in the middle of nowhere turns out to be the most progressive in terms of using standards: the person sitting in it adopts new HTML elements and immediately has accessibility, while their JS-based rivals still struggle with restoring the functionality of the back button and adding routes to their SPA to get back something resembling normal linking behavior. In a way, that is a much deeper dusty hole in the ground to dig for oneself.

Anyway, it is questionable how much progress there is in moving into walled gardens and throwing multi-megabyte websites at visitors, when we can have the same functionality with less.


I think there's a middle ground here where we're building good-looking web pages with a sub-10 MB weight. Unfortunately, it makes no business sense to do so.


Sub 10MB? That's a whole YouTube video!


My home page, which admittedly doesn't contain much, is 11 KB. :)


I would care more about markup if the browser would actually be able to do anything interesting with it, but for the most part markup is just a default style for an element; the semantic meaning is ignored. The <time> tag is about the last one I can think of that actually made a difference from the user's perspective.
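For example, <time> carries a machine-readable value alongside the human-readable text:

```html
<!-- browsers/tools can parse the datetime attribute regardless of
     how the visible text is phrased -->
<time datetime="2023-08-02T14:30">the afternoon of August 2nd</time>
```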

There is also a lot of markup for really common tasks still missing. I'd like to see an <advertisement> tag, and something to handle pagination at the browser level (rel="next" has been around for decades, but browsers don't care). And more broadly, I'd really like to see much better support for long-form documents in HTML, or at the very least native ePub support in the browser.


What is the author suggesting? That instead of "social" networks, people will rush to building their own online presence from first principles?

I can't see that working, for so many reasons:

- most people are passive consumers of content, maybe interact a little, enough to tweak their feeds

- a small minority creates content on the large networks / aggregators, and (I think) a large portion of that is spurred on by monetization

- the "internet" and all the devices that access it have become so "user friendly" that the people who have come online in the last decade or so are effectively technically illiterate; you cannot count on them building anything from scratch, only to arrange the big duplo pieces already provided


You have just highlighted the main fact about the www that so many technically skilled people fail to see. Most consumers have little to no knowledge about anything technical, which is not bad at all, since there is no reason why they should care. Now imagine telling those people that everyone should have their own website instead of a social media account. What a horrible and illogical thought, in my opinion. Digital consumers only change behavior if something is 10x better. So from a consumer perspective, who cares about things like decentralized networks? Duckduckgo.com, for example, is probably for 99% of www users some Chinese-Russian inbred virus. People started using ChatGPT besides Google because it is simply 10x better for many cases, not because it was promoted with more privacy features and fewer ads than Google.


Hey OP, I notice you're using Berkeley Mono, which is beautiful. But your website's CSS appears to be applying boldface on an already boldface font in the headers (despite the typeface name claiming to be 'Regular', it is actually bold; see the datasheet[1]), which is causing bad hinting. Consider changing your typeface file!

[1]: https://cdn.berkeleygraphics.com/static/typefaces/berkeley-m...


Thank you, I believe I have fixed it. TIL variable fonts on web are not great! AFAIK they don't distribute the regular/bold/italic/bold italic 4 font mix?


It looks like they do! Maybe there's some option to select it. I haven't bought it yet, so I can't test it...


Thank you! I was wondering why the font looked so wonky. It looks much better now.


Ian 'Hixie' Hickson gave his view on the future of the web in January this year in a public Google doc titled "Towards a modern Web stack" [0]. On its HN submission (referencing the wrong URL, so I resubmitted [1]) he defends against criticism [2].

Quoting from the doc here's the stack:

    - WebAssembly (also known as Wasm) provides a portable compilation target for programming languages beyond JavaScript; it is being actively extended with features such as WasmGC.
    - WebGPU provides an API (to JavaScript) that exposes a modern computation and rendering pipeline.
    - Accessible Rich Internet Applications (ARIA) provides an ontology for enabling accessibility of arbitrary content.
    - WebHID provides an API (to JavaScript) that exposes the underlying input devices on the device.

    This document proposes to enable browsers to render web pages that are served not as HTML files, but as Wasm files, skipping the need for HTML, JS, and CSS parsing in the rendering of the page, by having the WebGPU, ARIA, and WebHID APIs exposed directly to Wasm.
[0] https://docs.google.com/document/d/1peUSMsvFGvqD5yKh3GprskLC...

[1] https://news.ycombinator.com/item?id=36968263

[2] https://news.ycombinator.com/item?id=34612696


> On its HN submission (referencing the wrong URL) he defends against criticism [1]

And it's a very weak defence. There are great rebuttals to whatever he writes in there.

I mean, he rants that HTML failed, and then literally proposes "By providing low-level primitives instead, applications could ship with their own implementations of high-level concepts like layout, widgets, and gestures, enabling a much richer set of interactions and custom experiences with much less effort on the part of the developer."

Has he tried to implement any of those from scratch using only low-level primitives? How is it "much less effort on the developer"?

In general it just reads like a justification for Flutter:

"This API alone is not useful for application developers, since it provides no high-level primitives. However, as with other platforms, powerful frameworks will be developed to target this set of APIs. A proof of concept already exists in Flutter"

Why not provide high-level powerful primitives out of the box? Oh, then Flutter wouldn't have a reason to exist.

---

As a side note: WebHID is a Chrome-only non-standard. They literally dumped a non-spec onto other browser vendors, shipped it to prod, and then "updated" the spec later: https://github.com/mozilla/standards-positions/issues/459


So what are these great rebuttals? "Why not provide high-level powerful primitives out of the box?" <- the browser is a 30-year effort to do that and has always failed at it. This thread is full of people complaining that the browser implementations of even basic widgets aren't usable; of course any higher-level or more complex features are often missing. And that's with web tech relying heavily on patrons that have budgets the size of the Sun.

So in practice most devs bypass them and write their own high-level primitives anyway, relying on the browser only for low-level APIs. That's the point the article is making. Hixie is merely observing that this situation exists, has always existed and probably always will, so browser makers may as well embrace it. And if you read the discussion thread, that's essentially what the Chrome WebUI PM says they plan to do: just talk to the authors of React and other frameworks, ask them what they need and do that, i.e. give up on HTML tags like <section> and <progress>, refocus on obscure features that make framework devs happy. And then just tell web devs to adopt a high-level framework that isn't HTML.


> So what are these great rebuttals?

Those that you apparently read but didn't understand.

> the browser is a 30 year effort to do that and has always failed at it

And so the "we will not give you anything at all, implement everything including all layouts, all widgets, all interoperability etc. from scratch" will work?

> browser implementations of even basic widgets aren't usable

Great article on the topic: "You can't capture the nuance of my form fields": https://drewdevault.com/2021/06/27/You-cant-capture-the-nuan...

The problem with the basic widgets is not that they are incomplete, but that they are not enough. Almost any attempt to re-build even the built-in widgets sucks and fails in numerous ways. Now Hixie wants to remove even that and pretend that building all of it from scratch is "easier for developers".

> that's essentially what the Chrome WebUI PM says they plan to do: just talk to the authors of React and other frameworks, ask them what they need and do that i.e. give up on HTML tags like <section> and <progress>

Here's what that PM said, verbatim:

--- start quote ---

No one needed <section> and <aside> they needed <tabs> and <accordion>

Developers mostly don't want the assembly language of the web, they want their chosen frameworks to have excellent DX and the UX they produce to be fantastic. When we see frameworks as a core customer/partner for web APIs, things turn out a lot better.

--- end quote ---

How you read this as "give up on HTML tags" and "refocus on obscure features" is beyond me.

BTW, here's what Dan Abramov, one of the key developers of React, had to say a while back: https://dev.to/dan_abramov/comment/6kh1

--- start quote ---

React users would love to not have to npm install a date picker and bloat their bundles! If they need to "use the platform" then why doesn't that platform ship the features they actually ask for? Instead of a <carousel> they get an <aside>. Features like service workers are touted as a solution to many problems in the web, but their ergonomics are so under-designed that people actually have to change domains to bust the cache from a broken build (I’m not making this up).

--- end quote ---

But sure, "give up on HTML tags like <section> and <progress>, refocus on obscure features that make framework devs happy". Making the platform be only extreme primitives will definitely not make framework authors happy.


To more fully quote the Chrome PM:

You didn't solve the problems devs find challenging. Frameworks do. The existence of frameworks on the web is a feature, not a bug. Developers mostly don't want the assembly language of the web, they want their chosen frameworks to have excellent DX and the UX they produce to be fantastic.

We're shipping container queries, scope, nesting, style queries, state queries and a host of other features devs tell us they need to architect component systems

In other words, the Chrome team (today) assumes the use of frameworks as a given and sees their job as empowering frameworks. Note that "devs" here clearly refers to framework authors, the sort of people who architect component systems. She doesn't mean app devs.

So HTML as an all-inclusive app framework is going to die, arguably has already died, and the argument between Hixie and stubbornella is just an argument about what specific way to empower framework authors. Hixie argues that the focus should be on features for big frameworks that skip HTML entirely; stubbornella argues for features for smaller frameworks that still use some HTML. But they're both in agreement that raw HTML is just kind of useless and not the way devs want to go anymore. After all, "state queries" is not <tabs> either. There's less between these positions than it may seem.

Now the real question is not could browsers theoretically ship really great HTML widgets. Sure, there's no rocket science in GUIs; in theory, browsers could do this. Yet after 30 years of immense effort they don't do so. That suggests some deeper structural issue.

It might be a team scalability issue. Chrome has a truly enormous team, but clearly they're struggling to do everything people might want from a browser. All that code has to be maintained, after all, so as Chrome gets bigger we should expect them to slow down. Since the death of plugins the web is a completely monolithic platform.

Given a choice between implementing some API that only a browser developer can do (e.g. WebUSB) or implementing an API that devs can hack up their own alternative to (a widget), it's clear why they always choose the former. Anything that browser makers can push off to web developers they clearly will, because there's so much to do that can't be pushed off in that way. Hence why HTML is still a poor UI toolkit and why JS frameworks are so widely used. It's a division of labor issue.


Ah, that is enlightening, thank you. I knew he works on Flutter, but I know very little about Flutter and the standards it uses. I hadn't heard of WebHID until this doc, so it figures that it's a non-standard. More worrying Google power play then, similar to the web integrity bomb under the open web.


Tangent.

> I wrote this post and then GPT-4 fixed my grammer and spelling

I wrote an AutoHotkey + Go script that I constantly use for fixing grammar via ChatGPT's API. You can select the text, press F8, wait a bit, and your input will be replaced by grammatically correct text. The only catch is that it "fixes" the tone and makes it professional, which is kinda annoying.

Feel free to try it out: https://github.com/anyfactor/Chatgpt_grammar_checker


Alternatively, you could set up LanguageTool[1], which runs much faster, is more reliable, is open source, and, crucially, doesn't require sending what you wrote to a server on the Internet. Plus, it already has high-quality integrations with standard software like LibreOffice, so you don't even need to write anything yourself.

[1] https://github.com/languagetool-org/languagetool


If only it didn't require Java to run. Then integration with lightweight setups like NeoVim or Emacs would be so much better and easier.


The misspelling of grammer instead of grammar is a little ironic in this context. Sorry:-)


That was deliberate - if you view the source:

    ... then GPT-4 fixed my <span class="typo">grammer</span> and spelling


That's a little bit embarrassing for me then. Although as an intentional joke it's at least a little bit lame, so I'll live with the embarrassment.


Grammer, should of, datbase, mangement, timzone, accept/except, except/expect ..... We should accept these as part of normal written conversations and try to understand the context. With the amount of AI-generated content that is being produced, misspellings should be celebrated!

But who am I to say that? I am literally converting my rants on the internet into professional arguments and using sophisticated synonyms like the architect from the Matrix. "Ergo", I am part of the problem.


I'm going to have to disagree with you on misspellings. AI can trivially replicate them (the Eliza bots you used to get on old-school BBSes would make deliberate mistakes/corrections), and they just make it ever so slightly harder for your audience to parse what you've written. If nothing else, expect/except sounds quite different from a screen reader.


> The only catch is that it "fixes" the tone and makes it professional, which is kinda annoying.

Then why send your text to a slow third-party in the first place? There are craptons of spelling and grammar checkers available which will work offline, be significantly faster, consume less resources, and not change the meaning of your text. We solved this problem decades ago, we don’t need to shove AI in everything.

It’s not like the ChatGPT solution is flawless anyway, there are still basic mistakes in the text:

> Can by styled quite aggressively.


Original Text

Grammar checkers are essentially typo checkers and are not context aware to be truly grammar checkers. Context aware grammar checking means the program has to consider each line of text and identify grammar mistakes. As traditional grammar checkers check typos in real time, every mistake you do is flashed in front of you. You have to stop evaluate, fix and continue. This is distracting.

My solution is to get your thoughts immediately out of your head in a big chunk. Check the grammar of that chunk of text in a single button press wait a few seconds and then just paste the text.

The tone shift is good in technical writing, but in forum style communication it kinda dehumanizes the comment. But it is not a major thing.

I use grammarly pro, but I find this solution to be extremely robust as this solution is platform and software agnostic. Like writing in my text editor - LiteXL.

Please feel free to give this method a shot for a week.

---

Edit

---

Grammar Checker with ChatGPT

Grammar checkers are essentially typo checkers and are not context aware enough to be considered truly accurate grammar checkers. Context aware grammar checking requires the program to analyze each line of text and identify grammar mistakes accordingly. Traditional grammar checkers usually highlight typos in real-time, which means that every mistake made is immediately brought to your attention. Consequently, you are forced to pause, evaluate the error, correct it, and then resume your writing, creating a distracting workflow.

My solution is to encourage you to initially get your thoughts out onto the page without worrying about the grammar. In this approach, you can check the grammar of the entire chunk of text with a single button press, wait a few seconds, and then conveniently paste the corrected text.

The tone shift is acceptable and even beneficial in technical writing; however, in forum-style communication, it can somewhat dehumanize the comment. Nonetheless, this is not considered a major issue.

Although I personally use Grammarly Pro, I find this solution to be highly effective as it does not depend on any specific platform or software. For instance, I can comfortably write in my text editor, LiteXL.

Please feel free to give this method a try for a week.

---

Grammarly Pro

Grammar checkers are essentially typo checkers and are not context aware enough to be considered genuinely accurate grammar checkers. Context-aware grammar checking requires the program to analyze each line of text and identify grammar mistakes accordingly. Traditional grammar checkers usually highlight typos in real-time, meaning every mistake made is immediately brought to your attention. Consequently, you are forced to pause, evaluate, correct the error, and then resume your writing, creating a distracting workflow.

My solution is encouraging you to get your thoughts onto the page without worrying about the grammar. In this approach, you can check the grammar of the entire chunk of text with a single button press, wait a few seconds, and then conveniently paste the corrected text.

The tone shift is acceptable and even beneficial in technical writing; however, forum-style communication can somewhat dehumanize the comment. Nonetheless, this is not considered a significant issue.

Although I use Grammarly Pro, this solution is highly effective as it does not depend on any specific platform or software. For instance, I can comfortably write in my text editor, LiteXL.

Please feel free to give this method a try for a week.


> As traditional grammar checkers check typos in real time, every mistake you do is flashed in front of you. You have to stop evaluate, fix and continue. This is distracting.

Or you can disable that feature (or the software entirely) and do a full pass at the end. You’re not forced to do live checking.

> Please feel free to give this method a shot for a week.

No, thank you. I have no desire to send what I write to third-parties I don’t trust. Also, your examples were far from compelling. I’d rather not have my words padded with fluff.


I understand that. It is working for me. And I am happy with my system as you are with yours.


In a world where pages take 10x more time to load than it takes Unreal Engine 5 to render hyperrealistic scenes in AAA games at 140 fps? Yup, I'm betting too on front-end stacks that change every 6-12 months but only lead to more and more poorly optimized websites. Half the time even mobile apps are useless, when they can't download a freaking JSON response in areas with poor network coverage.


> Pretty much you can highlight text. By default Safari shows a yellow highlight. I like it!

Chrome also shows a yellow highlight by default. But since I don't have Safari installed on my machine, I don't know if it's exactly the same color. Also, I'm not sure if other browsers have the same default color. Isn't it a good use case for CSS?


CSS will let you choose the highlight color for your readers. Semantic markup lets the reader choose the highlight rendering, including the choice of leaving it to the browser.

A blind reader can configure their client to speak highlights.


Of course it doesn't need to be one or the other. You can still use CSS to give the highlight a uniform colour, while allowing for reader modes to still have the highlights and have the accessibility of the screen readers speak the highlight.

Semantic markup and CSS should ideally be seen as complementary and be used as such.


Yep you can style it with background color: https://jsfiddle.net/wezkth43/
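Something like this (a minimal sketch along the lines of that fiddle; the color is just an example):

```html
<p>You can <mark>highlight</mark> text semantically.</p>
<style>
  /* override the browser's default yellow highlight */
  mark {
    background-color: pink;
    color: inherit; /* keep the surrounding text color */
  }
</style>
```

Readers in reader mode, or with author styles disabled, still get the browser's default highlight.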


> I don't know if it's exactly the same color.

Why would you care? As long as it appears highlighted, it's alright. Let people choose their own highlight colors, and text fonts, and everything.


I'm personally betting on HTMX

https://htmx.org/


I like htmx too for its simplicity but it’s actually antithetical to another key HN trope: responsiveness. Round-tripping to the server is much slower than client-side JS. It should be terrible for fast keyboard navigation, for instance. You might not notice on a server on localhost or on fast internet (which is very user-hostile to assume).

That said, this is me speaking about htmx based on what I know about it. They may have some tricks up their sleeves these days to account for those issues. But those tricks would need JS.


That's a common argument, but how often are you navigating around a web app without needing data from the server? IME you need it 80+% of the time, so that "responsiveness" is illusory anyway, and when you don't, you can get pretty far with basic HTTP caching. And assuming high bandwidth is just as bad as, if not worse than, assuming low latency: typical web-app dev platforms perform abysmally there, not to mention the multiple extra round trips and device speed/power. And you could get into the hydration and SSR mess, but then you could just render on the server in the first place and simplify the entire system, reduce your LOC by 5x, opt out of the entire JS ecosystem shitshow (or keep it for select high-interactivity components, I won't judge), and eliminate an entire API that doesn't need to exist.

If you are building the next google maps, maybe htmx isn't for you. But if you're building another LOB app with mainly forms and data views, or an ecommerce site, or another CMS, there's a 99% chance that htmx is plenty.
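For illustration, the core htmx pattern is just attributes on plain HTML (a minimal sketch; the /contacts endpoint is hypothetical, and it returns an HTML fragment rather than JSON):

```html
<script src="https://unpkg.com/htmx.org"></script>

<!-- clicking the button fetches a server-rendered fragment
     and swaps it into #results -->
<button hx-get="/contacts" hx-target="#results" hx-swap="innerHTML">
  Load contacts
</button>
<div id="results"></div>
```

No client-side state, no JSON API: the server renders the view and htmx swaps it in.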

Personally, I'm betting that you can make a whole interactive site on top of Go's html/template library + htmx, using a pattern I'm developing here: https://github.com/infogulch/caddy-xtemplate I'm currently co-developing this with an rss reader webapp to work out the kinks.


This…

    <ul>
    {{range .Query `SELECT id,name FROM contacts`}}
        <li><a href="/contact/{{.id}}">{{.name}}</a></li>
    {{end}}
    </ul>
…reminds me of what we used to do in the PHP 3 days. That was convenient, performant and… unmaintainable. Am I missing something?


Who knows if it will be maintainable or not. Can you articulate why the php3 projects became unmaintainable?


I almost have ptsd from these dark days, I won’t go into details ;-) but there’s plenty of literature about separation of concerns. Having the templates doing the data access takes me back 20 years ago, but I can see why this idea comes back in a web component era.


Check out: https://htmx.org/essays/locality-of-behaviour/

The proposal is that Locality of Behavior > Separation of Concerns. Maybe it's true! We'll see. :)


I actually like htmx (I find Unpoly superior so that’s what I use, but it’s the same general idea) and the example given makes sense, but one should not extend locality of behaviour to the point of sprinkling SQL in templates. I don’t think that’s what the htmx folks had in mind in this essay either, their canonical example being the html/css/js trio. But hey… there’s no better experience than first-hand experience, you will find soon enough if it works for you.


I agree that it's taking the idea of LoB to an extreme that wasn't intended (at least not explicitly) in the essay. But I still don't see why it's a bad idea considering that SQL-level abstractions (views, procedures, etc) are available. What value does adding a layer of application-level abstraction provide? So I have to name and call a function that calls the sql; to what end?


There’s plenty of literature (and debate !) about mvc/mvt (and its numerous variants and interpretations) in the context of web apps. It’s interesting that you challenge the consensus though.


You make good points. And these are different use cases, where one is simple client behavior where JS is suitable, whereas the other needs a round trip anyway.

I wouldn’t go as far as 99%. There are a lot of web apps where you toggle between panes, expand certain things, and so on. But I think you are still right in the sense that htmx is a very fruitful starter kit that can cover many if not most “boring” standard use cases.

One thing is for sure, this SSR hydration shit show is not a good state of affairs. It’s way too much complexity for what it does. And now you have to worry about reaching the same logical state from two different starting points.

> I'm betting that you can make a whole interactive site on top of Go's html/template library + htmx, using a pattern I'm developing

This is cool, and indeed reminiscent of PHP. Cycle of life!


If I had to architect a front end today, I'd start with HTMX. Very curious how far a HTMX+Deno stack could get you.


I also like htmx. But one day someone pointed me to unpoly, which I find better. So I’ll spread the word here: https://unpoly.com/


Some social networks used to offer rich page customization with a wiki-like markup or something similar. That was essentially the HTML wheel reinvented. Some even offer in-app apps; some even lock the whole life of a whole population inside them (e.g. China).

A social network with (or without) this is essentially reinventing the wheel of the whole Internet, which is made of HTML pages (replaced by personal pages on a social network), RSS/Atom (replaced by in-app notifications), e-mail (replaced by in-app direct messages), Usenet (replaced by in-app groups), etc. Everyone could theoretically just have a personal www page on their own domain instead of a social network account.

Oh, and HTML never needed CSS to not be ugly - it's the browsers' default settings that make unstyled (styled with default styles) HTML pages look this way. We could always just change the default fonts/colors, or apply a userstyle for more complex things.

Once the general population joined the Internet (where in the past there were only nerds), demand emerged for it to become much easier and prettier, and the corporations responded by offering the features people want, taking their freedom, privacy and independence in exchange (which an average person doesn't mind; all they care about is that it's "for free"). We could just build better apps (browsers with sensible defaults, intuitive e-mail and Usenet clients, and web servers), but we still lack them because nobody wants to do serious work for free and make great products that suit everybody rather than just themselves.


You don't use CSS just to make pages pretty. It is useful for accessibility and user experience as well. Not to mention personalization. Not every page needs to have a carbon copy style of others. CSS is so useful it is almost ridiculous to say HTML never needed it.


> CSS is so useful it is almost ridiculous to say HTML never needed it

I didn't mean CSS is useless; I meant it's not necessary for every website to define its own CSS in order to look nice and not ugly. The problem is that almost no ordinary user knows they can define their own defaults (fonts and background in the browser settings, more complex things in a user CSS file).
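For example, in Firefox a user can put something like this in userContent.css (with the toolkit.legacyUserProfileCustomizations.stylesheets pref enabled; the values here are just an illustration) to impose their own defaults on every page:

```css
/* force a readable base font and background everywhere */
body {
  font-family: Georgia, serif !important;
  background-color: #fdf6e3 !important;
}
```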


Long, long ago, when I first learned that I could define my own (CSS) defaults in my browser, my mind was blown! I can't recall if it was back in the Netscape Navigator or early Firefox days... but learning the menu path to turn off CSS styles (View -> Page Style -> No Style) made me feel like some sort of superpowered being! lol :-D More seriously, it actually helped me better understand the web, HTML, and web pages as documents "downloaded" by the browser. I became a much better web dev for it, beyond feeling like I had more freedom from the software/app!


Not sure why people are banging on CSS. It's just the fundamental separation of content logic from content representation, which is best practice (see e.g. the grammar of graphics).

I think people are just burning out from the oppressive tech environment, the endless hypes, the lack of a feel-good factor. All these negatively experienced developments are taking their toll. This forces people into all sorts of extreme corners, like the Gemini protocol or CSS-less sites.


I'm aware of all these HTML elements, I use them for my personal digital garden website that just uses CSS grid to align everything, no external scripts or styles.

Guess what? It still looks bad! The <details>/<summary> element, for instance, is hard to style with CSS; things don't work the way you'd expect. Frankly, that element was leapfrogged in usability and customization by pretty much any jQuery accordion script from 2009.
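The furthest I've gotten is something like this (a rough sketch; ::-webkit-details-marker is the legacy WebKit pseudo-element, and rendering still varies across browsers):

```html
<details>
  <summary>More info</summary>
  <p>Hidden until expanded.</p>
</details>
<style>
  /* remove the default disclosure triangle */
  details > summary { list-style: none; cursor: pointer; }
  details > summary::-webkit-details-marker { display: none; }
  /* style the open state */
  details[open] { border-left: 3px solid steelblue; padding-left: 0.5em; }
</style>
```

What you still can't easily do is animate the open/close transition, which is exactly where those 2009 accordions win.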


HTML was designed as an SGML application for networked hypertext.

As an alternative to what the original post posits, we could leave it for that purpose, and design another (now XML) application for user interfaces: windows, buttons, scroll bars (if desired), text controls (no I'm not talking about textarea for CGI), the kind of controls that Windows or X11+MOTIF provide, expressed as tags in a UIML (User Interface Markup Language). This would have the advantage that we could start from a clean slate, and the open source interpreter for this technology could be integrated into all Web browsers, so behavior would be identical.

UIML would be designed as an XML application for networked software applications' user interfaces.

Of course, you could execute them locally, too. There could be graphical UI designer of the types that already exist, e.g. Visual Studio would just write out a UIML as a new export format.

Crazy idea? Actually, it's just applying the "Do one thing, and do it well." mantra to XML <-> XHTML + UIML instead of packing everything possible into one now-bloated markup language it was never designed to do. So if this comment had a title, it would be "I'm betting on Internet standards" (plural).


It's been done many times. Mozilla had XUL. Internet Explorer had XBAPs (XAML Browser Apps) [1]. Android has an XML UI language. Java has FXML, which IMHO is the cleanest and nicest of the lot.

It never works. Same reason as to why adding non-JS languages never works: because making a GUI toolkit or implementing a language is a huge amount of work, other browser makers refuse to get on board because it'd mean they're playing catchup. Then web devs refuse to use it, because not every browser supports it. The only acceptable way forward is to gradually glue lots of small things onto HTML, and see which ones get implemented by the others. Because this is such an incremental and random process you end up with a pretty inconsistent platform that lacks a lot of stuff you'd intuitively expect.

[1] https://learn.microsoft.com/en-us/dotnet/desktop/wpf/app-dev...


Wouldn't that be XUL[1]? It's been there since 1997 and never took off outside of Mozilla, so it was deprecated and removed in 2017. It wasn't meant for direct use on the web, as replacing an entire ecosystem with a completely different way of doing UIs would be close to impossible, all for small benefits.

[1]: https://wiki.mozilla.org/XUL:Home_Page


> XML <-> XHTML + UIML

I agree with this idea. We need a toolkit standard for the web and to stop shoving application code into a document model. What's strange to me is that so many people have been trying so hard to either resist or deny this.

The proposed solution is a non-starter, though. Anything involving XML is probably DOA for the web. I know I don't want to touch it. But something needs to fill this space because UI on the web is so god-awfully atrocious.


I'd argue that the modern markup elements you're showing are still way off from what the Semantic Web (as Tim Berners-Lee proposed it) would enable. Imagine the possibilities of an 'AI' if it could understand the relationships between data and not just be a statistical parrot like LLMs are. Combined with something like the Solid project, which gives users back control over their data, THAT would be a major leap!


I've been doing web development full-time for nearly a decade now, and I didn't know about all of these. I think I failed a job interview last month because I didn't know about <datalist>. Sigh.


I've been doing web dev for over 20 years and just discovered and used datalist


> No more datepickers please!

Don't be surprised that people don't use native date pickers when they can't be configured to display ISO dates


There's lots of reasons I think using semantic html is important, but making it more easily scrapable by chatGPT isn't one of them.


Yes, always bet on HTML. I made the chart here in semantic HTML with a definition list, which is also keyboard accessible, and overlaid by a canvas: https://www.skrill.com/en/currency-converter/ – no libraries needed.
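The accessible-chart idea is roughly this (a simplified sketch, not the actual markup from that page; the values are made up):

```html
<!-- the data lives in a plain definition list,
     readable by screen readers and keyboard users... -->
<dl class="chart">
  <dt>Jan</dt><dd>1.08</dd>
  <dt>Feb</dt><dd>1.10</dd>
  <dt>Mar</dt><dd>1.07</dd>
</dl>
<!-- ...and a canvas is positioned over it purely for the visuals -->
<canvas class="chart-overlay" width="300" height="100" aria-hidden="true"></canvas>
```

Sighted users see the drawn chart; everyone else gets the real values from the &lt;dl&gt;.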


> With the advent of large language model-based artificial intelligence, semantic HTML is more important now than ever.

It's odd: I remember seeing an argument recently, though can't remember where exactly (perhaps [0]?), that LLMs make semantic HTML obsolete, because they "understand" the text anyway.

After all, humans didn't need html to be semantic in order to be able to read it — machines did. And if machines are approaching humans' capability to read texts, then doesn't this put the whole semantic html exercise into question?

0 – https://shkspr.mobi/blog/2023/05/does-ai-mean-we-dont-need-t...


Humans interpret text as rendered by the browser. That includes a lot of visual information and even text that only becomes visible in response to user interaction.

For an LLM (or an AI in general) to do what humans do, it would have to process the rendered output and interact with the site like we do (such as clicking on a dropdown to make the available options visible).

The same information represented as semantic HTML (i.e text) is far cheaper to process for an AI. I think cost is a key consideration in everything AI. If it's too expensive then it won't happen, even if it could theoretically be done.


>> Moreover, proper tagging is extremely descriptive in a machine-readable format. This is likely a more compelling reason for adopting modern HTML than saving design time.

Why do I want to make my website AI-accessible? It, or some mega-rich company will reap the rewards, not me.


Machine-readable here also means screenreader-readable


This reminds me that I really need to re-evaluate the stuff I already know and keep it all updated.


I am lost in CSS these days since letting my skills atrophy over the past ~decade


You'll be fine. If anything CSS is the one thing that's gotten easier in web dev. It might take some time but you'll be able to grok it for sure if you've dealt with old school CSS.


ehh... more organized / better? Sure.

Easier? Well, even if you don't have to deal with legacy styling (which had moments of insanity, like floats, and other things that were fairly reasonable, like tables, br, b, i), there is so much new CSS to replace the old: variables, transforms, complex animations, using borders to make arrows and other visual content, in-flow vs. out-of-flow, container queries, the View Transitions API, px, pt, em, rem, vh, vw, container query units, initial vs. inherit vs. unset vs. revert, new color formats, and a million new properties for handling interactivity across the gamut of user-agent form factors.

You know what? I'm going to stop because the list goes on and I don't even think I touched on 1% of the "new" CSS

These days I mostly just use tailwind. It gives me a sane, easy to work with api, and the updates are a more manageable trickle than a deluge


By easier they probably mean “it’s easier to do the same thing” - eg centering a div.

It’s also become deeper at the same time though - so it might be easier to do the same thing and harder to know it all.


CSS is huge. I wish there was a way to shed all that dead skin. Only 5% of it is truly powerful. Most of it is never used.


And yet, those rare times you need something you’re happy to know the feature is there.

There’s no point in removing stuff from CSS. The only thing that would do is break old websites.

If only 5% is truly powerful you can easily and happily use only that 5%


>In fact, recently I’ve become acutely aware of reader mode. All time spent on styling will be obliterated by reader mode, and that’s a great thing!

Reader mode is mostly for longer text. It doesn't help as much with PWAs.

>Moreover, proper tagging is extremely descriptive in a machine-readable format. This is likely a more compelling reason for adopting modern HTML than saving design time. The shift from primary data interfaces to secondary interfaces is already underway. RSS refuses to die. ChatGPT-like interfaces are likely the future of human data access. We’re going back to the beginning. Advertisers may be scared, but I’m not! Let’s start the revolution and set the world on fire with modern HTML.

Not sure what this (the most important part) attempts to say.

What does the advent of ChatGPT-like interfaces have to do with "adopting modern HTML" and "proper tagging"? If anything, ChatGPT-like interfaces would require less tagging, and can do the "figuring out what the text is" part of the semantic web without semantic metadata (and they can do it directly on the actual text, whereas the metadata would more likely be abused and made misleading for SEO).


I noticed <details> made the list but not <summary>. Is the latter a well-known tag compared to the former? In any case, semantic HTML is nice because they usually have good accessibility attributes by default.

In general, <summary>, <details>, <aside>, <main>, <nav>, and <dialog> prove to be highly useful for quickly hacking together a personal site without having to write much JavaScript, if at all.
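A minimal skeleton along those lines (just an illustrative sketch):

```html
<body>
  <nav><a href="/">Home</a> <a href="/posts">Posts</a></nav>
  <main>
    <article>
      <h1>Post title</h1>
      <details>
        <summary>Table of contents</summary>
        <!-- links to sections -->
      </details>
      <p>Body text...</p>
    </article>
    <aside>Related links</aside>
  </main>
  <dialog id="about">
    About this site
    <form method="dialog"><button>Close</button></form>
  </dialog>
</body>
```

Note that form method="dialog" closes the dialog with no JS at all; only opening it as a modal needs a line of script.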


I assume <summary> was not explicitly mentioned because it generally only appears as a child of <details>.

It's not intended to be used in the same way as, say, <article>, to my knowledge.

> A <summary> element may only be used as the first child of a <details> element

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/su...


Amusingly, the MDN guidance [0] on <em> vs <i> is exactly backwards from the example they give a paragraph below:

> The <em> element represents stress emphasis of its contents, while the <i> element represents text that is set off from the normal prose, such as... when the text refers to the definition of a word instead of representing its semantic meaning.

And then [1]:

    <p>
      In HTML 5, what was previously called
      <em>block-level</em> content is now called <em>flow</em> content.
    </p>
...which would be an exemplar case of when to use <i>.
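That is, following MDN's own definitions, the sentence would presumably read (my rewrite, not MDN's):

```html
<p>
  In HTML 5, what was previously called
  <i>block-level</i> content is now called <i>flow</i> content.
</p>
```

since the terms are being mentioned as words, not given stress emphasis.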

[0] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/em...

[1] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/em...


I thought <em> was semantic ("emphasis") and <i> is stylistic ("italics")? And last I heard (when <em> came out) you weren't supposed to use <i> anymore because it's bad form to use HTML for styling! Much like <center>.


Yes, that was the mantra some 15–20 years ago. Then HTML 5 came out, in which <center> was indeed obsoleted, but <i>/<b>/<u> weren't and instead got a semantic meaning. (A semantic meaning which, admittedly, is subtle, borderline esoteric, so it's very rarely used as intended.)


Made me read https://www.w3.org/International/questions/qa-b-and-i-tags

I think that at this point I can keep my usual habit of avoiding them completely.


A while ago there was this VR program called JanusVR where rooms were specified using custom HTML tags (https://janusxr.org/docs/build/introtojml/index.html)

Links are represented as portals, and unlike in VRChat you can just walk straight through them like in the Portal games - allowing one to "walk through the web": https://youtu.be/jYQtAcQddRg?t=48

Now, language models can generate HTML, and I feel this may be an opportunity to revive it again. To generate VRChat rooms, you have to learn Unity, which is heavy and has a steep learning curve. But if you can go from a text description to HTML, then you just need a text file!

It's a great alternative to the walled gardens


> I believe we now have virtually a complete set of all UI elements needed to build any modern web application.

This is pretty far from the case. Three examples:

* Dropdown menu. This is a superset of selects, which can only contain options: a dropdown can contain anything in its dropdown, including a nav which contains links displayed as icons, for example.

* Carousels/slideshows.

* Tab areas.

I've got a library of these kinds of elements implemented as CustomElements, but they're pretty geared toward the websites I've worked on in the last year, so I want to spend more time making them extensible before I release them as open source--I don't want people depending on my work until they are better-designed.

That said, HTML by default has gotten a lot more powerful than a lot of web devs know about. In particular, there's a lot of custom data lists and date pickers out there which are less powerful than the built-in HTML datalist and input type='date'
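For reference, both built-ins in a few lines (a plain sketch):

```html
<!-- free-text input with native autocomplete suggestions -->
<input list="browsers" name="browser">
<datalist id="browsers">
  <option value="Firefox">
  <option value="Chrome">
  <option value="Safari">
</datalist>

<!-- native, keyboard- and locale-aware date picker -->
<input type="date" name="appointment" min="2023-01-01" max="2024-12-31">
```

No JS, no library, and the browser handles validation of the date range.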


I think the #1 issue with HTML (and CSS) is that it is portrayed as being trivial, easy, "not a programming language". Someone who has written a few `<td>` "knows HTML". It's not respected. Consequently, people don't invest time in learning to use it well and keeping up with new features. And then they use it poorly, or run into awful examples of it, and come to the wrong conclusions about it.

While the idea of a markup/styling language is pretty simple, HTML+CSS deal heavily with difficult concepts like ontologies, cascading and non-imperative behavior, balancing UX & machine interpretability (accessibility), etc. Add in HTTP and browser APIs. Web dev is distributed computing par excellence, and is extraordinary deep and challenging, yet it's also viewed as among the lowest forms of programming. That's a big mismatch.


HTML is that nerd who's rich now.

Give me HTML,CSS,JS combo any day over some complex js library or platform.


Used to work with that combo only – it's a mess as soon as you need to make reusable components. Svelte strikes a pretty good balance there !


Good luck making anything complex by relying on a bespoke library for DOM updates. In contrast, React became a breath of fresh air, emulating the game development approach to rendering new frames: just literally rebuild the components.


Yes reactive UI is the only thing I can’t live without. React, Svelte, Vue are all fine. Trying to mirror state into and back from DOM manually is absolute disaster.

I really hope this becomes standardized at some point. Would be great to have consensus around.


Oh man, there's a <dialog> element?!

There are probably occasions where that is the whole reason I pulled a JS framework in, since default alert boxes are horrible and anything better is a heap of work to do in a nice looking yet portable way.
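From a quick look, the basic usage seems to be something like this (an untested sketch):

```html
<dialog id="confirm">
  <p>Are you sure?</p>
  <form method="dialog">
    <button value="cancel">Cancel</button>
    <button value="ok">OK</button>
  </form>
</dialog>
<button onclick="document.getElementById('confirm').showModal()">Delete</button>
```

showModal() gives you the backdrop, focus trapping, and Esc-to-close for free, and the method="dialog" form closes the dialog and sets its returnValue.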


> Oh man, there's a <dialog> element?!

It only exists because browsers decided to remove alert/confirm/prompt. Before that it existed in a limbo, plagued by so many issues that Chrome even suggested removing it from the spec.

None of the issues were fixed. But within a year of Chrome's botched attempt to remove alert, it suddenly shipped in all browsers.


It's about structuring text with a consensual function, as every tag carries an implied intention. Creating sites and web apps this way results in a less contradictory foundation for the text and elements exposed in the UI.


I've been meaning to build some utility websites like a simple forum and a simple pastebin and the like for a while just entirely without javascript, kind of like the old days. I want a simpler web with less bloat.


I'm not JavaScript-free, but I've been actively pursuing building useful things with as little JavaScript as possible, if any, and have been impressed with what I can get done with forms and HTML templating.

A few years ago, moving to a new area, I was trying to find a CPA and saw simple brochure sites loading 10+ MB of resources and 10+ JS snippets, and realized how bad things have really gotten. Even with an adblocker, these days a 3G network is hardly enough to browse a massive portion of the web.


Always been fascinated with this type of development. I come from the era when JS was typically frowned upon; I later learned to like it in the ES6 days because I had to use it a lot, but now I mostly use it for Node and not the frontend anyway.


Mostly agree, but datepickers are still a problem. The range of user experiences is so broad, and sometimes so counterintuitive (fuck you in particular, Android), that I still hesitate to suggest a native date input.


Gov.uk recommends against them. This should be enough to discourage most people.


There are some situations where date pickers do make sense, like when booking an appointment in the future - being able to see the day is useful. Being able to overlay the date picker with other information like days where you're free or when appointment slots are available makes them unbeatable.

But, when asking for a date people know, like their date of birth, date pickers just slow people down and confuse them. User research has shown this repeatedly.

https://github.com/alphagov/govuk-design-system-backlog/issu...
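For memorable dates like a date of birth, the GOV.UK guidance is three plain text inputs rather than a picker. A rough sketch of that pattern (not their exact markup; field names and sizes are illustrative):

```html
<!-- Sketch of the "memorable date" pattern: three text inputs,
     numeric keyboards on mobile, no picker at all. -->
<fieldset>
  <legend>Date of birth</legend>
  <label for="dob-day">Day</label>
  <input id="dob-day" name="dob-day" inputmode="numeric" pattern="[0-9]*" size="2">
  <label for="dob-month">Month</label>
  <input id="dob-month" name="dob-month" inputmode="numeric" pattern="[0-9]*" size="2">
  <label for="dob-year">Year</label>
  <input id="dob-year" name="dob-year" inputmode="numeric" pattern="[0-9]*" size="4">
</fieldset>
```

Typing "12 4 1985" is faster than paging a calendar widget back 40 years.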


HTML is one of these technologies that (still) feels pregnant with infinite possibilities. But we can't really project meaningfully what can / should happen to the web platform (and the role of HTML in it) without considering the tortured trajectory it already has traversed, e.g., the major XHTML vs HTML5 (or W3C vs WHATWG) battles, its more recent "annihilation" in the hands of SPA and JS frameworks and the multiple roles it can play. In a sense HTML is too powerful for its own good. So prevailing interests will always try to tame it towards their own needs.

HTML is the conduit and enabler of at least three distinct functionalities that are made available by a decentralized internet comprising clusters of server and client machines:

* transmitting the information to implement hypertext (=the simplest implementation of linked textual documents). This transmission of linked natural language data has changed the world already and is sufficient to support e.g., a planet of interconnected bloggers. HTML purists (like the OP) focus on this angle but this is very limiting and will never unlock all the positive potential of the web.

* transmitting semantically annotated non-text data of all types (numerical, graphical, objects etc). This vision has never really materialized in either its XHTML or SVG guises. Today the only hint that HTML is capable of transmitting such data is the table element. Yet it has enough in-principle semantics to transmit any JSON object.

* transmitting UI elements. Why do we need UI elements at all, and are they intrinsically evil? The core functions of UI elements, both the semantic aspect (hierarchical relations) and the geometric aspect (i.e., CSS and mapping content onto a 2D screen), are essential and entirely benign. They simply help organize more complex patterns. HTML ultras who avoid CSS essentially rely on such default mappings.

Imho an alternative vision to the degenerate web of today that is simply the funnel that gets you to the walled gardens built by adtech (actually humanity grinding machines) must stop being naive, simplistic and Luddite. In all these web technologies of yesterday you can trace battles that were lost, paths not taken.

The web we want is an empowering, democratic, decentralized platform that respects and elevates individuals in the digital realm. This web is not tech-phobic. It is confident, and it develops / adopts any and all technologies that fit its values.

[1] https://en.wikipedia.org/wiki/WHATWG


>> ... semantic HTML is more important now than ever

Except... semantic HTML has never been important. Forget about implementation and adoption; even the abstraction falls apart after about 30 seconds of critique. It's a weird mix of structure-like tags that seem to solely serve the newspaper industry of 25 years ago, half-baked UI elements, and a handful of directives. Non-semantic elements must make up more than 99.9% of the internet today. It seems the author missed the era of XML and XSL transforms, if this is what they wanted.


Every time I decide it's finally time to use HTML5 native inputs like `<input type="date">` instead of JS-datepickers, I stumble on so many problems.

No "placeholder" attribute support, no "show picker on focus" functionality, etc.

You end up including so many JS/CSS hacks and workarounds that you're back to having a huge datepicker.js file.
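For what it's worth, `showPicker()` now exists on input elements in most current browsers, so the "open on focus" part at least can be a one-liner these days (a sketch; it can throw outside a user gesture or in cross-origin iframes, and the missing placeholder still has no native fix):

```html
<input type="date" id="when" name="when">
<script>
  // Open the native picker when the field gains focus, where supported.
  // showPicker() may throw (e.g. no user gesture, cross-origin iframe),
  // so fall back silently to the default click-to-open behavior.
  document.getElementById('when').addEventListener('focus', (e) => {
    try { e.target.showPicker(); } catch (_) { /* default behavior */ }
  });
</script>
```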


One question is: "How easy is HTML to read for humans." Everyone who has spent more than a few months writing software will be familiar with the idea. But for someone who has never seen it - is the concept intuitive? Do bright 8th graders get it if shown it for the first time on a test?

I used to think the answer was "pretty difficult," but I haven't come up with anything better. Maybe there is something to be done with indentation - but that has its own challenges.


Honestly, what we need is a new browser. I think everyone is sort of done with the URL bar, point click GUIs and we're going to see the need to transform things as LLMs become a more dominant query medium. It means the starting point is a search interface, it means much of what we want is automatically rendered as an embedded app in the page and it probably means we stop "browsing" and start consuming information on the web in a slightly different way.


Wow, thanks for the nightmares.


I'm not clear, even after reading the article twice, on how HTML helps solve interop issues with sites like Facebook, Twitter, or Reddit. I've scraped these sites before; they render in HTML, but they either obfuscate their markup or detect that you're scraping and block you. Actually, HTML was worse than the Reddit feature that used to exist, where you could add `.json` to basically any Reddit URL and get back the page in JSON format. So easy!

So, did I miss something in the article?


Not quite getting this article.

When you semantically describe your HTML, indeed the intent and meaning of your data becomes easily machine-readable.

That's not a solution, it's the entire problem. In today's ecosystem it means somebody takes your shit and runs with it. That's why the walled gardens are getting ever higher walls. "Open data" is an existential threat if you're the one paying for the creation/hosting of that data.


Very surprising to me was that there is a semantic difference between <i> and <em> (the same is also the case with <b> and <strong>); apparently screen readers for, e.g., blind people are also aware of this semantic difference.

What was also surprising is that there is a Slider as well as Color picker element.

Fair enough i am not much of a web person myself, but i know that a lot of webpages have their own custom JavaScript implementation of those elements (if needed).


> Very surprising to me was that there is a semantic difference between <i> and <em> (the same is also the case with <b> and <strong>); apparently screen readers for, e.g., blind people are also aware of this semantic difference.

It is in the names:

<em> = emphasized, as in "this part of the text should be emphasized in whatever way the styling dictates"

<i> = italic, just a way to style directly

<strong> = strongly emphasized

<b> = bold, just a way to style directly

<i> and <b> do not make a statement about semantics; they are purely presentational, and not much used in modern valid HTML - or at least they should not be, if you have CSS available. <em> and <strong> make statements about the importance of a part of the text, in whatever way you want to style that. It just happens that the default styling for those is italic and bold.
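A small example of the distinction - the default rendering looks identical, but the meaning differs:

```html
<!-- Semantic: stress emphasis and importance -->
<p>You <em>must</em> back up before upgrading.
   <strong>Data loss is permanent.</strong></p>

<!-- Presentational: conventionally italic/bold with no added emphasis,
     e.g. a ship name or a keyboard key -->
<p>The <i>Nautilus</i> set sail. Press <b>Ctrl</b> to multi-select.</p>
```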


I have been using JS widgets instead of datetime-local inputs because of patchy desktop browser support years back. In Firefox on Mac, I noticed the time widget with prefilled times in a datalist doesn't work as intended. Should we still rely on JS widgets while there are browser kinks like this?

I also feel multivalued tag inputs require JS widgets. Is there a pure HTML version?


<datalist> is so close to being perfect but I know the people I work for would want it to show the full list always and just float the selected value or closest spelled value to the top. Right now if you select a value on chrome and then click the list again it just shows that one value until you delete your entry.
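For reference, the element itself is just this; the dropdown and filtering behavior described above is entirely up to the browser and isn't configurable from the markup:

```html
<!-- The input's list attribute links it to the datalist by id.
     How suggestions are filtered and redisplayed is browser-defined. -->
<label for="lang">Favorite language</label>
<input list="langs" id="lang" name="lang">
<datalist id="langs">
  <option value="HTML">
  <option value="CSS">
  <option value="JavaScript">
</datalist>
```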


Yeah do we need that disclaimer?

If I (non native English speaker) write a blog post it will be shit English and won’t read “smoothly”. Then ChatGPT makes it nice and smooth. I check if my intentions are left unaltered and then post. For professional stuff I used to ask my American colleague, now I don’t have to bother her anymore.


I don't see it as confessing their sins, more like "FYI, I used X tool to help me with Y" like you'd see "Page generated by Z" in a footer except in a header since it's related to the content itself. Unless you're writing an assignment for an English class I don't think anyone would feel lied to if they found out you had help from an AI translator.


I think SEO is dead, and the semantic web will soon take over as search transitions to LLMs. If your data is better digested with minimal work by the LLM trainers, does your content perform better?

And maybe search isn’t the killer feature of the semantic web, maybe it will be agents?


Now I want to listen to King Gizzard.


I went to one of their concerts during the “Infest the Rats’ Nest” tour. It was… wilder than expected. (In a good way!)


Eeyup! Whooo! reverb noise


Also check out Gizzhead.org - my site for tracking their live sets!


Nice, I love it!

That said, I live in Melbourne, Australia, so yeah they are considered a local group - information travels fast here to support them. That said, it feels like they perform more in the US than here!


Author and posters both are equivocating: the term "semantic HTML" is the OP author's term and is not clearly defined. In contrast the "semantic web" IS well-defined.

tl;dr - this article and discussion is a confused waste of time.


Where's the modern equivalent of xforms?

It is very telling that you still can't build more sophisticated forms declaratively in HTML/CSS.

After decades everyone seems to be cranking out their own custom made forms with various levels of JavaScript added.


> <iframe>

> Just kidding.

Worth the price of admission!


The very first sentence:

"With the advent of large language model-based artificial intelligence, semantic HTML is more important now than ever"

No it's not. LLMs basically invalidate the entire concept of the semantic web, which never worked anyway.


Kinda beautiful. Are there more resources that focus solely on "modern" HTML + CSS?


The main thing I want added to the html spec is a spreadsheet element. It could be an enhancement to the table element. I want it to support sorting, filtering, pagination out of the box. And be responsive to boot.


When browsers finally have full, first-class WASM integration (and the ecosystem has developed), we'll be looking back on this "pretend-that-HTML-is-a-UI-framework" thing with genuine horror.


And the old timers will look upon the WASM mangled unreadable mess that webpages have become with equal horror.


Web pages can still be in HTML. It's a bad choice for UI apps.

(With the current state of the web ecosystem it's hard to tell the difference between what is a page and what is an app, of course. But imagine we had had a real choice at the outset of the web.)


Most websites I enter don't work well on mobile, but this one does!


I would really really like to see a full-blown retro-looking social network (say, a Twitter/X clone), that works solely with pure HTML. It would be really amazing.


I've been using `dialog` and `details` of late, and couldn't be happier. At the same time, `section`, `article`, `header` and `footer` confuse me a lot. I've stopped using them.
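For anyone who hasn't tried them, both come surprisingly close to zero JS - a minimal sketch:

```html
<!-- Collapsible section: no JS at all -->
<details>
  <summary>Show advanced options</summary>
  <p>Options go here.</p>
</details>

<!-- Modal: one call to open; a form with method="dialog" closes it -->
<dialog id="confirm">
  <form method="dialog">
    <p>Are you sure?</p>
    <button value="yes">Yes</button>
    <button value="no">No</button>
  </form>
</dialog>
<button onclick="document.getElementById('confirm').showModal()">Delete</button>
```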


Section and article really don't add any value, but header and footer elements are used as landmarks by assistive technologies like screen readers


> Section and article really don't add any value

They are nice hints to assistive technologies and reader mode.


Yep that's totally fair!

I'm not 100% sure if assistive technologies use section and article for any end user context like "jump to content", but they at least can be used to get more context than just another div

Reader mode is one I don't think about enough. I could see that purposely hiding content outside of the article element.


Yeah, Reader mode is weird. I know it prioritises semantic tags like article, but since the behaviours are not standardised, I never know what will and will not work.


If proper semantic HTML helps companies like ChatGPT, I'll make sure to try as hard as humanly possible to fuck with it with horrible HTML, then.


What's with the spelling error in your disclaimer, grammer [sic]? Is that pedantic, or is it a joke? Or did you mean "'grammer"?


I already knew this was going to be some minimal and ugly website that consciously rejects all modern UI conventions

Congratulations, it passed pagerank and was readable.


The <i> vs. <em> thing, how did I never realize that before? I guess I'd never thought about them having a semantic meaning.


Same. This is mind-blowing! And just goes to show how different “markup” is meant to be from “formatting.” Not that you’d know it when so much content is made by people using a WYSIWYG editor that basically tells them “you are doing formatting.”


They don't. Like, that linked document is a cool idea but it's utterly inaccurate as a description of how the tags are used by actually existing website or handled by actually existing user agents (yes, including screenreaders).


Perhaps the tags have not been de facto treated semantically by existing practices. There’s a great deal wrong with the tag soup pervading the web. Current convention doesn’t mean they can’t have semantics.


Potentially they could have semantics in the future, but right now they don't. If you make a user agent that relies on them having those semantics, that user agent will misinterpret a lot of web pages.


I can create CSS to style the two tags accordingly, and in within my own work I can make them mean different things to the tools I use to create and manage html.


Sure. But you could always do that for any tags, regardless of - even in opposition to - their official semantics.


There are only two hard problems in programming: cache invalidation and naming. When programmers give something a name inconsistent with its behavior, it's going to cause a problem.


You misspelled grammar. I only say this because you said you used AI for proofing and then you edited your doc.


So people want to make it easier for LLMs to ingest their work? I'm all for HTML but this article is confusing to me.


Two questions: 1. How easy styling of these elements is? 2. What about browser support. Do all major browsers support them?


It's very easy and they are widely supported. Example: https://radogado.github.io/n-modal/


For any given feature, you can check https://caniuse.com/ to see the browser compatibility


I only recently stumbled across <details> and I have to say, it's cool to have that available in vanilla HTML.


this article's making me think. In this AI-driven landscape, the focus on semantic structure totally makes sense. But, is it really the remedy for the social media interoperability problem? Are we underestimating the appeal of decentralized platforms? And, do people genuinely not care about these alternatives?


If you don't want rounded corners then what's this all been about ? What am I working toward ?


The argument of the piece completely eludes me. How would HTML disenfranchise walled data gardens?


So down for this. Feels intuitive. JS is dead. Superconductors at room temp. WE'RE BACK


Recently replaced 13 lines of inline SVG by a single <progress> element.
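For comparison, the whole replacement can be this small (a sketch; ids and values are illustrative):

```html
<!-- Determinate bar via value/max; omitting value entirely
     gives an indeterminate (activity) bar instead -->
<label for="upload">Upload progress:</label>
<progress id="upload" value="70" max="100">70%</progress>
```

The text content ("70%") only shows in browsers that don't support the element.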


everything good in development makes it eventually(!) into the "Elementals" HTML+JS+CSS.

BUT still PUG > HTML any day of the week.


What's wrong with <iframe>? I find it quite useful.


I see a King Gizzard reference in the wild, I upvote.

Gila!


literally the worst reason to use semantic html. which is sad because there are so many good ones


This page is missing verbose RSS link.


the AI disclaimer put me off from reading it. I want to hear your words, not ChatGPT


Rumble about nothing


Upvote for King Gizzard


Time for HTML6!?


While the term "HTML 5" hasn't been used by WHATWG HTML snapshots, and wasn't endorsed by W3C even for the single 2021 snapshot it blessed as a Recommendation, the post below [1] argues that the 2023 HTML review draft and newer should in fact get a new major version number, since they removed/de-emphasized and invalidated Ian Hickson's historic section elements/outlining, a major HTML 5-era innovation (which, in turn, was the problem, since browsers and screen readers ignored it).

[1]: https://sgmljs.net/blog/blog2303.html


The real solution is Web Components.


As always, people almost never RTFM. When I see webdevs implement a hamburger menu with pure CSS I almost shed a tear.

For the love of god, study your tools! (and not from SEO spam articles)



