The right tag for the job: why you should use semantic HTML (localghost.dev)
184 points by tmfi on June 6, 2021 | 98 comments



For all those saying that ARIA trumps semantic elements, consider that for basic interactions semantic elements are keyboard accessible out of the box.

For example, a <button> will have default tabindex=0 and respond to spacebar key presses, but you'd have to add that yourself if you put role=button on a <div>.

In short, if there is a semantic element that matches your need, use it.
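
To make the comparison concrete, here is a minimal sketch of the extra wiring a div with role=button needs before it behaves roughly like a native <button>. The helper name makeButtonLike is made up for illustration, and the key handling is simplified (a real button activates on Space at keyup, not keydown).

```javascript
// Sketch only: makeButtonLike is a hypothetical helper, not a standard API.
// It adds by hand what a native <button> provides for free.
function makeButtonLike(el, onActivate) {
  el.setAttribute("role", "button");   // expose the element as a button
  el.setAttribute("tabindex", "0");    // put it in the keyboard tab order
  el.addEventListener("click", onActivate);
  el.addEventListener("keydown", (event) => {
    // A plain div ignores Enter and Space, so handle both explicitly.
    if (event.key === "Enter" || event.key === " ") {
      event.preventDefault();          // keep Space from scrolling the page
      onActivate(event);
    }
  });
  return el;
}
```

In a browser this would be called as makeButtonLike(document.querySelector("#save"), save), where #save and save are placeholders. With <button> none of it is needed.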


> In short, if there is a semantic element that matches your need, use it.

Yes! This is known as The First Rule of ARIA.


If browsers made the semantic elements actually look good out of the box, a lot of developers would use them by default.


If people could agree on what "look good" means, I don't think it'd be impossible to get the major browsers to update default styles. But every site seems to have a different idea of the look they want.

I'd be curious to see a somewhat-objective look at what's actually wrong with default browser styles, separating out well-established usability and design considerations from personal preferences and branding preferences. I wouldn't be at all surprised if there are near-universal things wrong with the default styles, and those might be possible to change if they can be separated out.


Based on my experience as a web developer working with many designers over some 20 years of design trends, here’s the problem:

Default styles aren’t pretty.

That’s it. They’re superior in every other way. You always know what to click, you always know what an element will do, how it behaves, the UX is consistent with your OS, etc. Default elements are fantastic.


> the UX is consistent with your OS

Buttons in Chrome on macOS look completely different than in native apps. Buttons and the context menu in Chrome's native <video> controls look different yet again, following Material Design. Of the things one could argue are good about native browser styles, "consistency with the OS" is definitely not one of them.


Not datepickers. The datepicker element is garbage.


Too true


Who cares about looking good, when we’re talking about working good?

Of course, it’s worth putting the effort in to both looking good and working good, but if we’re going to pick one, we should go for the second.


I think that is a false dichotomy.


If we're a business then looking good and mostly working ok probably sells better than working good and mostly looking ok.


I feel like this is only true in cases where either your product is your appearance or it's highly unlikely the thing you're selling can be returned.


It would be equal effort to restyle something done with ARIA compared to a semantic element, yes?


Depends on the element. The select and input elements are notoriously difficult to style, which is why people often remake them with divs and JS.

Combo boxes are a fucking nightmare to style, as are checkboxes and radios. Buttons aren't as bad, but you still spend a lot of time fighting against browser defaults which differ across browsers.


Isn't that an extremely moving target? What looks good is very subject to changes of fashion.

Your comment below about being difficult to style is more to the point.


A div looks like nothing by default. It’s probably about the same amount of effort to make a button look good.


You can reset your css


For some elements, yes. Many others have certain attributes that are completely unstylable, a lot of browser-specific attributes and selectors (not just prefixed attributes but also whole pseudoclasses), or both.


Unfortunately, Firefox recently actually stopped making semantic elements look native (=good).


The one exception for me is forms. In that case I would rather use a div or section tag with role=“form”. I have seen less experienced developers do weird things with form tags and submit event, and form tags come with unique behaviors and requirements.


When a form requires JS to be submitted, there’s a high probability that I abandon that website instead. My expectations of working plain HTML are never higher than with regard to forms. By all means, progressively enhance, but don’t be tempted by the hubris of claiming to know better than the user-agent how the user-agent should work. It’s not just rude, it’s a slippery slope towards systematic incompatibility, weirder bugs than the ones you were afraid of, and security holes.


> When a form requires JS to be submitted, there’s a high probability that I abandon that website instead.

You say that, but I doubt that in practice. If the form is your bank login or bill pay or anything else you are locked into, I really doubt you will give up because of some personal opinion on JavaScript and instead spend the next 30 minutes calling somebody to complete a similar action over the phone.

The goal is to achieve accessibility as equivalently as possible, with as little enhancement as possible, and yet often not force a page change because of form submission.


Calling someone a liar to their face without checking their body of work first just leaves a lot of egg on yours.


Does that still allow you to press "Enter" in an input field to submit the form, or do you have to build your own kludge to replicate that? If so, that sounds like that would invite only more weird things...


    onkeydown=function(event){if(event.key==="Enter"){mySubmit();}}
You are clearly overthinking the challenge.


You are clearly underestimating the compatibility & accessibility challenges of reinventing the form submission code of a web browser from first principles.

Believing you can do so and not fuck it up is sheer hubris, not to mention, wrong. Doing so because you don't like the browser edge cases, is how websites end up suffering from even more edge cases, and harks back to the bad old days of "Sorry your browser is not supported".

Financial institutions are amongst the worst offenders - ironic since pulling this kind of naïve stunt vastly increases the attack surface. I've been in a position to redesign financial institution systems and rewrite their technology policies, and when doing so this kind of off-standard NIH crap is something I seek to stamp out.

Want to submit a form? Use a damn form, with a plain old submit button, and rely on the standard behaviour. Want to put some UI polish around it? Progressively enhance the standard behaviour.

What not to do: imagine you can write better UI<>protocol interaction code than that already in Chromium and WebKit, or hope to infer and reimplement the behaviours of any more specialised user-agents.
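
As a rough illustration of "progressively enhance the standard behaviour", the sketch below assumes an ordinary <form> with a submit button that already works without JavaScript. The names enhanceForm and sendAsync are hypothetical; sendAsync stands for whatever async request the page wants to try first.

```javascript
// Sketch only: enhanceForm and sendAsync are hypothetical names.
// Without this script the browser submits the form natively, Enter key
// and submit button included. The script only layers an async attempt on top.
function enhanceForm(form, sendAsync) {
  form.addEventListener("submit", (event) => {
    event.preventDefault(); // take over only once JS is actually running
    // The executor runs synchronously, so a throw in sendAsync becomes a rejection.
    new Promise((resolve) => resolve(sendAsync(form)))
      // form.submit() performs a native submission without re-firing "submit".
      .catch(() => form.submit());
  });
}
```

The design choice here is that the native form stays the source of truth: if the script never loads, or the async path fails, the user still gets a normal submission.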


> What not to do

What you said. Most of what you said is common anti-patterns. The reason is web technologies are built around a small set of standards that are designed for extension, such as the DOM and WCAG. Fearing those standards for a small set of static principles is common but that doesn’t make it smart. That’s why this technology space is so hostile to originality. You mentioned NIH but the more common problem is: https://en.m.wikipedia.org/wiki/Invented_here

Stop being so afraid and hostile. Embrace how these technologies work and more easily achieve accessibility with less effort. If you are waiting for some tool, NPM package, or framework to do it for you it’s not going to happen.


onkeydown is not a “technology”.

Abusing standards with shoddy inner-platform reimplementations doesn’t make someone an innovator, it makes them as much a fool as the cargo-cult of the NPM ecosystem.


Strange, I always thought the greatest fools are the people who fear the titles they wear.


That is merely a corollary of the Peter Principle.


Your response is a bit harsh.


So you think it's safer to rely on every engineer adding that (and the other facilities provided by web browsers by default when using native elements) to every input field in a form, than relying on them not doing weird things with that one form tag?


If you train your teammates to do their jobs you won’t have anything to be afraid of.


But then why not train them to use <form> correctly? Since you mentioned the reason for not using it was that

> I have seen less experienced developers do weird things with form tags and submit event


That doesn't sound worse than not naming the inputs or coding the buttons for keyboard interaction, things inexperienced developers will do if they're not educated in the basics of semantic HTML (even those who do use inputs too often neglect to associate a <label> where they're called for).


I've never heard of this. Could you elaborate on why you'd not use an actual form element?


Forms cannot be nested. That is an html violation. That is because all submit events fire on all nodes contained by a form on submission even if the page does not go anywhere. Forms also require an action attribute that contains a submission address, which requests a new page on submission.


Not the same thing as nesting forms, but when faced with similar challenges I found the <input form="form-id" /> attribute[0] is a good solution.

The trade-off is that you have to specify it almost every single time, but it lets you group form elements that for UX purposes belong together, but for other reasons need different forms.

[0] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/in...


What circumstances call for nesting form elements?


If a dev finds a form to be nested, it's usually because something was tacked on as an afterthought. I'm speaking from experience.

I had a project that had applications associated with it. Forms are the natural way to express this. There were some disclosures associated with the application, and there was a modal associated with emailing the disclosures. I needed to add some front end validations to this modal.

No matter what I tried, I could not get these validations to trigger correctly. I spent an entire work day on trying to get a library to work with this modal. It wouldn't work, because changing the modal's div to a form tag would cause errors or unexpected behavior due to the nested form tags.

TL;DR: Nothing requires it. Devs might not understand the implications.


None. There is no good reason to do this. It doesn't stop people from doing it.


working with web forms https://docs.microsoft.com/en-us/aspnet/web-forms/ (Oh the nightmares!)


I think HTML is considered too challenging or 'low level' in modern front end development. Components are created with Material UI or Bootstrap. No one seems to be doing it by hand.

I like coding html (and css) by hand. I find it easier to do that than to deal with frameworks and tooling. So I read articles like this and often find something new (TIL about fieldset). But I've yet to see anyone else do this professionally.

So I don't think most people will ever care about this stuff, in the face of modern "best practices". It's just going to be nested div soup, from here on out.


I just went through an ADA compliance audit and we got dinged pretty hard for semantic markup inconsistency. It's certainly become more common in production apps, much like making things SEO optimized before. Keep doing real HTML, it's still important!


I write html by hand as well but I still prefer to use div as a general block tag. It's just because I want it to be difficult for bots to crawl XD


Bootstrap is just CSS, though? You still have to write your HTML (or JSX) by hand

I haven't seen much to back up your view of things


Misspoke slightly. I'm thinking back to my experience of a Bootstrap-in-React package. The HTML it produced was meaningless.


I don't need to be convinced of the benefits of semantic HTML anymore. It's been made clear in numerous similar articles. What I would like to see are examples that are not trivial news/blog type apps.


The problem is that the semantics of HTML are the semantics of trivial news/blogs. Didn't they say, when HTML5 came out, that many of the tag names and concepts were distilled by looking at the frequency of class names used across the internet? And isn't most of the open, indexable internet blogs and forums?

The semantics of the web and the semantics of an application are almost completely different, so those of us trying to ship actual applications over the web are torn between trying to find the closest possible HTML element and customizing it, resulting in non-standard behavior, or just rolling our own div with our own needs.


What sort of apps do you think are difficult to get right, and would make good examples?


I think something like a Google Maps-type app, where the results in the sidebar can be a fairly complex set of items, would be great to see. Or a typical SaaS app dashboard with cards, a sidebar, charts etc.


Oddly enough, I think semantic HTML might be more important for React developers than anyone else. React developers are very vulnerable to creating copious amounts of abstract components due to the freedom composability gives you. Over time, it’s possible different people literally speak different languages (even in the same codebase) because of how they see the structure of tags.

Thinking in Semantic HTML might help normalize the variances.


I've run into some bugs because of composition, something about <a> tags nesting breaking everything, but only in release mode (minified etc.). Not fun to hunt down. I don't remember the details, but I've also seen warnings about other semantic stuff that's enforced (or at least complained about) by the framework.

Technically you could make everything a div or a text element, but most frameworks you use come with certain expectations: the closer you are to using semantic tags, the less likely you'll be to violate those external expectations.

If you're not using any frameworks, I don't see why you wouldn't use semantic tags by default. Sure, you can override the onclick of an anchor with no href and call preventDefault, but why not just use a button instead?


The better control I have over styling the more empowered I feel to use semantic HTML. It was frustrating, years ago, to jump on board wide eyed only to realize that styling is a minefield. I believe this is much improved today.


The demonstration is neat and the advice is sound.

One thing that rarely gets mentioned however is that well structured HTML, including templates and JSX, is easier to develop with.

Semantic tags at least give you common consistency for free. Adhering to some reasonable degree of consistency is the bare minimum we should aim for in respect to code readability.


I wish there was a way to tell the browser to add more default styles to my document. For example we have reader mode but this completely munges your document and kills all custom styles and scripts. It would be cool if you could tell the browser to apply a nice pretty style (customizable by the user of course) but then possibly opt out small sections of the page (like an interactive example) where you could add custom scripts and styles.

Obviously this works best with simple "document" websites but it always seems so weird telling the browser how to style the document when I know very little about the user or the device. It is getting "better" in a way with media queries for information about the device and user preference (like dark/light colour scheme) but it still is relying on each site to interpret those settings themselves, and progress is slow because everything needs to be standardized.

This also opens the door to a "clean" mode with no default borders, margins, padding, sizes, styles which would help when making interfaces where you want to control every pixel and don't want to be surprised in differences in default browser style sheets.
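
For reference, this is roughly what "each site interpreting those settings itself" looks like today for the dark/light preference. It is a sketch only: preferredScheme is a made-up helper, and the matchMedia function is passed in so the logic can be exercised outside a browser.

```javascript
// Sketch only: preferredScheme is a hypothetical helper.
// matchMedia is expected to behave like window.matchMedia: it takes a
// media query string and returns an object with a boolean `matches`.
function preferredScheme(matchMedia) {
  return matchMedia("(prefers-color-scheme: dark)").matches ? "dark" : "light";
}

// In a browser, something like:
//   const scheme = preferredScheme((query) => window.matchMedia(query));
//   document.documentElement.dataset.theme = scheme;
```

Every site has to repeat this (or the equivalent CSS media query) and then define its own dark and light styles, which is the per-site work the comment above would like the browser to take over.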


> I wish there was a way to tell the browser to add more default styles to my document.

Non-Chrome browsers still have modifiable user stylesheets (though I think only Safari for Mac still has a GUI option to do it) but more realistically, you can use a browser extension like Stylus to apply your own styles, even defining them on a per-domain basis.

Enabling the user to change the appearance of your site is much easier if you heed the advice of this article and use the appropriate semantic HTML.

> "clean" mode with no default borders, margins, padding, sizes, styles

Web developers have been starting projects with "reset" stylesheets for decades.


> Non-Chrome browsers still have modifiable user stylesheets

This doesn't solve the problem I describe because it breaks websites (unless you are very minimal with your usage). I am talking about an opt-in feature where you concede control to the browser so that it can apply "aggressive" styles such as it would for reader mode. For example changing the current user-agent style sheet to a "dark theme" is completely infeasible to ship as it would break way too many sites.

Furthermore just being possible isn't enough. It would also need to provide nice defaults to make this a feasible proposition at all. I can't have my blog being barely readable for 99% of viewers.

> even defining them on a per-domain basis.

I also don't want to fix specific websites. I'm talking about something that works by default. (Of course site-specific tweaks are nice, but that is nearly orthogonal to what I am looking for here.)

> Web developers have been starting projects with "reset" stylesheets for decades.

Yes, and this is an unfortunate hack. It would be nice to remove the need for these and the maintenance they require (even if the cost of both is small).


>> Web developers have been starting projects with "reset" stylesheets for decades.

> Yes, and this is an unfortunate hack. It would be nice to remove the need for these and the maintenance they require (even if the cost of both is small).

There are too many different opinions about what should or shouldn't be in base styles.


That may be true, but I think that is what makes my idea work well.

- For "content" sections the discussion is entirely removed. Each browser can do whatever it wants so there is no need to argue in the standard.

- For "clean" sections nothing! Every element has margin:0 padding:0 display:block size:1em color:inherit... This leaves no surprises and developers can do whatever they want.

Of course the third section "legacy" is what we have today and largely shouldn't be changed from the mostly-uniform thing that browsers do today.


Instead of using special tags to convey meaning, couldn't we just use special attributes? say, a div tag with an 'aria-role' attribute set to menu would indicate a menu, a div tag with an 'aria-role' attribute set to 'checkbox' would indicate a checkbox etc.

That would make creation of visual elements more versatile, keep the number of hacks to minimum (like the hidden input trick for checkboxes), and also make ARIA possible.


Adding an ARIA role only adds the role, you'd then have to add more to name the element (in place of the missing <label>), to put the element in the tab order, and JavaScript event listeners to handle keyboard input correctly. Doing all that is more complex than using a style reset on an input element and even if you do it all correctly, it's less robust.
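
A minimal sketch of what that list amounts to for a div standing in for a checkbox. The helper makeCheckboxLike is hypothetical, and it still omits things a real <input type="checkbox"> handles (form submission, disabled state, association with a visible <label>).

```javascript
// Sketch only: makeCheckboxLike is a hypothetical helper.
function makeCheckboxLike(el, label) {
  el.setAttribute("role", "checkbox");      // the role alone conveys nothing else
  el.setAttribute("aria-checked", "false"); // state must be tracked by hand
  el.setAttribute("aria-label", label);     // name, in place of the missing <label>
  el.setAttribute("tabindex", "0");         // put it in the tab order
  const toggle = () => {
    const next = el.getAttribute("aria-checked") !== "true";
    el.setAttribute("aria-checked", String(next));
  };
  el.addEventListener("click", toggle);
  el.addEventListener("keydown", (event) => {
    if (event.key === " ") {                // checkboxes toggle on Space
      event.preventDefault();               // keep Space from scrolling the page
      toggle();
    }
  });
  return el;
}
```

The native equivalent is a single <input type="checkbox"> with a <label>, plus whatever CSS is needed to restyle it.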


Here's Google's test page for "canvas rendering".[1]

Try that with a screen reader.

[1] https://docs.google.com/document/d/1N1XaAI4ZlCUHNWJBXJUBFjxS...


Wow, that's really interesting, thanks for the link. It looks great, but it's kind of a jolt to right-click on a selection and not see the normal browser context menu...


And right-clicking without selecting anything not even popping anything up at all. And hovering over a link not turning the pointer into the link icon. And not being able to <Tab> through links.


I just opened this on my iPad and it feels pretty bad to use.


Most of the heavy lifting in accessibility is in the ARIA attributes and being thoughtful about actually including those. The problem faced by semantic tags is that not all websites use them and that the information that they carry is of very limited use for a screen reader. A section-element without any other information is just a div.

Unfortunately adding semantic information to websites has gone from a promising avenue for building websites usable from many different kinds to part of the SEO cat-and-mouse game played by Google et al.

Unless you're doing SEO, everything semantic is just a distraction. That said do take accessibility seriously, it will never be justified by profits but doing the right thing is a feeling no amount of money can buy.


> Most of the heavy lifting in accessibility is in the ARIA attributes and being thoughtful about actually including those

This isn’t true. Semantic HTML is also very important, often more so.

> Unless you're doing SEO, everything semantic is just a distraction.

This is also not true. Besides actively harming accessibility (or making it substantially more difficult to improve), non-semantic HTML also breaks Reader Mode. And it can also add cognitive burden to maintenance and future feature work.


Since reader mode is often "no annoying ads" mode, I'm wondering when sites are going to start breaking it on purpose.


From my understanding, this is part of why reader mode is not standardized or well documented. Its heuristics are definitely revised, likely in part to address this.


Reader mode probably has a pretty low impact on ads. The full page is usually going to load before you activate reader mode, so the only ads it's actually hiding from you are ones below the fold, which are much lower value to begin with.


Not if you use Safari's feature to always load pages on a domain in Reader mode or add a Firefox extension that does the same.

Gizmodo articles should load in Reader mode but don't, I assume they're doing that intentionally.


Yes, but at some point advertisers have to realize that people who use reader mode don't actually see the ads. The ads might load, but the user is just staring at the reader mode button waiting for it to turn on.


> Unless you're doing SEO, everything semantic is just a distraction

Accessibility and SEO are (usually) desirable system effects that rely on meaningful semantic code-inputs to the system.

But that doesn't mean that only software system effects matter, and everything else is a distraction.

Consider that when you're programming, your various identifiers -- the names of variables, functions, classes, types that you choose -- are largely without any semantics to your system. To the computer, they're simply keys. And yet we sometimes say that one of the two hard/important problems in computer science (besides cache invalidation) is naming things. Why?

Because the actual system isn't just the software & machine. The system includes the developers grokking and shaping the software. And names and their relationship with domain concepts matter to people and help them manage their work.

I've found that the first benefit to semantic markup -- not necessarily the most important, but the first -- is helping me better conceptualize the domain and my code. It functions as an orientation anchor when I'm working with components that look different between their source and rendered as part of a sprawling document in the browser. And it helps me get reoriented faster when I've been away from it for days/weeks/months.

The heart of "semantic" anything is the idea that code is for people too. Maybe not all people, but definitely at least anyone who's writing it.


> "Most of the heavy lifting in accessibility is in the ARIA attributes"

Best practice is to use HTML semantic tags and only add ARIA attributes if it is relevant. For example, HTML5 has a <nav> tag, therefore it is not necessary to add the ARIA attribute "role=navigation". Even the Accessibility section on MDN (Mozilla Developer Network) states: "developers should prefer using the correct semantic HTML element over using ARIA if such an element exists" (Source: https://developer.mozilla.org/en-US/docs/Web/Accessibility/A...)
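
The <nav> case generalizes: many elements already carry an implicit ARIA role, so repeating it adds nothing. A small sketch, with a deliberately partial mapping and a made-up helper name (isRedundantRole), of how a lint-style check might flag that:

```javascript
// Sketch only: an illustrative subset of implicit roles, not the full mapping.
const IMPLICIT_ROLES = {
  nav: "navigation",
  main: "main",
  button: "button",
  aside: "complementary",
  ul: "list",
};

// Hypothetical helper: true when the explicit role just repeats
// what the element already conveys on its own.
function isRedundantRole(tagName, role) {
  return IMPLICIT_ROLES[tagName.toLowerCase()] === role;
}

// isRedundantRole("nav", "navigation") is true: <nav role="navigation"> is redundant.
// isRedundantRole("div", "navigation") is false: the div needs the explicit role.
```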

> "The problem faced by semantic tags is...the information that they carry is of very limited use for a screen reader."

That's precisely why those semantic HTML tags are important - so that screen readers navigate the page more easily (among other reasons).

> "...semantic is just a distraction"

Accessibility is not difficult if you follow HTML5 semantic markup. Compared to CSS, HTML5 semantic tags are easy. If you use HTML5 semantic tags, accessibility comes built-in - you get it for free.

Where accessibility fails is when you're using a JavaScript framework that generates non-semantic markup. Or you're using a CSS framework and your HTML markup is littered with endless <div>s rather than HTML semantic tags.

I recently posted this video playlist of quick accessibility tips for websites (each tip is just 1 minute). Many websites don't follow these best practices. However, I think people might be surprised by how simple and low-effort it is to incorporate these tips into any website.

Quick accessibility tests: https://www.youtube.com/playlist?list=PLTqm2yVMMUKWTr9XWdW5h...


> That said do take accessibility seriously, it will never be justified by profits but doing the right thing is a feeling no amount of money can buy.

Why can accessibility not be justified by profits? If you sell something, lack of accessibility will cause you to lose sales. How much you will lose depends on the demographics of the potential customers.


Cost of ensuring a site is accessible is greater than the profit from the sales which you get in addition?


Yes but what is the cost of accessibility? How much more expensive is it to use a <button> compared to a <div> with attached event handlers?


The cost of doing accessibility is the cost of doing accessibility testing. If you're not actually testing your accessibility work, using a button instead of a div is pissing in the wind.


Surely a button is more accessible than a div regardless?


Maybe, maybe not. ("Surely" using the browser's CSS animations would be more efficient than reimplementing them yourself in JavaScript - but if actually you benchmark it, the opposite is true). If you're not actually testing it, how could you possibly know?


If only it were so simple.


For what kind of site? If you just use semantic HTML as this article recommends, the cost is quite low for development, and your QA people should already know the basics of accessibility testing.


An article like this comes up every few months. They tend to give an intro to the technical how-to but miss the step of convincing the readers that supporting screen readers is desirable.

Ok sure, it'd be _nice_ if blind people could navigate my website. But... how nice? How many screenreaders can I reasonably expect as a % of traffic? What difference will it make to their quality of life? What decision process should I use to decide if _my_ website needs to support this? Does my personal blog need to support a screen reader? What about this bank login page? This motivating information is a crucial step in actually getting people to care enough to do something different.

A contrasting example is the discourse on "bloated" webpages being slow and hard to use on shitty old phones. The upshot is that if your website needs to download and run 1MB of JavaScript then someone viewing it on a 5 year old low cost smartphone on a crappy 3G is going to have a bad time. The statistics around how many people have bad internet motivated me to try and reduce download sizes.

It would also be nice to have a guide on convincing your boss to give you time to support accessibility.

The technical details are good (and this article is quite readable) but probably aren't the key bottleneck in more people writing accessible markup.

<edited for less unnecessary snark>


One of the fallacies around accessibility is 'just for blind people' — there are all sorts of physical and mental disabilities humans suffer from, at various levels of severity (sometimes temporarily, sometimes permanently). It's not just screenreaders—assistive technologies range from grandpa zooming into 8X screensize, to a paraplegic teen using a mix of keyboard nav and trackball.

Accessible design and development improves the lives of more people than you'd expect, and it goes a long way toward guaranteeing better usability for everyone. Gov.uk is a great example of this.

Anyway, if you're working on something with any type of scale, you should be hitting a11y standards as a matter of professional standards. Activist lawsuits in the US are doubling/tripling year over year, and you can expect EU legislation pretty soon too.

If you're just dicking around on a website with minimal traffic, it's a matter of personal conscience and intended audience.


Yes badgering developers to do such onerous things like… use a <button> element for buttons.

Rather than ask "why should I support screen readers" maybe ask "why would I actively break them and make my website unusable?"


Writing HTML/CSS is one of the hardest and most frustrating things I've done as a programmer. It sucks. I would actively break my website to get the job done without losing my mind.

I would need to be very _motivated_ to change my behaviour.

I amended my comment to be less snarky; I was coming on a bit too strong.


"Just" learn the tools you're using.


Excellent article thanks for sharing


Protip:

Semantic HTML != Semantic Web

Accessibility is important, learn this stuff!


I don't see any reason to use semantic HTML. It's cargo cult.

At some point web designers (at the time in the early 00s web development was not as involved as it has been post Ajax) learned the term "semantic" and went wild with it (made you sound smart, and you could charge a little more than the unsophisticated shop down the road), without understanding the science behind data.

Instead we should have separation of concerns, with structured raw information (you know, like what we have in our relational and document databases), served with different representations: for accessibility, for data access, for web, for syndication, for print, for reading devices, and so on.

Of course the web being the web, it needed to make this too in the most inefficient and multi-layered way (and leave it up to the web designers/devs to have "proper" semantic tags).


CTRL+F "microdata": no result. Yeah, I thought so too...

The Semantic web has failed. ARIA tags are important for accessibility though, but semantic HTML has been irrelevant for a decade, I haven't seen a single successful product based on that or a single use case that yielded something interesting SEO wise.


Using semantic HTML in web authoring is not the same as the concept of the “semantic web” [0].

The article makes it clear what it is about and it’s not about the goals of the semantic web.

[0] https://en.m.wikipedia.org/wiki/Semantic_Web


The article, and the comments on the article here, and even the ARIA documentation itself as quoted in comments explain why semantic HTML is important to accessibility. "Yeah, but it doesn't yield something interesting SEO-wise" leads us to make web sites that are trash fires, and not just from an accessibility standpoint.


It's very unfortunate that the term "semantic" is overloaded -- the "semantic" web you're referring to which I agree was basically a total failure, but also "semantic" HTML which is used for accessibility, essentially as a shortcut for ARIA tags.

I think it's led to a lot of confusion, especially since it's often not clear which purpose is being referred to in a context.


"semantic" is not overloaded in this case. It has the same meaning in both cases


Semantic HTML != ARIA tags.

Also, isn't the semantic web the reason that we get things like link previews with titles and content and images and... Considering [0] is a SEO website, and it was in the top three for my search for og:title I'd say the problem is rather that SEO infected the whole thing, not that it's not useful to SEO.

[0]: https://seosetups.com/open-graph/


Oddly, I had to use microdata to fix reader mode on my site recently.



