I pretty much agree with the whole article. Back in 2001, when Web Standards were on the rise, people used to validate their HTML code. Having 0 errors was shown off as a badge of prestige. Semantic HTML was a hot topic and most of the developers I worked with were a sort of HTML taliban. But the popularity of frameworks helped developers forget the basics and concentrate more on learning how to get the most out of these projects. Add to that the fact that I don't see any real impact on SEO from having the best HTML, and I think it's pretty clear that semantics and validation aren't relevant anymore.
When people ask me why their SEO isn't working, I tell them: look at the first results on Google and check their code. Most of the time you will see pages with worse HTML validation and semantics ranking above better (not perfect) ones, not to mention that the first page is mostly paid ads. Google is not what it was back in the 2000s. Things have changed.
I currently serve real XHTML5 code on my website with the correct media type of application/xhtml+xml. https://www.nayuki.io/
This works properly in all modern browsers (Chrome, Firefox, Safari, Edge; PC, Mac, Android, iOS) and even Internet Explorer 11. Though in the past, I had to make concessions for older versions of IE, serving the same code as text/html instead.
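A fallback like this can be driven by the request's Accept header. Here is a minimal sketch, not the site's actual code: `pick_media_type` is a hypothetical helper, and it assumes that clients which cannot handle XHTML simply do not list application/xhtml+xml explicitly.

```python
def pick_media_type(accept_header: str) -> str:
    """Return the XHTML media type only if the client lists it explicitly."""
    for media_range in accept_header.split(","):
        parts = [p.strip() for p in media_range.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        # An explicit q=0 means "not acceptable"; a missing q defaults to 1.
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            return "application/xhtml+xml"
    # Wildcards such as */* are deliberately not treated as XHTML support.
    return "text/html"
```

The same markup goes out either way; only the Content-Type header changes, and with it which parser the browser uses.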
I arranged things this way because I hand-write much of the HTML code on the site, and want to catch syntax errors as early as possible without the browser silently (and possibly incorrectly) fixing my mistakes. In any case, this is living proof that XHTML5 works.
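The same early check can also be run before publishing, without a browser. This is a rough sketch using Python's standard XML parser; `first_syntax_error` and the sample page are made up for illustration, and named entities such as `&nbsp;` would need extra handling because this parser does not load the XHTML DTD.

```python
import xml.etree.ElementTree as ET

def first_syntax_error(xhtml_source: str):
    """Return None if the source is well-formed XML, else (line, column)."""
    try:
        ET.fromstring(xhtml_source)
    except ET.ParseError as err:
        # ParseError carries the location where the parser gave up.
        return err.position
    return None

PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Demo</title></head>
<body>
<p>Missing slash here<br></p>
</body>
</html>"""

if __name__ == "__main__":
    print(first_syntax_error(PAGE))
    print(first_syntax_error(PAGE.replace("<br>", "<br />")))
```

This only checks well-formedness, not validity against the HTML content model, but it catches the class of mistakes a text/html parser would silently paper over.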
The lax nature of the HTML5 doctype did no one any favors. For most elements (`<br />` included) I still use a variant of XHTML Strict. Granted, it wasn't really necessary to have all the URLs and dates there, but to me, the only "correct" way to do this is with `<!DOCTYPE html>`.
The fact that validators don't choke on this is an accident of history.
Opening a can of worms: did you serve them with any of the available XML MIME types? If not, they worked only because of a bug in the browsers (SHORTTAG means a different thing in HTML than in XML), and all XHTML code served with an HTML MIME type would be littered with ">" if browsers treated it correctly ;)
I once toyed with the idea of serving XHTML pages as application/xhtml+xml, but the fact that the browser (back then, not sure about now) would just display an XHTML error and not render anything was a deal breaker for me.
Sorry, but you’re mistaken. The parent never brought up IE - he was referring to the fact that when serving XHTML with the correct MIME type (application/xhtml+xml), any error causes the page not to render at all. This behaviour is intended per spec and was the same across all browsers, not due to lack of support. It’s called “draconian error handling”, a consequence of being XML, and was a major factor in the death of XHTML.
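To make the difference concrete, here is a small sketch that feeds the same slightly broken markup to a strict XML parser and to a forgiving one. Python's `html.parser` is only a stand-in for a browser's text/html handling (it is not a spec-compliant HTML5 parser), and the function names are invented for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class TextCollector(HTMLParser):
    """Collects text content and ignores tag structure entirely."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def render_lenient(source: str) -> str:
    collector = TextCollector()
    collector.feed(source)
    collector.close()
    return "".join(collector.chunks)

def render_draconian(source: str):
    # One well-formedness error and the reader gets nothing at all.
    try:
        root = ET.fromstring(source)
    except ET.ParseError:
        return None
    return "".join(root.itertext())

BROKEN = "<p>Hello <b>world</p>"   # the <b> is never closed
FIXED = "<p>Hello <b>world</b></p>"
```

With `BROKEN`, the lenient path still yields the text while the draconian path yields nothing; with `FIXED`, both agree.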
The death of XHTML was brought on by the then crop of kids who didn't want to understand how computers work and wanted it all done for them with someone else's code, because computer science is too hard and they'd have to think, and thinking is too hard.
I often hear this "draconian error handling" complaint about XML/XHTML from the kind of people who also complain that a language compiler is just as draconian, whereas real programmers complain when it isn't and doesn't catch their every little mistake.
It's this lack of education and drive for the pursuit of knowledge and understanding that caused XHTML to fall out of favor and no other reason.
That's all well until you have external content (e.g. a blog with comments or blogroll headlines). Many an enthusiastic Web author in the 2000s migrated off XHTML again. Not because getting it right is too hard (each fix for each problem is quite straightforward and often trivial), but because when the choice is between a completely broken page/site until you have time to log on and fix it, and a minuscule broken part of a page that a visitor can easily ignore, the idealism was just too bothersome. Unfortunately, that's human nature. Also see: rms's principles as exemplified by his life-style and how many follow his example perfectly.
In XHTML. No browser treats a document (no matter the doctype) as XHTML unless the proper MIME type is used. Without it, your XHTML markup is treated as an SGML application (which HTML is), and thus <br /> means a completely different thing.
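For the plain-text case at least, the fix really is small. A sketch, assuming comments are stored as plain text and that `embed_comment` is a hypothetical template helper: escape the untrusted text before it ever touches the page. Comments that are allowed to contain markup would need to be parsed and re-serialized instead, and control characters that XML forbids would still have to be stripped separately.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

def embed_comment(comment_text: str) -> str:
    """Wrap untrusted plain text so it cannot break well-formedness."""
    # escape() replaces &, < and > with their entity references.
    return '<div class="comment"><p>' + escape(comment_text) + "</p></div>"

if __name__ == "__main__":
    nasty = "I <3 unclosed <b>tags & stray ampersands"
    fragment = embed_comment(nasty)
    # Parsing succeeds, and the original text survives the round trip.
    print(ET.fromstring(fragment).find("p").text == nasty)
```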
See http://jkorpela.fi/html/empty.html for the details.
> 6. Then, if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS character (/). This character has no effect on void elements, but on foreign elements it marks the start tag as self-closing.
The problem with making the SOLIDUS optional for so-called void elements is that the set of void elements isn't finite across time. A new one could be added in the future, which means any document which relies on implicit syntactic behavior requires an updated parser simply to get the most basic AST.
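A toy illustration of that dependency, using a deliberately naive regex-based tag scanner (nothing like a real HTML parser; `nesting_depths` and the "old" lexicon are invented for the example). It uses `wbr`, which HTML5 standardized as a void element, as the newcomer.

```python
import re

# Matches start and end tags; group 3 captures a trailing "/" if present.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>")

def nesting_depths(source, void_elements):
    """Return (tag, depth) for each start tag, given a void-element lexicon."""
    depth = 0
    result = []
    for closing, name, self_closing in TAG.findall(source):
        name = name.lower()
        if closing:
            depth = max(depth - 1, 0)
            continue
        result.append((name, depth))
        # Without a "/" the scanner must already know the element is void.
        if not self_closing and name not in void_elements:
            depth += 1
    return result

OLD_LEXICON = {"br", "hr", "img"}          # hypothetical outdated parser
NEW_LEXICON = OLD_LEXICON | {"wbr"}

IMPLICIT = "<p>a<wbr>b<span>c</span></p>"
EXPLICIT = "<p>a<wbr/>b<span>c</span></p>"
```

With `IMPLICIT`, the outdated lexicon puts `<span>` one level too deep, inside `<wbr>`. With `EXPLICIT`, both lexicons produce the same tree, because the syntax alone says the element is empty.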
XML and XHTML formalized a distinction between syntax and semantics, permitting forward compatibility for code, like low-level parsers, that only processes the syntax.
The WHATWG made the argument that out in the real world syntactically correct documents are almost the exception, not the norm. Because that's true, the vision of being able to ubiquitously slice-and-dice documents with a shared syntax but distinct internal semantics was not attainable as a general matter. Any software consuming HTML out in the open universe would always need to be aware of contemporary HTML semantics even for low-level parsing. The insistence on separating syntax from semantics for HTML had a high cost but very little realized benefit.
However, the benefits are attainable within a closed universe, such as a CMS. And this is why HTML5 doesn't require, but nonetheless permits, XML- and XHTML-compliant syntax. It's not even treated as an error or exception, not in the way that other malformed but recoverable constructs are. A self-closing tag is syntactically valid, so there's absolutely no reason not to use it other than convenience. Excluding it out of convenience is perfectly acceptable, but in some situations--e.g. when using the more general and diverse ecosystems of XML and XSLT processors--it can be extremely inconvenient to exclude the SOLIDUS.
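As a sketch of what the closed-universe case buys you: a fragment written with XML-compatible syntax can be queried by any generic XML tool that knows nothing about HTML. The fragment and `image_sources` below are made up for illustration, and a real pipeline would also have to deal with namespaces and named entities.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """<article>
  <h1>Title</h1>
  <p>First line<br />second line</p>
  <img src="a.png" alt="A" />
  <img src="b.png" alt="B" />
</article>"""

def image_sources(markup: str):
    """List every img src using a generic XML parser, no HTML knowledge."""
    root = ET.fromstring(markup)
    return [img.get("src") for img in root.iter("img")]

if __name__ == "__main__":
    print(image_sources(FRAGMENT))
    try:
        # The same content without the SOLIDUS is no longer well-formed XML.
        image_sources(FRAGMENT.replace(" />", ">"))
    except ET.ParseError as err:
        print("generic XML tooling rejects it:", err)
```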
although nowadays there's HTML5, which isn't an SGML application. It allows (but ignores) the closing slash for void elements (like <br>), and specifies error recovery for non-void elements that treats it as if the slash didn't exist.
Because HTML5 is not actually a subset of XML. Every element in XML must be closed, but in HTML5 certain tags (like <hr>, <img>, or <br>) are defined as "void" which means they have no nested content and therefore it's technically improper to close them (although I'd be shocked if any major browsers actually care about that). In other words, an <img> tag does not "open" the definition of a new section of the document the way <div> or <span> do, and it makes no sense to "close" something that was never "open" to begin with.
That's an arbitrary style choice in the spec. Saying "it's void so never use / " is no more natural than saying "it's void so always use /".
Arguably, it's bad style to use the same syntax for opening a tag as for a void tag, because it forces semantics into the syntax for trivial benefit. Without the "/", your HTML syntax parser now has to include a lexicon of all the void tags, and be updated with spec revisions.