I pretty much agree with all the article. Back in 2001, when Web Standards were ...

tobyhinloopen · on June 26, 2019

All my webpages had the valid xhtml badge!

https://commons.wikimedia.org/wiki/File:Valid_XHTML_1.0.svg

I still have a hard time un-learning `<br />`

nayuki · on June 26, 2019

I currently serve real XHTML5 code on my website with the correct media type of application/xhtml+xml. https://www.nayuki.io/

This works properly in all modern browsers (Chrome, Firefox, Safari, Edge; PC, Mac, Android, iOS) and even Internet Explorer 11. Though in the past, I had to make concessions for older versions of IE, serving the same code as text/html instead.

I arranged things this way because I hand-write much of the HTML code on the site, and want to catch syntax errors as early as possible without the browser silently (and possibly incorrectly) fixing my mistakes. In any case, this is living proof that XHTML5 works.

the8472 · on June 26, 2019

At least in firefox this isn't handled as XML DOM document (they might still use the XML parser). XHTML pages used to be. On your page:

  document instanceof XMLDocument // false

On https://upload.wikimedia.org/wikipedia/commons/e/e9/SVG-Grun...

  document instanceof XMLDocument // true

chrisfinazzo · on June 26, 2019

Unlearning...why?

The lax nature of the HTML5 doctype did no one any favors. For most elements (`<br />` included) I still use a variant of XHTML Strict. Granted, it wasn't really necessary to have all the URLs and dates there, but to me, the only "correct" way to do this is with `<!DOCTYPE html>`.

The fact that validators don't choke on this is an accident of history.

rimliu · on June 26, 2019

Opening can of worms: did you serve them with any of the available XML MIME types? If not they worked simply because of the bug in the browsers (SHORTTAG means a different thing in HTML compared to XML) and all XHTML code served with HTML MIME type would be littered with ">" if browser treated it right ;)

innocenat · on June 26, 2019

I once toyed with the idea of serving XHTML pages as application/xhtml+xml, but the fact that the browser (back then, not sure about now) would just display an XHTML error and not render anything was a deal breaker for me.

sureaboutthis · on June 26, 2019

By "the browser" you mean "Internet Explorer". The only browser that couldn't handle XHTML.

ricardobeat · on June 26, 2019

With XHTML any syntax error completely aborts rendering, there is no fallback - the browser will display a native error page instead.

sureaboutthis · on June 26, 2019

Has nothing to do with what I said. IE couldn't handle XHTML until IE9

https://blogs.msdn.microsoft.com/ie/2010/11/01/xhtml-in-ie9/

ricardobeat · on June 27, 2019

Sorry, but you’re mistaken. The parent never brought up IE - he was referring to the fact that when serving XHTML with the correct mime-type (application/xhtml+xml), any error causes the page not to render at all. This behaviour is intended per spec and was the same across all browsers, not due to lack of support. It’s called “draconian error handling”, consequence of being XML, and was a major factor in the death of XHTML.
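To make the difference between the two error models concrete, here is a minimal sketch. It is a toy, not a real XML or HTML parser: `checkBalance` and its `draconian` flag are hypothetical names, it only understands bare `<name>` and `</name>` tags, and its lenient recovery is a simplification rather than the HTML5 algorithm.

```javascript
// Toy balance checker (hypothetical helper, not a real XML or HTML parser).
// Handles only bare lowercase <name> and </name> tags; text is ignored.
function checkBalance(markup, draconian) {
  const open = [];       // stack of currently open element names
  const recovered = [];  // problems noted in lenient mode

  const report = (problem) => {
    if (draconian) throw new Error(problem); // XML-style: stop at the first error
    recovered.push(problem);                 // lenient: note it and carry on
  };

  for (const [, closing, name] of markup.matchAll(/<(\/?)([a-z][a-z0-9]*)>/g)) {
    if (!closing) {
      open.push(name);
      continue;
    }
    if (open[open.length - 1] === name) {
      open.pop();
      continue;
    }
    report(`mismatched </${name}>`);
    // Simplified recovery: close the matching element and everything nested in it.
    const idx = open.lastIndexOf(name);
    if (idx !== -1) open.length = idx;
  }

  if (open.length > 0) report(`unclosed <${open[open.length - 1]}>`);
  return recovered;
}

// Lenient mode returns the list of problems it worked around.
console.log(checkBalance("<p><b>bold</p>", false));

// Draconian mode throws on the same input, so nothing would be rendered.
try {
  checkBalance("<p><b>bold</p>", true);
} catch (err) {
  console.log("aborted:", err.message);
}
```

The same input yields a usable result in one mode and nothing at all in the other, which is the trade-off being argued about in this subthread.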

sureaboutthis · on June 27, 2019

The death of XHTML was brought on by the then crop of kids who don't want to understand how computers work and want it all done for them with someone else's code cause computer science is too hard and they'd have to think and thinking is too hard.

I often hear this "draconian error handling" complaint about XML/XHTML from the same people who then complain that a language compiler is just as draconian, while real programmers complain when it isn't and doesn't catch their every little mistake.

It's this lack of education and drive for the pursuit of knowledge and understanding that caused XHTML to fall out of favor and no other reason.

bmn__ · on June 28, 2019

That's all well until you have external content (e.g. blog with comments or blog roll headlines). Many an enthusiastic Web author in the 2000s migrated off XHTML again. Not because getting it right is too hard (each fix for each problem is quite straightforward and often trivial), but because when you have the choice between having a completely broken page/site until you have time to log on and fix it and having a minusculely broken part of a page that's easily ignored by a visitor, then the idealism was just too bothersome. Unfortunately, that's human nature. Also see: rms's principles as exemplified by his life-style and how many follow his example perfectly.

frenchyatwork · on June 26, 2019

br is an empty element (https://www.w3.org/TR/xhtml1/dtds.html#a_dtd_XHTML-1.0-Stric...), so should work fine

rimliu · on June 26, 2019

In XHTML. No browsers treat documents (no matter the doctype) as XHTML unless the proper MIME type is used. Without it your XHTML markup is treated as an SGML application (which HTML is) and thus means a completely different thing. See http://jkorpela.fi/html/empty.html for the details.

wahern · on June 26, 2019

<br /> is syntactically correct HTML5. So is <br>, but that's beside the point.

From https://html.spec.whatwg.org/#start-tags

> 6. Then, if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS character (/). This character has no effect on void elements, but on foreign elements it marks the start tag as self-closing.

From https://html.spec.whatwg.org/#void-elements

> Void elements: area, base, br, col, embed, hr, img, input, link, meta, param, source, track, wbr

The problem with making the SOLIDUS optional for so-called void elements is that the set of void elements isn't finite across time. A new one could be added in the future, which means any document which relies on implicit syntactic behavior requires an updated parser simply to get the most basic AST.

XML and XHTML formalized a distinction between syntax and semantics, permitting forward compatibility for code, like low-level parsers, only processing the syntax.

The WHATWG made the argument that out in the real world syntactically correct documents are almost the exception, not the norm. Because that's true the vision of being able to ubiquitously slice-and-dice documents with a shared syntax but distinct internal semantics was not attainable as a general matter. Any software consuming HTML out in the open universe would always need to be aware of contemporary HTML semantics even for low-level parsing. The insistence on separating syntax from semantics for HTML had a high cost but very little realized benefit.

However, the benefits are attainable within a closed universe, such as a CMS. And this is why HTML5 doesn't require, but nonetheless permits, XML- and XHTML-compliant syntax. It's not even treated as an error or exception, not in the way that other malformed but recoverable constructs are. A self-closing tag is syntactically valid, so there's absolutely no reason not to use it other than convenience. Excluding it out of convenience is perfectly acceptable, but in some situations--e.g. when using the more general and diverse ecosystems of XML and XSLT processors--it can be extremely inconvenient to exclude the SOLIDUS.

detaro · on June 26, 2019

although nowadays there's HTML5, which isn't an SGML application. It allows (but ignores) the closing slash for void elements (like <br />), and specifies error recovery for non-void elements that treats it as if the slash didn't exist.

lunchables · on June 26, 2019

>I still have a hard time un-learning `<br />`

Oh boy, I'm supposed to stop doing that? Yikes ... when did that happen?

kminehart · on June 26, 2019

since blocking elements have been a thing, probably.

buboard · on June 26, 2019

> un-learning `<br />`

wait what ?

joemi · on June 26, 2019

`<br />` was the correct way in XHTML to do HTML5's `<br>`

(I think `<br />` still works in HTML5 or maybe browsers just don't mind it, but it's not the recommended way.)

aitchnyu · on June 27, 2019

Having an unclosed tag somehow seems wrong. What was the reasoning behind this?

taco_emoji · on June 27, 2019

Because HTML5 is not actually a subset of XML. Every element in XML must be closed, but in HTML5 certain tags (like <hr>, <img>, or <br>) are defined as "void" which means they have no nested content and therefore it's technically improper to close them (although I'd be shocked if any major browsers actually care about that). In other words, an <img> tag does not "open" the definition of a new section of the document the way <div> or do, and it makes no sense to "close" something that was never "open" to begin with.

papln · on June 27, 2019

That's an arbitrary style choice in the spec. Saying "it's void so never use / " is no more natural than saying "it's void so always use /".

Arguably, it's bad style to use the same syntax for opening a tag as for a void tag, because it forces semantics into the syntax for trivial benefit. Without the "/", your HTML syntax parser now has to include a lexicon of all the void tags, and be updated with spec revisions.
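A rough sketch of the lexicon point, assuming a toy tag scanner rather than a real HTML parser: `nestingDepths` is a hypothetical helper that reads the slash XML-style (as self-closing on any element), and the void list is copied from the WHATWG excerpt quoted earlier in this thread.

```javascript
// Void elements as listed in the WHATWG excerpt quoted earlier in the thread.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img", "input",
  "link", "meta", "param", "source", "track", "wbr",
]);

// Hypothetical toy scanner: returns [name, depth] for every start tag.
// Handles only bare tags such as <div>, </div>, <br>, <br/>, <br /> (no attributes).
// The slash is read XML-style here: it marks a leaf on any element.
function nestingDepths(markup, voidElements) {
  const depths = [];
  let depth = 0;
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\s*(\/?)>/g;
  for (const [, closing, rawName, selfClosing] of markup.matchAll(tagPattern)) {
    const name = rawName.toLowerCase();
    if (closing) {
      depth -= 1;
      continue;
    }
    depths.push([name, depth]);
    // A leaf is known either from the syntax (the slash) or from the lexicon.
    const isLeaf = selfClosing === "/" || voidElements.has(name);
    if (!isLeaf) depth += 1;
  }
  return depths;
}

// With the lexicon, <br> is a leaf, so <p> is a sibling of <br> at depth 1.
console.log(nestingDepths("<div><br><p></p></div>", VOID_ELEMENTS));

// Without the lexicon, <br> looks like an opening tag and <p> lands at depth 2.
console.log(nestingDepths("<div><br><p></p></div>", new Set()));

// With the slash, no lexicon is needed to get <p> at depth 1.
console.log(nestingDepths("<div><br /><p></p></div>", new Set()));
```

The second call shows the failure mode: a scanner with an outdated or missing void list builds a different tree from the same bytes, while the slash form carries the information in the syntax itself.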

c0vfefe · on June 27, 2019

Perhaps they made it implicitly self-closing.

buboard · on June 26, 2019

thanks i never got the news, i hope browsers dont hate me

DarkStar851 · on June 26, 2019

Haha I've known for years and I still use <br />. Usually the linter will nag me, but otherwise I just don't care enough to fix it.

buboard · on June 26, 2019

i think the feeling of closure i get with the /> is addictive