"1024 large-image.jpg 512 small-image.jpg image.jpg fallback My Image"
That is what your code would look like to browsers that didn't know about the new elements. HTML is defined such that browsers can ignore unknown elements for compatibility and still display the text. Using contents for the metadata means that browsers need to know about the elements to at least hide the text.
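To make that concrete, here is a rough Python sketch of what an unaware browser effectively does: drop every tag it doesn't recognize and keep the text. The element names (`size`, `file`, `role`, `alt`) are hypothetical, chosen only so the output resembles the string quoted above; they are not the markup that was actually proposed.

```python
from html.parser import HTMLParser

# Hypothetical markup that stores image metadata as element *contents*.
MARKUP = """
<picture>
  <source>
    <size>1024</size>
    <file>large-image.jpg</file>
  </source>
  <source>
    <size>512</size>
    <file>small-image.jpg</file>
  </source>
  <file>image.jpg</file>
  <role>fallback</role>
  <alt>My Image</alt>
</picture>
"""


class TextOnly(HTMLParser):
    """Ignores every tag and collects only the text between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(markup):
    """Return the text left over once all tags are ignored."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    # Collapse whitespace runs, roughly as inline rendering would.
    return " ".join("".join(parser.chunks).split())


print(visible_text(MARKUP))
```

Had the same values been carried in attributes, nothing would be left over to display.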
Not practical, since you'd have to define attributes for every conceivable size in the spec, and that's just asking for trouble, e.g. w2048, h1024, w320, w240, h320, wPleaseShootMe :)
But now it's a PITA to properly handle and escape for any toolset that doesn't have good XML support. Imagine people starting to put <![CDATA[ ]]> blocks into this.
JsonML is pretty efficient when auto-generated from HTML source. I use it as an intermediate form for client-side templates (http://duelengine.org) but I don't write it by hand. Its regularity makes it a perfect candidate for output from a parser/compiler.
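As a rough illustration of why it's easy to generate, here is a minimal Python sketch that turns an HTML snippet into JsonML-style nested arrays (`[tag, {attributes}, ...children]`). This is not how DUEL does it. The class name, the void-tag list, and the choice to drop whitespace-only text are all assumptions made for the example.

```python
import json
from html.parser import HTMLParser

# Assumed subset of void elements (no closing tag, no children).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "source"}


class JsonMLBuilder(HTMLParser):
    """Builds JsonML-style lists: [tag, {attrs}?, child, child, ...]."""

    def __init__(self):
        super().__init__()
        self._top = []              # holder for top-level nodes
        self._stack = [self._top]   # innermost open node is last

    def handle_starttag(self, tag, attrs):
        node = [tag]
        if attrs:
            # Valueless attributes (e.g. "disabled") become empty strings.
            node.append({k: "" if v is None else v for k, v in attrs})
        self._stack[-1].append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Only close the innermost open element if it matches.
        if len(self._stack) > 1 and self._stack[-1][0] == tag:
            self._stack.pop()

    def handle_data(self, data):
        if data.strip():            # assumption: skip whitespace-only text
            self._stack[-1].append(data)


def html_to_jsonml(markup):
    """Return the first top-level element of `markup` as JsonML."""
    builder = JsonMLBuilder()
    builder.feed(markup)
    builder.close()
    return builder._top[0]


jsonml = html_to_jsonml('<p class="note">Hello <b>world</b><br></p>')
print(json.dumps(jsonml))
# ["p", {"class": "note"}, "Hello ", ["b", "world"], ["br"]]
```

Every node has the same shape, so the generator is a few lines of stack bookkeeping. Writing those nested brackets by hand is another matter.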
JSON is great as an interchange format, but there are many reasons editing it by hand is painful, lack of comments and lack of newlines in strings not being the least of them.
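Both complaints are easy to check with a strict parser. The sketch below uses Python's standard `json` module; the `parses` helper and the sample documents are made up for the demonstration.

```python
import json


def parses(text):
    """Return True if `text` is accepted by the strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A comment, which someone editing by hand would naturally want to add.
with_comment = '{\n  // the large variant\n  "size": 1024\n}'

# A string value containing a literal line break.
raw_newline = '{"alt": "My\nImage"}'

# The same value with the line break escaped as \n inside the JSON text.
escaped_newline = '{"alt": "My\\nImage"}'

print(parses(with_comment))     # False
print(parses(raw_newline))      # False
print(parses(escaped_newline))  # True
```

So anything spanning more than one line has to be escaped onto a single line, and there is nowhere to leave a note explaining why a value is what it is.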