
> How often do transforms need to reach all the way in like that?

In my experience, almost every time XSLT is used on real-world documents, those are documents with multiple namespaces. XSLT stylesheets themselves are also documents that have multiple namespaces. Example: Atom feeds often contain XHTML content. It is a common problem with RSS that it does not specify if the content of an element is HTML or plain text.

I have found that arguments doubting a feature is necessary, when made by people who cannot imagine use cases, are almost invariably wrong, while arguments doubting a feature is necessary, when made by people who list use cases and explain why they think those are better solved otherwise or even left unsolved, are often right. Your post seems like an example of the former; would you say that complex real-world content with namespaces could sway you in favor of them?




I would be convinced if I saw real-world examples where having namespaces gave an advantage over not having namespaces. I can see the value in specifying whether the content of a given node is XHTML or text. I can at least theoretically see value in allowing nesting XHTML without a layer of escaping. I can't see any non-theoretical way in which namespaces are necessary to accomplish these things.


Example: The XSLT stylesheet for this Atom feed generates a web page for each entry: http://news.dieweltistgarnichtso.net/notes/index.xml In this setup, the Atom XML for each entry is generated from XHTML with XSLT, which makes it possible to automatically include an Atom enclosure element for every XHTML media element. To publish a podcast episode, it is enough to add a post with an <audio> or <video> element, as an XSLT stylesheet can “reach into” the XHTML content.

Namespaces are also widely used in SVG, which uses the XLink specification for hyperlinks and can embed XHTML and MathML content. Since SVG can be embedded in (X)HTML, this means you can have an Atom feed containing XHTML containing MathML and SVG that contains XHTML, and have all of it displayed correctly.
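
To illustrate what "reaching into" the XHTML content means, here is a minimal Python sketch (not the XSLT stylesheet from the feed above). It assumes an Atom entry whose content is an XHTML `div`; the sample entry, the example.org URL and the function name `add_enclosures` are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical entry: Atom metadata wrapping XHTML content.
ENTRY = f"""<entry xmlns="{ATOM}">
  <title>Episode 1</title>
  <content type="xhtml">
    <div xmlns="{XHTML}">
      <p>Show notes.</p>
      <audio src="https://example.org/episode1.ogg"/>
    </div>
  </content>
</entry>"""


def add_enclosures(entry):
    """Add an Atom enclosure link for every XHTML audio/video element."""
    # Collect the sources first so the tree is not modified while iterating.
    sources = []
    for tag in ("audio", "video"):
        for media in entry.iter(f"{{{XHTML}}}{tag}"):
            src = media.get("src")
            if src:
                sources.append(src)
    for src in sources:
        ET.SubElement(entry, f"{{{ATOM}}}link", rel="enclosure", href=src)
    return entry


entry = add_enclosures(ET.fromstring(ENTRY))
links = entry.findall(f"{{{ATOM}}}link")
```

The lookup is by namespace plus local name, so an element called `audio` in some other vocabulary would not be picked up.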


> Example: The XSLT stylesheet for this Atom feed generates a web page for each entry: http://news.dieweltistgarnichtso.net/notes/index.xml

> In this setup, the Atom XML for each entry is generated from XHTML with XSLT, which makes it possible to automatically include an Atom enclosure element for every XHTML media element. To publish a podcast episode, it is enough to add a post with an <audio> or <video> element.

Sure. Why do you need namespaces to do that? Why couldn't you do it in XML-without-namespaces (or even JSON and some theoretical JSON-transformation-language)?

> Namespaces are also widely used in SVG, which uses the XLink specification for hyperlinks and can embed XHTML and MathML content.

Again, why are namespaces necessary though? Why not just have a tag whose content is specified to be XHTML/MathML? Wouldn't you want that anyway for the sake of human readability?


XML without namespaces does not exist. If it existed, how would you differentiate between title and link elements in Atom and title and link elements in XHTML? They have the same element names, but do not have the same meaning and therefore must be processed differently. Namespaces ensure that any XML processor can know the language of each part of the input.

Namespaces actually are the general mechanism with which you can specify that content is in another language: If you look at the feed source code, you can see that XHTML content is started with <div xmlns="http://www.w3.org/1999/xhtml"> and ends where that div element is closed.

Having an element with the semantics that “this content is in another language” is done out of necessity in HTML, as it has no namespacing: <style> elements contain CSS, <script> elements contain JavaScript, <svg> elements contain SVG … having an element in each language to embed each other language would become complicated very fast.
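
A small Python sketch of that distinction, assuming a contrived feed in which an XHTML `title` element appears inside the content `div` (the sample document is invented and is not taken from the feed linked above):

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
XHTML = "http://www.w3.org/1999/xhtml"

# Contrived sample: the same local name "title" in two languages.
FEED = f"""<feed xmlns="{ATOM}">
  <title>Feed title</title>
  <entry>
    <title>Entry title</title>
    <content type="xhtml">
      <div xmlns="{XHTML}">
        <title>XHTML title</title>
      </div>
    </content>
  </entry>
</feed>"""

root = ET.fromstring(FEED)

# ElementTree exposes each tag as "{namespace-uri}localname",
# so the two kinds of title never compare equal.
atom_titles = [t.text for t in root.iter(f"{{{ATOM}}}title")]
xhtml_titles = [t.text for t in root.iter(f"{{{XHTML}}}title")]
```

Here `atom_titles` holds the feed and entry titles and `xhtml_titles` holds only the embedded one, without the code having to know where in the tree the XHTML starts.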


> XML without namespaces does not exist. If it existed, how would you differentiate between title and link elements in Atom and title and link elements in XHTML?

By where it is in the structure. The document is a tree where each element has well-defined context; there should never be confusion about whether a particular <title> is part of the feed or part of the content in the feed, because if it's in content it will be inside the content tag.

(Don't you need to do that anyway? I mean what if the XHTML had another Atom feed embedded in it? Or the content of one of the entries in the feed was another Atom feed? That's legitimate, but you wouldn't want to show titles from the "inner" feed as titles in the feed).

> Having an element with the semantics that “this content is in another language” is done out of necessity in HTML, as it has no namespacing: <style> elements contain CSS, <script> elements contain JavaScript, <svg> elements contain SVG … having an element in each language to embed each other language would become complicated very fast.

Only if you need the ability to embed an arbitrary other language. And if you do need that you can't possibly be validating or transforming based on what's embedded, so what value is the namespacing of it giving you?


You may have incomplete documents (e.g. documents with conditional sections, very much like XSLT):

    <code:if test="...">
      <!-- whatever -->
    <code:else/>
      <!-- whatever -->
    </code:if>
Here you'll first process your code part and copy the contents as they are, and then process the contents; but in the source document the two languages are interspersed.

Or you may want to extend your text format with, say, literate programming and add code fragments and files. In my homegrown system it's like that:

    <literate:fragment id="..." language="...">
      <text:caption>...</text:caption>
      <literate:code>...</literate:code>
    </literate:fragment>
My text system already has a notion of captions so there's no need to add my own "literate:caption" here. Yet the other two "literate" elements are new and unique. Also, using a namespace here ensures that there is no clash if the base system adds its own "fragment" or "code" blocks.


OK, I guess that takes things a level up. I don't like that kind of interspersed style and I don't think incomplete documents should be the same kind of thing as complete ones (e.g. one can't meaningfully validate your first example, because what if the "whatever" is an element that has to be present exactly once). But I can see that if you want to write things this way then namespaces help.


“I don't like” seems to be an æsthetic argument, not a technical one.


> The document is a tree where each element has well-defined context; there should never be confusion about whether a particular <title> is part of the feed or part of the content in the feed, because if it's in content it will be inside the content tag.

In this specific case, maybe – but generally, it is not true that you can infer the namespace of an element from context. Also, elements can have multiple attributes with different namespaces (and often do).

> I mean what if the XHTML had another Atom feed embedded in it? Or the content of one of the entries in the feed was another Atom feed? That's legitimate, but you wouldn't want to show titles from the "inner" feed as titles in the feed

That actually appears to be a bug in my stylesheet. Thank you for bringing it to my attention!

Programs often use namespaces to provide metadata. Here is an SVG I created with Inkscape that uses six different namespaces for metadata: http://daten.dieweltistgarnichtso.net/pics/icons/minetest/mi... Thanks to namespacing, web browsers can display the picture while ignoring Inkscape-specific data.

> Only if you need the ability to embed an arbitrary other language. And if you do need that you can't possibly be validating or transforming based on what's embedded, so what value is the namespacing of it giving you?

It is very useful to embed any arbitrary language, as XML processors can preserve the content they do not understand without processing it. My XSLT stylesheet would have no issue with SVG embedded in XHTML, just as your web browser most likely ignores everything in the SVG linked above that it cannot understand.
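
As a rough Python sketch of how a consumer can skip what it does not understand: the snippet below keeps only elements and attributes that belong to the SVG vocabulary and ignores editor metadata. The sample document is invented (it is not the linked file), the function name `understood` is hypothetical, and the Inkscape and Sodipodi namespace URIs are written from memory, so treat them as assumptions.

```python
import xml.etree.ElementTree as ET

SVG = "http://www.w3.org/2000/svg"
# Assumed editor namespaces, used here only as "foreign" vocabularies.
INKSCAPE = "http://www.inkscape.org/namespaces/inkscape"
SODIPODI = "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"

DOC = f"""<svg xmlns="{SVG}" xmlns:inkscape="{INKSCAPE}" xmlns:sodipodi="{SODIPODI}">
  <sodipodi:namedview inkscape:zoom="2"/>
  <rect width="10" height="10" inkscape:label="box"/>
</svg>"""


def understood(root, namespace):
    """List (local name, un-namespaced attribute names) for elements in one namespace."""
    prefix = f"{{{namespace}}}"
    result = []
    for el in root.iter():
        if isinstance(el.tag, str) and el.tag.startswith(prefix):
            local = el.tag[len(prefix):]
            # Attributes from other namespaces have keys starting with "{".
            attrs = sorted(k for k in el.attrib if not k.startswith("{"))
            result.append((local, attrs))
    return result


svg_view = understood(ET.fromstring(DOC), SVG)
```

The `sodipodi:namedview` element and the `inkscape:label` attribute are left out of `svg_view`, yet they are still present in the parsed tree and would survive if the document were written back out.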


> It is very useful to embed any arbitrary language, as XML processors can preserve the content they do not understand without processing it. My XSLT stylesheet would have no issue with SVG embedded in XHTML, just as your web browser most likely ignores everything in the SVG linked above that it cannot understand.

Sure, but you can ignore extra attributes in JSON or hypothetical XML-without-namespacing too. I feel like there's an excluded middle here: either the content of a given tag has to be, say, SVG, in which case the validation schema for the outer document could just say (in a structured way) "the content of this tag must be a valid SVG document according to the SVG schema", or the content is some opaque arbitrary XML document, in which case there's no meaningful validation to be done.

Even when working with something like XHTML-with-embedded-SVG, I found myself wishing there was a way to strip the namespaces, run my xpath queries / xslt transformations on the stripped version, and then put the namespaces back; I think I'd've got my actual business tasks done a lot quicker that way.
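
For what it's worth, the "strip the namespaces" half of that wish can be sketched in a few lines of Python with the standard library. This is only an illustration under assumptions: the helper name `strip_namespaces` and the sample document are made up, the tree is modified in place, and nothing here puts the namespaces back afterwards.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
SVG = "http://www.w3.org/2000/svg"

DOC = f"""<html xmlns="{XHTML}">
  <body>
    <svg xmlns="{SVG}">
      <circle r="5"/>
    </svg>
  </body>
</html>"""


def strip_namespaces(root):
    """Reduce every tag and attribute name to its local name, in place."""
    for el in root.iter():
        if isinstance(el.tag, str) and el.tag.startswith("{"):
            el.tag = el.tag.split("}", 1)[1]
        for key in list(el.attrib):
            if key.startswith("{"):
                # Note: this can overwrite an un-namespaced attribute
                # that has the same local name.
                el.attrib[key.split("}", 1)[1]] = el.attrib.pop(key)
    return root


root = strip_namespaces(ET.fromstring(DOC))
circles = root.findall(".//circle")  # plain path, no namespace map needed
```

The queries get shorter, but after stripping, an Atom `title` and an XHTML `title` would be indistinguishable, which is the trade-off being argued about in this thread.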


Ignoring other attributes in data formats without namespaces is not as easy. What if one language is embedded in another and each one has a title element?

I do not know why you “feel” that way about the middle you want to exclude. It has proven to be very useful in practice for me. Also, without it, XML would not have the “extensible” property.

The way you describe working with “XHTML-with-embedded-SVG” reads to me like there is something about namespaces or your toolchain that you have difficulties with. I found that with XML-based systems, especially XSLT, it is easy to make a task needlessly complicated if one does not understand the details.



