Open Letter to XSLT Fans (snoyman.com)
107 points by saurabh on April 7, 2012 | 66 comments



A master of XSLT once explained to me that if your XSLT files are longer than 100 lines, you are doing something wrong.

For years I struggled to create 100-line XSLT files; it looked impossible. Then I realised what the problem was: I was trying to convert the source tree into the result tree in one big step. XSLT can do that, but it is not meant to do that and it is just bad software engineering. What I should have done (and I started to do) is to divide the transformation into smaller uniform transformation steps: for example, in one case I ended up with this series of XSLT files, each one taking care of a single step in the transformation pipeline:

* cleaning (remove duplicated and empty nodes);

* completion of partial data (where I go, fetch missing data and use it to make the original data more homogeneous);

* grouping and reordering (twist the data around to make it more similar to the destination format);

* data transformation (where the individual pieces of data are transformed into separate nodes in the destination format);

* layout integration (fill the holes in the layout template with the missing data).

Each file ended up being very small, readable and easily maintainable. The much shorter xpaths used in the templates are a clear indicator of how much better that project has become. Before the split I was doing things like fetching the missing data while putting the already transformed data in the layout template; that was forcing me to transform the newly retrieved data in place before being able to use it. It was a mess. It was a mess because I was not following one of the underlying and unspoken assumptions of XSLT. Isn't it the same with any other language? Will a Haskell-based tree transformation language prevent me from writing huge single-file many-purposes-in-a-single-function spaghetti code?
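The staged approach described above can be sketched outside XSLT as well. The following is only an illustration in Python using the standard library's ElementTree, not the commenter's actual stylesheets: the `<item>` element, its `id` and `kind` attributes, and every function name are hypothetical. Each function stands in for one small stylesheet, and the pipeline just feeds the output of one step into the next.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def clean(root):
    """Step 1: remove empty <item> nodes and duplicates (same id)."""
    seen = set()
    for item in list(root.findall("item")):
        key = item.get("id")
        if key in seen or not (item.text or "").strip():
            root.remove(item)
        else:
            seen.add(key)
    return root


def complete(root):
    """Step 2: fill in missing data so later steps see uniform input."""
    for item in root.findall("item"):
        item.attrib.setdefault("kind", "unknown")
    return root


def reorder(root):
    """Step 3: sort the items so they are closer to the destination layout."""
    items = sorted(root.findall("item"), key=lambda e: e.get("kind"))
    for item in items:
        root.remove(item)
    root.extend(items)
    return root


def to_html(root):
    """Step 4: turn each piece of data into a node of the destination format."""
    ul = ET.Element("ul")
    for item in root.findall("item"):
        li = ET.SubElement(ul, "li", {"class": item.get("kind")})
        li.text = item.text.strip()
    return ul


def integrate(fragment):
    """Step 5: drop the transformed fragment into a (hypothetical) layout."""
    page = ET.fromstring("<html><body><div id='content'/></body></html>")
    page.find(".//div[@id='content']").append(fragment)
    return page


PIPELINE = [clean, complete, reorder, to_html, integrate]


def run_pipeline(xml_text, steps=PIPELINE):
    """Parse the source and pass the tree through every step in order."""
    return reduce(lambda tree, step: step(tree), steps, ET.fromstring(xml_text))
```

Because every step takes a tree and returns a tree, each one can be tested alone, and the short paths inside each function play the same role as the shorter xpaths mentioned above.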

That said, I do not like articles like this that do not explain why they do not like something and do not give examples of what is wrong and how it should be instead.


XML, and surrounding technologies like XPath, XSLT, Schematron, RelaxNG, etc. by extension, are extremely useful as data storage and validation languages. If I were planning to keep structured data around for a long time on disk, and wanted to make sure that I didn't have garbage stored, I'd definitely want it in XML, with a fairly strict schema to validate that data.

If you're writing an API that works with files (not an over the wire format), and you don't want to be locked into any one language or toolset, XML is pretty hard to beat. Being able to write just one XSLT file, and have it processed by any compliant processor is pretty impressive - no other templating format works that way.

The problem for most people is that they never touch those other technologies that make XML great, because they're looking at basic data serialization, and don't care so much about validation and transform. Most web technologies fall into this category. Any tool can and will be abused - how many of us have seen a monstrous unmaintainable Excel file that performs a task that would be much better done in a scripting language?

TL;DR: XML tech has its uses, but it's very frequently abused.


Although I find XML to be much too cumbersome and bloated for any practical situation I can still understand the technical arguments in favor of using it. What I don't understand is why anyone would choose XSLT over a full featured scripting language like PHP, ASP, JSP or similar. As you obviously have much more experience than I in this field can you provide an example where XSLT would be the preferred choice?


http://www.khronos.org/registry/webgl/extensions/OES_texture...

No dynamic server-side templating language, easily transformable to other XML formats (docbook, atom (see front page of registry)). If the user/client has libxml, the processor command is already installed. The specifications are now provided in an application specific format that is language-agnostic instead of manually maintained ad hoc HTML. The extension registry index page is now automatically generated into static HTML instead of manual HTML+JavaScript for sorting the lists.

Finally, the restricted power of XSLT is an advantage for this application vs PHP, ASP, or JSP. If browsers actually implemented web technologies like XSLT, sending XML specs and generating HTML on the client side with cached XSLT would save bandwidth.


How is XML + XSLT better than JSON + JavaScript for generating the view in the browser?

Note: You can generate HTML using JavaScript but it is not needed. You can manipulate the DOM to change what the user sees.


It is easier for crawlers to ingest.


Can you prove this?


If you want to store your data in a validatable, language-agnostic way, why not apply the same logic to your transformations and templates?

...there are a million oft-argued reasons why not, but that's one line of thinking that will lead you to do it.

These discussions always remind me of my favorite old Slashdot quote: "XML is like violence. If it's not solving your problem, you're not using enough of it."


Well, it's convenient for simple things. You don't need to run a web server, or run a script to generate HTML.


You don't need to run a web server for PHP either...can you give an example of one of these simple things?


Well said -- I think of the same topics every time I hear someone heaping scorn on XML for its wordy serialization output and complex spec (specifically as compared to JSON). All the other tooling around XML is so much richer and more mature... that said, the learning curve for XML and XSLT is certainly much longer and steeper than recent fads like mustache.js.


+1 for XSLT being cross-platform. I recently started rewriting a Perl application in NodeJS. It sends out HTML emails. The HTML is constructed using XSLT. This meant I could just continue using the same template. If I'd written it in something like Template Toolkit I would have had to rewrite it in EJS or something.


From my perspective XSLT is a horrible language. Nevertheless I use it for every one of my web projects. Besides the fact that I hate the language I love the thing it does: abstraction. In the backend I only have to care about the information I want to provide and for the frontend I can do whatever I want to do with this information. My largest problems are the verbosity of the language and the incomplete implementations (also xpath). Btw. I think the Excel formulas are much closer to a functional language than XSLT should ever be ;-)


This post starts with a legitimate gripe about XSLT being classified as a "functional" language but then it devolves into a merit-less rant. It makes me wonder what the author uses XSLT for. It sounds to me like he's trying to use it as a general-purpose language. It is not. It's an incredibly single-purpose language for translating one XML document into another, and it makes no pretense of being otherwise. I've been using XSLT in large, multi-file projects for years (so I'm exactly the kind of programmer who "likely knows [he's] right"), and there is absolutely no substitute for what it does.


It's a unique tool for some important tasks.

It's also terribly opaque, and I'm afraid the community isn't very helpful. As a n00b I've worked with Apache, Bind, Python, bash, JavaScript, XSLT. None were harder to learn than XSLT. I've _used_ this tool, and it still takes me hours to figure out how sheets I've written actually work. The documentation is awful -- I have some of the community's leading texts right here on my shelf; they are almost worse than useless. Appeals to various fora were as unhelpful as any I've made anywhere.

There's little point complaining about any of that, that community doesn't owe anything to me or anybody else. But after investing weeks learning that tool, I've moved on and use other stuff in places where XSLT should be the answer. And I'm by no means surprised to find other people expressing frustrations about the thing.


> It's an incredibly single-purpose language for translating one XML document into another,

In the same way ColdFusion is "an incredible single-purpose language for translating databases into web pages": it's not, the difference is that XSLT has no alternative whereas CF has a billion.

> and there is absolutely no substitute for what it does.

That's the issue, XSLT is not "incredible", there's just very little alternative.


The parent post does not describe XSLT as "incredible," but rather as "incredibly single-purpose."


Perhaps the lack of a good substitute is why the author is so frustrated. Also, given that he works for an XML document processing company, I don't think he wants to use XSLT as a general purpose language.


I think plain old SAX makes a good substitute. It only lacks one (of the two) good things I like about XSLT, template matching. But Fowler suggests a way to get that: http://martinfowler.com/bliki/MovingAwayFromXslt.html

The other thing I like about XSLT is that it creates a hard code-boundary. But I hardly think its syntax makes that worth it.


Wow, I wasn't aware there were any XSLT "fans" out there. I wonder where the author has been hanging out to find them.


You find them, particularly in places where XML is taken too seriously. Also in places staffed with people who see unmaintainable code as their ticket to job security.

One trouble is that you know there's a cliff out there. There are some simple tasks that can be done with an XSLT that's just beautiful. But try to change what it does, and you reach this point where it becomes incomprehensible.

Back in the day we used to wonder if XSLT was Turing complete -- some guys wrote a paper and proved it, but that's the problem with XSLT. If it takes computer scientists half a decade to figure out if it's Turing complete or not, it's completely incomprehensible.


I am one. But I do a lot of work with XML documents (not just data in XML form, but documents marked up in XML). XSLT is an invaluable tool for this problem domain, but it's a relatively small niche. I can easily imagine programmers getting frustrated with it if they aren't using it for its intended purpose, or even if they haven't yet grokked how it's supposed to work.

It's probably not quite a functional language by most definitions, but it has a lot of features common to functional languages, including immutable variables, no side effects, XSLT 2.0 does have first class functions, recursion, etc. The Saxon implementation of XSLT 2.0 even has lazy sequences (I'm not sure if that's a language thing, or just that implementation).

Without knowing what the OP is having to deal with, I can't really evaluate his complaint. I've certainly seen some hideous XSLT, but it doesn't have to be hideous, and it's great for the (pretty narrow) domain it was intended for.

You can abuse any programming language—I don't think XSLT is particularly special in that regard.


The BBC used to use XSLT a lot for templating web pages, and had a small cluster of XSLT adherents working for them. I believe they've mostly switched to PHP templates now.


"I believe they've mostly switched to PHP templates now." Oh my god make me unknow this.

(XSLT does actually have a standard, syntax checking and built-in context-sensitive quoting)


Reminded me of https://xkcd.com/224/: "honestly, we hacked most of it together with perl"


> My god, it's full of car's!

lmao. Thanks, hadn't seen that one.


There are, I have met a few. Not one myself, but I do like the fact that it escapes properly, which most people working on templating solutions haven't bothered to do.


I know two.


Weird, I nominated XSLT as my "most hated language" in the "What's your most hated language" poll a few days ago and got a few down votes. Now, this rant against XSLT is on the front page... Go figure!


In a world where RPG/III, RPG/400, RPG/IV, etc. exist, how can XSLT be anyone's "most hated language?" :-)

All joking aside, I like XSLT just fine, as long as its use is limited to exactly what it's good at: transforming XML documents from one schema to another. Yes, it's verbose if you write it by hand, and yes it would be nice to have something with all the power of XSLT and with a lighter syntax, but XSLT is hardly the worst thing around.


I think it's probably the worst language I've had the misfortune of having to write anything in. Certainly it was the one I enjoyed the least (even slightly less than Fortran 90 which was somewhat of an achievement), and I think it wins my "most hated language" because of the intersection of awfulness and the nonzero chance that I'd have to program in it. While there are almost certainly worse languages - COBOL certainly appears to be one - there's effectively zero chance that I'll ever write anything in them, hence they're more of an amusing curiosity than something I can hate. Whereas I've had to spend interminable days on XSLT in the past and unfortunately I can't say for sure that I'll never have to again.


Your only argument is "I had to use it and I hated it". Hardly an argument.


We're talking about hate. Do they need an argument? Considering we're talking about hypothetical answers to an opinion poll, I think not.


>We're talking about hate. Do they need an argument?

Blind hate is hardly something desirable. I'd say, yes, even hate needs an argument.


I would at least like to claim that it wasn't blind hate; I have used XSLT before so it's partially sighted at least. And really, it's tongue in cheek. XSLT is a programming language, "hate" for such a thing only really goes as far as "I don't like working with it and hope I don't have to again".


Right. For our purposes, you no more need an argument to hate XSLT than you need one to hate chocolate. You tried it and didn't like it. Case closed. ;)


> it would be nice to have something with all the power of XSLT and with a lighter syntax

That thing you are looking for is XQuery (http://www.w3.org/TR/xquery/) and it is fantastic.

Don't be fooled by the name: it can produce content, too.


Yeah, I'm a little behind on XQuery, as I haven't been doing a lot of XML related work for the past couple of years, but I'm just now starting to play in that world again. I have done some reading up on XQuery recently and it's looking like neat stuff. I didn't realize that it was in any way a potential substitute for XSLT though. Guess I have some more studying to do...


To be honest, I wasn't even aware that something like XSLT existed up until now. When I saw articles referring to it, I assumed it was an XML path descriptor like XPath.

edit: now that I've looked at it I remember seeing it before. I thought it was some crazy custom language invented by the authors of a script and it sent me running. Good to know that that's XSLT


This guy is hilarious:

"Oh, and the fact that you can call a language functional when it lacks first class functions makes my eye twitch. I'm tempted to upload a video of my eye twitching just to prove it."

Anyway, he's the initiator of Yesod, a screaming fast, type-safe web framework in Haskell. Someone called him the 'type safe version of DHH' last month :)


XSLT is miserable, but isn't this old news? My issue is that most real-world transformations require look-up tables, calls to other systems for derived data, etc. The XSLT extension format is no fun at all. Sadly, I have yet to see a good, general-purpose framework-y library for sort-of declarative transformations. I suspect that, because most transformation code exists to achieve interoperability between systems, the problems at hand involve impedance mismatches which are inherently yucky problems.

Anyone seen any good schemes?


> Anyone seen any good schemes?

I've been meaning to look at HXT, I've read it had something like that, but I have not needed to transform XML in a long time so it's fallen by the wayside. On the other hand, TFAA describes himself as a "Haskell programmer" and does not use HXT so maybe it's not that good.

An alternative I've thought about (but not implemented on grounds of having absolutely no need for it these days, as noted above) is implementing what I consider the good part of XSLT (tree transformation via template matching through XPath selectors) in Python on top of lxml. Something akin to Flask, where the app would be a group of templates, and the routing would be a sequence of XPath assertions (instead of HTTP paths + methods). Along with a few helper functions or methods (to easily recurse into the rest of the tree), this ought to materialize most of XSLT's strengths in a general-purpose language (making extensibility trivial), and template groups would improve modularity significantly.
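To make the idea concrete, here is a minimal sketch of such a dispatcher. It is an illustration under stated assumptions, not an existing library: it uses the standard library's ElementTree (which supports only a subset of XPath) instead of lxml, and `TemplateGroup`, its `template` decorator, the `recurse` helper and the `para`/`emph` element names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


class TemplateGroup:
    """A group of templates, each selected by an XPath-like pattern."""

    def __init__(self):
        self._rules = []  # (pattern, handler) pairs in registration order

    def template(self, pattern):
        """Decorator: register a handler for the nodes `pattern` selects."""
        def register(handler):
            self._rules.append((pattern, handler))
            return handler
        return register

    def transform(self, root):
        # Resolve each pattern against the document once; matching a node
        # afterwards is then just an identity lookup.
        matched = [({id(n) for n in root.findall(p)}, h) for p, h in self._rules]
        return self._apply(root, matched)

    def _apply(self, node, matched):
        # Templates registered later take priority over earlier ones.
        for ids, handler in reversed(matched):
            if id(node) in ids:
                return handler(node, lambda n: self._children(n, matched))
        # No template matched: fall through to the children.
        return self._children(node, matched)

    def _children(self, node, matched):
        # Escaped text interleaved with the transformed child elements.
        parts = [escape(node.text or "")]
        for child in node:
            parts.append(self._apply(child, matched))
            parts.append(escape(child.tail or ""))
        return "".join(parts)


sheet = TemplateGroup()


@sheet.template(".//para")
def para(node, recurse):
    return "<p>" + recurse(node) + "</p>"


@sheet.template(".//emph")
def emph(node, recurse):
    return "<em>" + recurse(node) + "</em>"


@sheet.template(".//para[@role='note']")
def note(node, recurse):
    return '<p class="note">' + recurse(node) + "</p>"
```

Calling `sheet.transform(ET.fromstring(source))` walks the tree and lets each handler decide where to recurse. The "last registered wins" priority is a simplification chosen for this sketch; XSLT's own conflict resolution rules are more elaborate.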


If you ever want to push that forward, hit my email (in profile) and I'll try to help.

But I have to wonder if Python is the right tool for the problem. I get the sense that the XSLT transform engines are deployed to handle really big documents, and I wonder if a Python based tool could compete on speed with xalan or saxon.


Most of my work has been in Java, so I can't speak authoritatively outside of its ecosystem. When transforming XML, I've found I almost always need to write tree-walking/visitor-ish code and end up using a tool like XMLBeans to provide a more literate interface to the source and target documents. When I say "literate", I mean that, instead of writing sourceDocument.getElement("foo").getElement("bar"), I can just say sourceDocument.getFoo().getBar(). XML schemas certainly are helpful in that they enable a bunch of tools like XMLBeans to generate language-friendly abstractions. Often, I've written my own schemas for sources or targets which only had implicitly-specified schemas (via documentation, examples, or simply observation of actual messages).
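The difference between the two access styles can be illustrated with a small Python sketch. Note the swap in technique: XMLBeans generates typed classes from a schema, whereas this hypothetical `Node` wrapper resolves child names dynamically at run time, so it shows the calling style only and gives none of the schema-backed guarantees.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical wrapper exposing child elements as attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so `_element`
        # and `text` are never routed through here.
        child = self._element.find(name)
        if child is None:
            raise AttributeError(f"<{self._element.tag}> has no <{name}> child")
        return Node(child)

    @property
    def text(self):
        return (self._element.text or "").strip()


source_document = Node(ET.fromstring("<doc><foo><bar>42</bar></foo></doc>"))
value = source_document.foo.bar.text  # instead of find("foo").find("bar").text
```

A misspelled element name fails here only when the line runs, which is exactly the kind of error that schema-generated classes would catch earlier.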

The reason I prefer to work in non-declarative code land is that I usually must inject many service references into the translators. When converting an industry standard XML format into a company's internal domain model for quoting insurance policies, I had to employ a set of heuristics to create a valid policy from a set of coverage requirements which likely were ill-specified. For example, we didn't offer a $750 auto deductible. Should this be converted (with a note attached) to $500 or $1000? This decision varied by state, policy type, etc. We had a metamodel which I injected into the transformers at the points where such decisions were made.

That the source and/or target of a transformation are XML is a red herring, though. Most of my time is not spent on XML-ness itself but on solving fundamental impedance mismatch issues when converting between two different domain models, sets of assumptions, etc. Document formats don't matter for these problems although those formats with better surrounding toolsets certainly allow one to concentrate immediately on the part of the problem which is the hardest. I even prefer talking about these problems using terms like "model conversion" instead of "document translation" -- too many marketing folks have convinced IT managers that, through the magic of their overpriced ETL tools, "document conversion" problems are a trivial drag & drop matter.

One way I've thought of architecting these model transformations is through the invention of a few intermediate model definitions, each one becoming less source-like and more target-like. I think some stages of conversion are more compatible with declarative approaches. Perhaps attempting conversion in only a single pass has led me to throw the baby out with the bathwater w.r.t. declarative schemes?

For those of you who don't do corporate IT development, the sad reality is that a huge percentage of development effort is spent on data conversion/translation between systems. The ratio of glue:substance is highly skewed toward glue. Furthermore, the ratio gets worse as short-term benefits are prioritized over long-term ones, development is silo-ed between business groups, and data modeling takes a backseat to gettin' stuff "done".


You could call it "xslt the good parts".


Pretty much.


Some time ago, I thought XSLT was quite interesting and potentially powerful with regards to its browser integration. XSLT, to me, represented the first attempt at truly separating data from layout. Even to this day, with all the 'div' based layouts and such, nothing completely and totally separates the data and view tiers in a web layout quite the way that XML with an XSLT stylesheet did.

Early browser bugs probably caused more harm than good for XSLT adoption. XSLT was available in browsers almost from the beginning, more powerful (at the time) than any other client side (javascript) solution. Too bad it didn't get off the ground some more, it might have actually changed the way we work today.

Yes, it's too verbose, and probably doesn't deserve the title of "programming language." But, it's a pretty powerful idea if kept in the scope for what it was invented; transforming one schema to another, with options for transformations occurring on the client or server side.


Back 8 or 9 years ago I architected and worked on a team to write a very large and feature-rich application that depended on XSLT for a ton of UI-deriving functionality.

I found it extremely powerful.

I have never used it again, because the power it afforded me was not worth the heartache.


Just prior to .NET I worked on a classic ASP project that did this - the view models were basically SQL queries with 'FOR XML AUTO' and the 'views' were XSLT stylesheets.

For getting simple data up on a web page it was fantastic. But for more complex stuff it got wickedly complicated to maintain at an exponential rate. XSLT just didn't scale (and/or we just weren't good enough at it to make it scale).


I still have to work on a classic ASP thing that does this. It's taken 5 years to get 50% of it over to .Net.

PAIN. That's what it is. Just PAIN.


The spec of XSLT2 was actually going to work this angle and move XSLT more into a space of a general purpose language. However, lots of people in the XSLT community realized that XSLT is made to do one thing and that's XML transformation. XSLT(1) did that pretty well to a certain extent. The spec for XSLT2 never really took off and I don't think there are notable implementations and support out there other than from the spec's lead author (Michael Kay) and his tool Saxon.


I started using XSLT about a year ago and had exactly the same initial reaction - it looks ugly and verbose, it's difficult to debug, and badly written code is almost impossible to understand.

A (difficult!) year of daily use behind me, I've found that all the above problems disappear. People come to XSLT expecting a procedural DSL for modifying an XML file which they can pick up in a day or two, and very quickly their code spirals out of control as they try to twist XSLT into what they expect. It very clearly isn't a procedural language, which is why I think it gets labelled as functional, because it doesn't accurately fit either model.

Once you develop good code style and know the pitfalls to avoid, it's perfectly pleasant to use. It's also significantly easier to get along with if you use a full blown XSLT IDE like oXygen, which allows you to breakpoint, see the call stack, and trace back which template generated a specific part of the output.


It is important that a developer is able to capitalize on each programming language. XSLT is not a language for writing a kernel; it is only for making transformations between XML files (more or less). If the writer of the "Open Letter to XSLT Fans" is frustrated by his current job, maybe he has to find another job. It is the equivalent of an ANSI C programmer criticizing the language because he spends all day fixing pointer-related bugs. PLEASE, BE COHERENT!!!


Without going into the merits or otherwise of XSLT, there is another observation...

For some reason, even on teams with good engineering practice, all discipline goes out of the window when people write XSLT. So a dev who might write beautiful maintainable code in any other language suddenly reverts to the worst spaghetti style on XSLT with 1000 line functions, impenetrable naming, nary a comment to be seen.

I know. I've been that dev.


Is that because most people can "see" the solution in most programming languages, but most people can't see solutions in XSLT except by experimenting?


There may be some of that, especially at the beginning, but with a little experience that phase seems to pass. It helps to be working in an environment with a good XSLT debugger of course...

I think the fundamental issue may be more to do with a flawed distinction between real "code" and just a "stylesheet / data transform".


Dear Michael, please don't insist on calling programming languages bastard children.

Sincerely, someone who understood that programming languages are tools.


I used to use XSLT for a lot of things. I switched off it because it's just too damn slow and can be a total brain melt to do complicated stuff in that's fairly easy in just about any other programming language. The few things I liked about XSLT can be done fairly easily with Scala's pattern matching.


I've used XSLT in the past, mainly to do client side transformations: send the browser XML with an XML stylesheet tag and have the browser use the XSL to do the transformation.

At the time the only browser that didn't support it correctly was IE, but after some small workarounds it worked pretty well. The downsides were that inserting ad networks didn't work, because with XML there is no document to .write() to, and that you had to host a valid DTD and your document would get verified against it each and every time.

The other big issue was that you have to write a lot of boilerplate code to get it to do what you want it to do, with a lot of recursion and the like.


I remember when I found XSLT. I fell in love with it... until I started using it in real world situations. Unless I wanted to pass page content with paragraph or line break tags already added I had to use a 50+ line recursive nightmare.
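For context, XPath 1.0 has no string replace function, so turning newlines into line break elements in XSLT 1.0 generally means a recursive template that peels off one line at a time. The sketch below mirrors that recursion in Python with ElementTree; it is an assumption about the general shape of the task, not the commenter's actual template, and `append_with_breaks` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def append_with_breaks(parent, text):
    """Append `text` to `parent`, turning each newline into a <br/> element."""
    head, newline, rest = text.partition("\n")
    # In ElementTree, text lives either directly in the parent or in the
    # `tail` of the last child, so pick the right slot for this chunk.
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + head
    else:
        parent.text = (parent.text or "") + head
    if newline:
        ET.SubElement(parent, "br")
        append_with_breaks(parent, rest)  # recurse on the remainder
    return parent


paragraph = append_with_breaks(ET.Element("p"), "line one\nline two\nline three")
```

The recursion is kept only to echo the XSLT 1.0 style; for long inputs a plain loop over `text.split("\n")` would avoid Python's recursion limit.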

I do not miss it.


Many higher-education open source projects use a Java / XSLT stack (e.g. JA-SIG uPortal, Sakai) and I agree that XSL can be quite tedious and verbose. It's definitely the type of language that, if you work with it just "once in a while", is very frustrating. If an XSL file is not broken down into manageable templates it becomes very hard to follow. If you're not so lucky and have to deal with extremely large XSL files, it's easier to just give in and purchase one of the available XML/XSL IDEs. There are a few good ones and they do save a lot of time. No need to struggle through it with a plain text editor.


i don't get the anger or argument. xslt is a templating language, nothing more, nothing less.

it is awesome as a cross platform solution for transforming xml into some other stuff. for anything else - why on earth would you use it?!

i really enjoy playing with it, generating complex html out of xml. things like grouping with the muenchian method are the definition of nerdy fun for me. to this day i have no idea what went on in steve muench's head to come up with that small snippet of code. i just love him to death for it. i pretty much built a career on top of it, as a lot of other coders never grokked it.
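For readers who have not met it: the Muenchian method declares an `xsl:key` that indexes nodes by a grouping value, then selects only the nodes that are the first entry for their key (typically by comparing `generate-id()` results) and iterates over each group through the key. A rough Python analogue of that logic, with a hypothetical function name and no claim to match any particular stylesheet, might look like this:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def muenchian_groups(root, match, use):
    """Yield (key, members) pairs, grouping `match` elements by attribute `use`."""
    # Analogue of <xsl:key>: index every matching node by its key value.
    elements = root.findall(match)
    index = defaultdict(list)
    for element in elements:
        index[element.get(use)].append(element)
    # Analogue of the generate-id() comparison: keep only nodes that are the
    # first entry for their key, then pull the whole group out of the index.
    for element in elements:
        key = element.get(use)
        if index[key][0] is element:
            yield key, index[key]
```

In a general-purpose language the identity test is redundant (iterating over the index would do), which is part of why the trick looks so clever in XSLT 1.0: there the key lookup and the first-node test are the only tools available.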


It's the only programming language I know that is completely useless (since there are better choices) and totally annoying. I see I'm not alone. :D


I feel that "completely useless" is a bit harsh. For one thing, it is damn near ubiquitous (when one has a system that is already trafficking in XML).


I think I can write code that is much easier to create and much more readable in any flexible language with nice XML support, like Perl or PHP. I think it'd be more powerful too, since as far as I remember XSL lacks some stuff.


I'm supposed to not call XSLT functional because some dude just called me a bastard for being an XSLT fan? Yeah, screw you too, buddy. I hope your eye keeps twitching.



