When XML Beats JSON: UI Layouts (instawork.com)
261 points by marcobambini on Oct 30, 2019 | 266 comments



At this point I am convinced that the decision to define UI layouts in XML instead of code has been a terrible mistake.

- For starters, it's overly verbose. If you've worked with stuff like XAML in WPF or XML layouts on Android, you know how quickly those files tend to get bloated.

- It does not lend itself well to reuse compared to, well, actual code. Which means there tends to be a lot of repetition. Which leads to more bloat.

- Editing XML kind of sucks. You're more at risk of making silly typos that won't be caught at compile time.

- XML namespaces make an already verbose language even more verbose, not to mention just generally confusing.

- At some point you will need to access your user interface from your application code. Which means you end up with silly stuff like findViewById() to bridge the gap, adding even more boilerplate to your code.

- You essentially have to learn two different sets of APIs which do the same thing. <Button title="blah"/> vs button.setTitle("blah"). Why?

The only real strengths of XML are that 1) it's declarative, 2) it's hierarchical, and 3) it's diffable. But code can be too, without all of these other drawbacks.

Fortunately it seems like the industry is starting to figure this out with things like SwiftUI and Jetpack Compose. I really think we're going to look back in a few years and think it was ridiculous that we used to do this stuff in XML.


I don't agree very much with your take.

- User interfaces are very prone to have big files, because there tend to be a lot of different components. I don't get how XML will add more bloat, other than having the <> tags for defining elements.

- XAML lets you define custom controls. I'm not familiar with Android XML, but in any case it's not an issue of the language but of the framework.

- Again, a fault of the framework. Visual Studio, for example, checks your XAML for errors, both when writing and when compiling.

- I don't think namespaces are confusing (most programming languages have some concept of namespaces), but I agree that one could have more facilities to avoid specifying namespaces everywhere.

- Yes, but most of the time you shouldn't access the user interface from the application code, as coupling them tends to create ugly code that is prone to mistakes and failures. For me, this is an actual advantage of XML for UI, it makes it more difficult to create an interface out of spaghetti code.

- Well, one is declarative and the other one imperative. As long as you want to have both types, it doesn't matter what language you use.

For me, there's not that much of a difference between SwiftUI/Jetpack Compose and equivalent XML code. UI definitions are going to be messy no matter what.


> - User interfaces are very prone to have big files, because there tend to be a lot of different components. I don't get how XML will add more bloat, other than having the <> tags for defining elements.

I disagree that this is an inherent property of user interfaces. In fact I think it's a bit weird that we focus so much on abstractions when it comes to the code we write (DRY principle, etc), but then when it comes to XML we wave our hands in the air and declare "well that's just the way it has to be".

> - XAML lets you define custom controls. I'm not familiar with Android XML, but in any case it's not an issue of the language but of the framework.

Custom controls allow you to reuse entire widgets, but what about code reuse?

> - Again, a fault of the framework. Visual Studio, for example, checks your XAML for errors, both when writing and when compiling.

No, ultimately it's the fault of the fact that you're trying to shoehorn a bunch of stuff into XML attributes (as strings) that weren't strings to begin with. As I said below, ever make a typo in a binding?

> - I don't think namespaces are confusing (most programming languages have some concept of namespaces), but I agree that one could have more facilities to avoid specifying namespaces everywhere.

This is admittedly worse on Android where almost everything needs to be prefixed with android:, except stuff that comes from the support library, plus these other handfuls of exceptions (which is definitely a framework problem), but even stuff like x:Name vs Name is unnecessarily confusing in WPF.

> - Yes, but most of the time you shouldn't access the user interface from the application code, as coupling them tends to create ugly code that is prone to mistakes and failures. For me, this is an actual advantage of XML for UI, it makes it more difficult to create an interface out of spaghetti code.

That sounds good in theory. In practice, you end up with a lot of code-like constructs in your user interface. Every XAML file I've ever seen is full of bindings, converters and converter parameters, triggers, etc. You might insist this is not code because it lives in an XML file but it sure looks a lot like code to me, just shoehorned into a document language format.

In other words, while we strive hard to avoid spaghetti code, it seems like we don't apply the same principles to avoid spaghetti XML.

> - Well, one is declarative and the other one imperative. As long as you want to have both types, it doesn't matter what language you use.

False dichotomy, IMO. You can build a declarative UI in your code without resorting to XML.

> For me, there's not that much of a difference between SwiftUI/Jetpack Compose and equivalent XML code. UI definitions are going to be messy no matter what.

Have you actually worked with either of these frameworks?


Namespace prefixes in XML can be a) shortened and b) removed by setting the default namespace. I.e. the following are equivalent:

    <foo:bar xmlns:foo="urn:foo-corp">
      <foo:baz />
    </foo:bar>
    <f:bar xmlns:f="urn:foo-corp">
      <f:baz />
    </f:bar>
    <bar xmlns="urn:foo-corp">
      <baz />
    </bar>
These forms can happily co-exist in the same document, e.g. you can start at the top with one default namespace and then switch it in some branch into another.
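
To check the equivalence claim above, here is a minimal sketch using Python's standard `xml.etree.ElementTree` (my choice of tool, not something the comment relies on). It parses the three forms and compares the namespace-qualified tag names the parser reports.

```python
import xml.etree.ElementTree as ET

# The three spellings from the comment above, collapsed onto one line each.
DOCS = [
    '<foo:bar xmlns:foo="urn:foo-corp"><foo:baz /></foo:bar>',
    '<f:bar xmlns:f="urn:foo-corp"><f:baz /></f:bar>',
    '<bar xmlns="urn:foo-corp"><baz /></bar>',
]

def qualified_tags(xml_text):
    """Return the namespace-qualified tag of every element, in document order."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]

tags_per_doc = [qualified_tags(doc) for doc in DOCS]
for tags in tags_per_doc:
    # Each document yields the same list: the prefix is only a local alias.
    print(tags)
```

The parser drops the prefix and keeps only the namespace URI plus the local name, which is why the three documents compare equal.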


My suspicion is that XML and friends get chosen for compelling social reasons.

It's not a language that anyone would want to do the actual software implementation in. That's important because, if you pick an existing language, then everyone who plans to use some other language is going to see that choice and assume it means that their language is either unsupported or will be a second-class citizen on that UI toolkit. They might not be wrong.

Creating a whole new language that's custom-tailored for UI layout avoids that problem, but will also be off-putting, since people will perceive that they have to learn a whole new language just to use your UI toolkit.

And just slapping it into a mess of library calls that people can call from any language is perhaps the worst choice of all. GUI toolkits, for whatever reason, always end up being object-oriented. Meaning that you've got to export an object-oriented API. Meaning you've got to either implement it in an object-oriented language that probably has a deeply incompatible and possibly not even standardized ABI such as C++, or you've got to go to great lengths to Greenspun an object-oriented system into a procedural language that can export a C-style ABI. Neither of those is fun.

In summary. . . yeah, XML is an awful choice. It's easily the worst option, aside from all the other options, which are even worse.


"The only real strengths of XML are that 1) it's declarative, 2) it's hierarchical, and 3) it's diffable. But code can be too, without all of these other drawbacks."

Code generally isn't declarative or hierarchical unless your language is those things as well. A declaratively defined interface in C# would end up being something like a bunch of strings or enums passed to some object constructors. At that point I'd rather have an XML implementation that can be verified and processed with tools.

"- For starters, it's overly verbose. If you've worked with stuff like XAML in WPF or XML layouts on Android, you know how quickly those files tend to get bloated."

I frequently write WPF code these days and don't find it any more verbose than attempting the same thing in C# statements. If anything, it's less verbose and it has the added benefit that it's all organized in the same group of files.

"At some point you will need to access your user interface from your application code. Which means you end up with silly stuff like findViewById() to bridge the gap, adding even more boilerplate to your code."

I don't know about Android, but in WPF, you just add a name attribute to your ui element and reference it by name.

"Editing XML kind of sucks. You're more at risk of making silly typos that won't be caught at compile time."

Hasn't really been my experience. Syntax errors are caught by the editor typically and further errors do get caught at compile time for XAML at least. Some errors don't get caught, but these are the same errors that would get missed if I attempted the same thing in C# code.


> Code generally isn't declarative or hierarchical unless your language is those things as well. A declaratively defined interface in C# would end up being some like a bunch of strings or enums passed to some object constructors. At that point I'd rather a an XML implementation that can be verified and processed with tools.

I should clarify and say code can be declarative with the correct language support. I agree that it's probably not well suited to C# in its current state.

> I frequently write WPF code these days and don't find it any more verbose than attempting the same thing in C# statements. If anything, it's less verbose and it has the added benefit that it's all organized in the same group of files.

Here are two examples I would consider overly verbose in XAML:

1. Triggers, DataTriggers, EventTriggers, MultiTriggers, etc. Here we're dealing with the fact that XML is not a programming language and lacks basic control structures, so we've gone ahead and defined our own mini-programming language just so conditional values can be expressed in XML. Why?

2. Anything having to do with overriding ControlTemplates.

> Hasn't really been my experience. Syntax errors are caught by the editor typically and further errors do get caught at compile time for XAML at least. Some errors don't get caught, but these are the same errors that would get missed if I attempted the same thing in C# code.

Ever make a typo in a binding?

I agree that Visual Studio is really good at helping you out where it can, but at the end of the day we're still trying to shoehorn programming constructs (objects, properties, enums, etc) into a document language format by shoving everything into XML attributes (where everything is a string) and there is a fundamental impedance mismatch between the two. Good tooling only goes so far.


"I should clarify and say code can be declarative with the correct language support. I agree that it's probably not well suited to C# in its current state."

Ok, but which language do you think would work? Lisp? I like the things you can do with macros and s-expressions as much as the next guy, but you're still left with the question of what should people using C# (which has neither of those things) use to make GUIs.

"1. Triggers, DataTriggers, EventTriggers, MultiTriggers, etc. Here we're dealing with the fact that XML is not a programming language and lacks basic control structures, so we've gone ahead and defined our own mini-programming language just so conditional values can be expressed in XML. Why?"

Correct, all the different sorts of trigger are verbose. But, to answer your question as to 'why?', the reason is that we need a deterministic way to express conditional formatting. If we attempted to accomplish the same tasks with hooks we'd have the problem of intermixing imperative, stateful code with logic that is supposed to be more deterministic. I've written code like that for WinForms and it's ugly and hard to debug.

If you were to try to implement triggers with code behind C#, it would still be very verbose because you wouldn't be writing actual C# logic. You'd be doing something like declaring an object with trigger conditions and results and it would be at least as verbose as the XAML triggers, that is, if you wanted to accomplish the same thing with the same guarantees as with XAML triggers. The problem here isn't really the XML, but the fact that there isn't really an available language that solves the relevant problem. It's not like you could use an `if` statement.

"Ever make a typo in a binding?"

I have. I've also made typos when writing viewmodel objects that implement the `INotifyPropertyChanged` interface to similar effect.


You're clearly coming from a very C# centric viewpoint and reacting to my statements from a perspective of: "well, if I don't do it in XML then I'd have to do it in C# and that would be even worse". Which I don't necessarily disagree with. But just because doing it in C# would be worse doesn't make doing it in XML good either.

My underlying point is with platforms like iOS and Android adopting declarative UIs, combined with languages like Swift and Kotlin evolving in parallel to better support them, I think we will eventually be able to move on from being stuck with XML.


It's funny, WinForms was just code. The designer generated the code, or you could do it by hand.

You could read it and understand it because it was the same language you used. If there was a problem you could debug it.

I never felt more productive than when my UI declaration was just code.

Inevitably you need to mix some logic into the layout language and then debug it (for-each rendering, binding syntax, etc.). Now you have a weird programming language that is hard to debug and whose syntax is hard to remember. Or you could just use a normal programming language, maybe one that supports some declarative constructs.

SwiftUI gives me hope, would like to see something similar show up for the .Net World.


The IDE wrote code and kinda hid it from you in a partial class. And every once in awhile, one would have to get in there and muck around with the generated code. This had the potential to break things pretty badly.


Normally didn't do much in the partial class other than reference it.

You could override what it was doing in the main class file pretty easily if needed, though. Again, if it did break, it was just code and threw a normal exception in the designer. That usually revolved around something that didn't work right in DesignMode, so you had to check for that explicitly.


If you use F#, Fabulous[1] for mobile apps and Fable+Elmish[2][3] for web exist. I use Giraffe's[4] view engine for server-side HTML templating. And it looks like AvaloniaUI is going to have a similar style UI in FuncUI[5], but I'm not sure how active that is.

I'm not sure about any C#-focused efforts.

1: https://github.com/fsprojects/Fabulous

2: https://github.com/fable-compiler/Fable (transpiles F# to JS)

3: https://github.com/elmish/elmish

4: https://github.com/giraffe-fsharp/Giraffe

5: https://github.com/JaggerJo/Avalonia.FuncUI


"WinForms was just code" This conflates source text and product. Generated code was _output_ by the WinForms WYSIWYG editor, but the tool stored the layout source in a native project file format.


You could also write forms by hand, the WYSIWYG editor just visually represented the code and modified it using a bunch of metadata (again just .Net attributes) to help the designer.

The GUI designer actually just ran your code to display the form, which mostly worked. Sometimes it could break if your form relied on runtime dependencies, and you could code around that by specifically checking for DesignMode == true in your code.

I am not saying this was perfect but it had a lot of advantages. Now you see SwiftUI and the like going back toward that model.


I think qt widgets (not-qml, it's different and I haven't used it) uses XML without many of those problems.

Layouts are trees of widgets. Layouts are themselves widgets so repetition in the tree can be reduced.

You use XML, C++, or both to define a widget. The UI Compiler will turn that XML into C++.

That means further code reuse can leverage the C++ type system. All widgets are QWidgets, and MyProgressBar inherits from QScrollBar.

And you (or, at least I) never write .ui files by hand, they've got a GUI designer tool. And the tool supports your custom widgets by treating them the same as the known widget type it inherits from.

Accessing XML-defined widgets from code is the same as C++ defined widgets, because the compilation step means they are. Class member names come straight from the XML file.

The QObjects are constructed in a way that understands the UI hierarchy, and you can programmatically walk that hierarchy using the variable names. (You do have to tell objects their own names in the non-XML-generated code.)


Qt is really good for doing UI, in my opinion, both QML and non-QML. They do a really good job at keeping things coherent and concise, which really helps first-time users, especially after learning the basics.


After using Flex for a couple of years I have to agree. There always seemed to be two ways to do things: 1) Through the XML 2) Through the code. That may seem appealing initially but gets really ugly when certain developers do certain things with XML and others code. It basically doubles the API space and syntax.


I fully agree with this statement. My work is generally for the web, and HTML has the same issues. It's amazing how much removing a single tag in a nested tree can mess up VSCode's parsing engine. Just modifying the tags usually works on both open and close though.


One point I try to get across to people working on the web is that they should think of HTML as a render target, not as a layout tool.

The point of HTML isn't to make it easy to lay out a document, it's to have an easily understandable pure-text representation of hierarchical data that can be read by computers/users. Light, well-formatted, semantic HTML isn't an engineering concern, it's a UX concern.

Of course HTML can be used as a layout tool, but that's not its primary purpose, otherwise it would include decent 2-way data bindings for tables.


One more interesting example of XML based GUI system that avoids most of the problems listed above is HaxeUI. The same way as Qt it solves them using compile time code generation.

Accessing objects from code is not a problem due to compile time code generation, no findById just access the field. Due to the powerful macro system in Haxe you even get in code autocomplete for objects defined in xml before running compilation.

For code reuse it allows easy way of defining custom widget classes based on xml file and using custom widgets in xml file.

In my experience with Qt bloat usually comes from graphical editors that add unnecessary properties you wouldn't have set in code or GUI systems that don't support creating reusable subcomponents.


I agree. I think the argument isn't json/xml/yaml/crazy, it's: why isn't this just code?


Because sometimes you want to pass data between blobs of code, (possibly separated by time, environment or language), or even to and from humans who may not understand code.


of course, but for a ui?


> For starters, it's overly verbose. If you've worked with stuff like XAML in WPF or XML layouts on Android, you know how quickly those files tend to get bloated.

I've done a bit of WPF and a fair amount of Qt Widgets, and never once did the actual XML content matter since it was mostly putting the control at a place that looks good in a UI designer.

> Which leads to more bloat.

I'd like to know your definition of "bloat". It's not like 100kb or 500kb of XML are going to translate into 100kb or 500kb executables or runtime memory use, since the XML is generally only used at compile time before translation into code.

> Editing XML kind of sucks. You're more at risk of making silly typos that won't be caught at compile time.

well, Qt's main IDE marks the UI xml files as read-only so not much chance of that on that side.

> At some point you will need to access your user interface from your application code. Which means you end up with silly stuff like findViewById() to bridge the gap, adding even more boilerplate to your code.

... no, you will just refer directly to a variable that was generated in a precompilation pass - you can inherit from or compose the generated object depending on if you prefer to type `ui->myLabel.setText` or `this->myLabel.setText`.

> You essentially have to learn two different sets of APIs which do the same thing. <Button title="blah"/> vs button.setTitle("blah"). Why?

You don't if you just use the UI designer.


Chiming in with my prior experience: My problem with Qt's approach is that their editor is atrocious despite their XML schema for UI layout being excellent.

Qt's XML schema makes sense and very easily translates to the C++ it generates. If one wishes they can write the XML themselves using just their knowledge of the Qt Property System, the Layout System, and the very basics of how the XML is laid out. If they want to just use a designer tool, that's fine too; however, Qt's UI designer makes life very difficult from the perspective of version control and actual usability.

The UI designer doesn't respect any canonical form of the UI files. This means that each time you save the file it more or less jumbles the entire thing up based on however it decided to hold it in memory. This turns tiny changes (what should be maybe 2-3 lines of XML) into massive ones (100s to 1000s of lines of XML shuffled just enough to shake off any semblance of contiguous history in the VCS). Likewise, it adds arbitrary property data to items that 95/100 times are unintended (setting default geometry sizes based on the sizes of the UI elements in the demo window) or are simply meaningless (storing the default value as a user selected property despite the user requesting to use the default).

It doesn't help that the UI of the designer is finicky as hell (layouts) and adding custom UI components is a hassle via the editor while it is exceedingly easy in the XML.

It's a bunch of small things like that that have resulted in teams I've worked on in the past refusing or at least strongly discouraging the committing of changes made by UI designer into VCS. It's really a shame too since a good UI editor is a boon for development and Qt has a really nice UI library.


I'm writing Flutter apps with Dart, and not having to use XML (or HTML+CSS) for UI layout has been a breath of fresh air. It's just code. I really like it.


- The verbosity is easily addressed by using an IDE that provides smart folding, highlighting, etc...

- Editing XML is actually a lot easier than JSON. Thanks to XML schemas, completion works really well in XML and very poorly (if at all) in JSON.

- Also, XML allows comments, which JSON doesn't.

- You don't need `findViewById()` in 2019 and haven't in a while, really: there are plenty of alternatives available (synthetic accessors, Kotlin, ButterKnife, etc...).


I'm definitely not advocating for JSON here, that would be even worse.


I don't think the person you're replying to is arguing XML vs JSON as much as UI in markup vs UI in native code.


I want to address that reuse/repetition thing. It's totally possible to do this in XML+XSLT. E.g. you have some repeating structure that totally calls for what would be a helper function in a conventional language, say, "createMyButton(this, that)". In XML that could be done like this:

    <android:whatever xmlns:ext="urn:my-corp/ui-ext">
       <ext:my-button this="..." that="..." />
    </android:whatever>
This gets fed into an XSLT file that copies everything intact and expands your custom elements:

    <xsl:template match="ext:my-button">
      <!-- expand into what you need -->
    </xsl:template>
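
The same "copy everything, expand the custom element" idea can be sketched without XSLT. The example below uses Python's `xml.etree.ElementTree` instead, since the standard library has no XSLT processor. The `<layout>` root, the `Button` output element, and its `text`/`style` attributes are hypothetical stand-ins chosen for illustration; only the `ext:my-button` element and its namespace come from the comment above.

```python
import xml.etree.ElementTree as ET

EXT_NS = "urn:my-corp/ui-ext"  # namespace URI from the comment above

# Hypothetical input: a plain <layout> root stands in for the real container.
SOURCE = """
<layout xmlns:ext="urn:my-corp/ui-ext">
  <ext:my-button this="ok" that="primary" />
  <ext:my-button this="cancel" that="secondary" />
</layout>
"""

def expand_my_button(node):
    """Build the (hypothetical) expanded form of one ext:my-button element."""
    button = ET.Element("Button")
    button.set("text", node.get("this"))
    button.set("style", node.get("that"))
    return button

def expand(parent):
    """Leave the tree intact except for ext:my-button, which is replaced in place."""
    for index, child in enumerate(list(parent)):
        if child.tag == "{%s}my-button" % EXT_NS:
            expanded = expand_my_button(child)
            expanded.tail = child.tail  # keep the original whitespace layout
            parent[index] = expanded
        else:
            expand(child)

root = ET.fromstring(SOURCE)
expand(root)
expanded_xml = ET.tostring(root, encoding="unicode")
print(expanded_xml)
```

`expand_my_button` plays the role of the helper function: the repeated structure is written once and every use site shrinks to a single element.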


Ahhhh, the never-ending "should UI live in the code, or should it live outside the code" debate. There is definitely a cohort of people who find React and code-based UI patterns better. There is also a cohort of people who prefer to keep the two separate. I tend to side with the former architecture, just from years of experience building software where keeping them separate actually doesn't win you anything. Yet I've met engineers who swear by XML/Interface Builder/VB style development. Hey, as long as you can build your software and get the job done and the tools are helping you get the job done, who am I to tell you what pattern to build with?!


UI frameworks that expose a findViewById function could be considered anti-patterns in the first place. For a declarative view the dependency should go the other way around, with the view reactively accessing the viewmodel, instead of (accidentally) embedding UI boilerplate in your business logic. To react to UI changes in your logic you can add listeners to changes made on the viewmodel. See jQuery vs react/angular/vue for the best example of the difference.
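
A minimal sketch of that dependency direction, in plain Python rather than any real UI framework. The `ViewModel` class, its `subscribe`/`set` methods, and the `view` function are all hypothetical names invented for this illustration.

```python
class ViewModel:
    """Holds state and notifies listeners whenever a property changes."""

    def __init__(self, **state):
        self._state = dict(state)
        self._listeners = []

    def subscribe(self, listener):
        """Register a callable taking (property_name, new_value)."""
        self._listeners.append(listener)

    def get(self, name):
        return self._state[name]

    def set(self, name, value):
        # Only notify when the value actually changes.
        if self._state.get(name) != value:
            self._state[name] = value
            for listener in self._listeners:
                listener(name, value)

rendered = []

def view(name, value):
    """Stand-in for a view: it re-renders when the view model changes."""
    rendered.append("%s=%r" % (name, value))

model = ViewModel(title="blah", enabled=True)
model.subscribe(view)
model.set("title", "Save")
model.set("enabled", True)   # unchanged, so no notification
model.set("enabled", False)
print(rendered)
```

The business logic only ever touches `model`; it never looks up a widget, which is the point the comment makes against `findViewById`.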

Android Jetpack also tries to bridge the gap of this in android, even though here you manually have to write some glue code yourself to create this reactive layer.


None of the flaws you list apply to TSX, which essentially gives you the best of both worlds. I really wish SwiftUI had used JSX-style syntax, just because of how nice it looks.


Technically, tsx / jsx is just a macro to convert an xml-like syntax to plain javascript objects... Neither xml nor json.

Inline functions and classes don't exist in either, JSX doesn't have to deal with XML's namespaces, and you don't need to construct an entire document: fragments are defined in functions at as granular a level as you like.

I do agree personally that I prefer tsx over the other alternatives, just pointing out that it is a pretty nice step up over something like flex or xaml.


There are languages and languages here, and your mileage will actually vary by orders of magnitude.

As people already pointed, code isn't inherently declarative or non-verbose. Some of the most popular languages for creating GUIs (Java, JS, C#) aren't either.

> It does not lend itself well to reuse

There isn't any inherent feature for code that will enable the kind of reuse you need when declaring a GUI either. You'll get it naturally in Lisp, you may be able to hack something in Haskell, and you can create a hacky DSL in Python or Ruby, but if your GUI is in Java, C#, or JS you are out of luck.

> silly typos that won't be caught at compile time

That's a tooling issue. In Python, Ruby, or JS it is inherently more of an issue than in XML; in Lisp it will vary with the actual idiom.

> At some point you will need to access your user interface from your application code

> You essentially have to learn two different sets of APIs

Well, that's bad design. There's nothing (except well, a lot of work) stopping a library from making the entire interface manipulation by actions made over XML. Granted, that is much easier and more natural in a DSL or in declarative reflexive code, so you do have a point, but it can be fixed by a lot of polishing over the worse fundamentals.


Yes I agree that building declarative UIs in code does require language support and some languages are not well suited to it. But I think this is an area where we could see real improvement and we should want our tooling to evolve to better handle these problems.


Oh, I completely agree that UI declared by code has much more potential than a frozen approach like UI declared in XML. It's only that presently, it's not a clear win.


Sorry, but

    <ol>
      <li>...</li>
      ...
    </ol> 
is more verbose than

    new OrderedList(
      new List<ListItem> { 
        new ListItem(...), 
        ... 
      }
    );
??


Well really that could very easily become:

    ol(
      li(...),
      li(...)
    )
Depending on framework.

But honestly, does it really matter? Either choice isn't terrible at all.
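
For what the function-call style could look like end to end, here is a small Python sketch. The `Node` type and the `ol`, `li`, and `render` helpers are hypothetical and not taken from any particular framework.

```python
from dataclasses import dataclass
from html import escape

@dataclass
class Node:
    """One element in the tree: a tag name plus child nodes or text."""
    tag: str
    children: tuple

def ol(*children):
    return Node("ol", children)

def li(*children):
    return Node("li", children)

def render(node):
    """Serialize a Node tree to markup, escaping text content."""
    if isinstance(node, str):
        return escape(node)
    inner = "".join(render(child) for child in node.children)
    return "<%s>%s</%s>" % (node.tag, inner, node.tag)

# Because the tree is ordinary code, repetition becomes a loop.
steps = ["Parse", "Lay out", "Paint"]
markup = render(ol(*(li(step) for step in steps)))
print(markup)
```

The nesting of the calls mirrors the nesting of the markup, so the structure stays hierarchical while still allowing loops and helper functions.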


just use hyperapp then.


Anything that builds the UI at runtime is silly. Your UI isn't going to change at runtime (other than in a few predictable spots), so why would you waste processing time dynamically calculating it?


Not only that, XML actually diffs poorly -- line-by-line diffing of the sort git does makes it hard to see that a single attribute was changed in a long line.
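
A rough illustration of the line-diff problem, using Python's `difflib` as a stand-in for what a line-based tool reports. The `Button` element and its attributes are made up for the example, and the one-attribute-per-line layout is just one possible formatting convention, not something the comment proposes.

```python
import difflib

def changed_lines(before, after):
    """Return only the removed/added lines of a unified diff (headers skipped)."""
    diff = difflib.unified_diff(before, after, lineterm="", n=0)
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]

# Hypothetical element written on one long line.
one_line_before = ['<Button id="save" width="120" height="40" text="Save" enabled="true" />']
one_line_after = ['<Button id="save" width="120" height="40" text="Save" enabled="false" />']

# The same element with one attribute per line.
split_before = ['<Button', '    id="save"', '    width="120"', '    height="40"',
                '    text="Save"', '    enabled="true" />']
split_after = split_before[:-1] + ['    enabled="false" />']

one_line_changes = changed_lines(one_line_before, one_line_after)
split_changes = changed_lines(split_before, split_after)

print("\n".join(one_line_changes))  # the whole element is reported as changed
print("\n".join(split_changes))     # only the enabled attribute line is reported
```

In the one-line form the reader has to scan the entire element to find the difference, which is the complaint above.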


What UI framework using XML is there where you can end up making silly typos that won't be caught at compile time? Or basically any sort of typo? I would expect any decent XML UI framework is going to have some sort of validation schema defined which will be quite stringent as to what is allowed at a particular level. Not that I like XML Schema.


I'm only familiar with XAML+WPF+C#+VS.

One particular pet peeve of mine is the fact that it won't catch typos in data binding. If I have IsEnabled={Path IsEnabeld} in my XAML, the designer won't catch it. At runtime, the error is silently ignored. It just doesn't work, and no tooling, nothing other than triple checking every line of code will catch it.
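
The failure mode can be imitated in a few lines of Python. This is not how WPF resolves bindings; `resolve_binding` is a hypothetical helper that looks a property up by its string name and returns nothing when the name is wrong, to contrast with ordinary attribute access, which fails loudly.

```python
class ViewModel:
    def __init__(self):
        self.IsEnabled = True

def resolve_binding(source, path):
    """Look up a property by its string name. Unknown names yield None
    instead of raising, imitating the silent failure described above."""
    return getattr(source, path, None)

view_model = ViewModel()

good = resolve_binding(view_model, "IsEnabled")
typo = resolve_binding(view_model, "IsEnabeld")  # misspelled on purpose
print(good, typo)

# The same typo written as a real identifier is reported immediately.
try:
    view_model.IsEnabeld
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True
print(direct_access_failed)
```

In a statically typed language the second form would be rejected before the program runs at all; Python only reports it at runtime, but it still does not pass unnoticed.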

Also it's really bad about picking up dependencies cross package, although it might be a problem in my org's config.


ok thanks, I was just thinking of elements and attribute names as being important.


Primarily the two I originally mentioned, WPF and Android, both of which rely on configuration of controls through XML attributes, which are all strings and not necessarily checked for validity at compile time.


It also has the advantage over code that it can be modified with a visual interface designer.


There's no reason this can't be done with code as well. See: SwiftUI


System.Windows.Forms works in the same way (the UI is serialized to code). WPF was designed many years after that because it allows different disciplines to work together.

Designers would most likely be familiar with XML given their HTML background, but they wouldn't necessarily know C#.

If Swift UI expresses the full fidelity of the API in the designer (e.g. expressing custom states and animations, keyframes, etc. on a custom button), then my point doesn't apply. If the full range of designer responsibilities cannot be achieved without dropping into code, then Swift UI isn't a real counterexample.


> Designers would most likely be familiar with XML given their HTML background, but they wouldn't necessarily know C#.

I am really curious how often XAML has been used by a designer who didn't know C#. Most of the XAML I've seen (in non-trivial apps) has a lot of code-like constructs in it. If you don't know what's going on under the hood, how are you going to correctly declare your data bindings (for example)?


> Most of the XAML I've seen (in non-trivial apps) has a lot of code-like constructs in it.

Those code-like constructs are UI behavior (e.g. how to transition from checked to unchecked for a checkbox). While they do take learning, they are not strictly code.

Also, Expression Blend can interact with every part of XAML in full-fidelity. Designers need not worry about the code-like parts of XAML at all.

> If you don't know what's going on under the hood

I suppose the workflow is that the design team would log a ticket for the development team, who would expose said binding and add it to the UI as a placeholder. I'm not sure exactly how this is supposed to work, but Microsoft did have a project which combined Unity and WPF to solve this exact workflow (I forget the name), and they were using it internally.

---

Personally, I can see the benefit as I do work at a product company. Developers get bugs pretty often to do with "element X is 1 pixel off," usually from the design team (one designer was known to open screenshots in paint to measure distances). If you can empower the designers to just create real UIs instead of mockups, it would get done correctly in the first place. Vue SFCs come close to achieving this, but fall short in terms of state (you still need JS to add/remove CSS classes).

Either way, it isn't fair to compare SwiftUI to XAML as SwiftUI doesn't seem to attempt to solve this at all (even if XAML doesn't solve it perfectly). SwiftUI is closer to System.Windows.Forms.


I'm pretty sure Delphi and VB were famously successful in spite of the fact that they didn't use XML for UI layout.


You know, when XML looks like this, I don't mind it so much:

    <Employee name="Michael Scott" title="Regional Manager">
      <Employee name="Dwight Schrute" title="Ass. Regional Mgr" />
      <Employee name="Jim Halpert" title="Head of Sales">
        <Employee name="Andy Bernard" title="Sales Rep" />
        <Employee name="Phyllis Lapin" title="Sales Rep" />
      </Employee>
      <Employee name="Pam Beesly" title="Office Administrator" />
    </Employee>

The thing is, XML in the real world never looks like this. In the real world, XML has an avalanche of obscure namespaces, weird "<![CDATA" things, unreadable attributes, incomprehensible structure, and god knows what else.

The simplicity of JSON is an advantage: it prevents people from mucking it up with too much nonsense like this. Of course, you CAN make unreadable JSON as well, XML just makes it so much easier.


> The simplicity of JSON is an advantage: it prevents people from mucking it up with too much nonsense like this.

JSON's simplicity was out the window before it became a formal spec. The original designer of JSON omitted comments from the final spec because people were using them to create processing directives.

There are multiple differing implementations of JSON parsers, the most notable difference between them being that some allow comments while others treat them as errors. There's also JSONPath, JSON Schema, and JSON Transform.

JSON is the new XML. It has been and will continue to be abused to fit all the possible corner-cases, at the expense of simplicity and readability. Just like XML.

EDIT: Oh, and instead of CDATA, we now simply base64 encode arbitrary objects and pack them into JSON encoded strings. Is this really an improvement?


>EDIT: Oh, and instead of CDATA, we now simply base64 encode arbitrary objects and pack them into JSON encoded strings. Is this really an improvement?

I think that's more of an indictment of HTTP than XML or JSON. Maybe we should have a better standard for exchanging non-textual data with websites.


CDATA still does not allow you to include even arbitrary sequences of unicode codepoints, not to mention arbitrary binary data.

On the other hand HTTP is in fact a binary protocol and there is perfectly good way to transport sets of binary objects with associated metadata: MIME multipart/whatever. With multipart/byteranges even being part of HTTP itself and multipart/form-data being one of the "core web technologies".


No it's not. It's because people like to just pipe it into same request framework, get a json object and then later do base64_decode(json["binaryobject"]) instead of a separate http request to do binary transport which it can already do.
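For reference, the pattern being described looks roughly like this in Python. This is only a sketch: the "binaryobject" field name is taken from the comment above and the byte values are arbitrary.

```python
import base64
import json

# Arbitrary non-text bytes standing in for an image or other binary blob.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

# Sender: JSON strings cannot carry raw bytes, so encode them as base64 text.
message = json.dumps(
    {"binaryobject": base64.b64encode(payload).decode("ascii")}
)

# Receiver: parse the JSON, then undo the base64 step by hand.
decoded = base64.b64decode(json.loads(message)["binaryobject"])

print(decoded == payload)
```

The round trip works, but the receiver has to know out of band which string fields are really base64, and the encoded form is about a third larger than the raw bytes.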


> <Employee name="Dwight Schrute" title="Assistant to the Regional Mgr" />

Fixed that for you :) https://www.youtube.com/watch?v=wA9kQuWkU7I


Such a missed opportunity to include Jim in the favorite movies examples with movies like Clear and Present Danger, Patriot Games, etc...


Is this valid HTML, anyone?


That still looks extremely confusing.. if I didn't know who those people were, I would have a really hard time even noticing that there is a hierarchy in what you presented.

And even still, that just doesn't look intuitive to me. Probably because there is no distinction between an employee and a list of employees linked to a parent employee?


The structure of XML (which element contains which) was meant to be used to link these elements to the underlying text they're marking up. When there's no text and just metadata (elements), there's no natural structure for them. It can still exist, of course, but what it means is totally up to the designer of this XML format. It's piggybacking the structural relation for something else. In this example it's piggybacked for "boss-employee" relation, but it could totally be flat like that:

    <Employees> 
      <Employee id="1" name="Michael Scott" title="Regional Manager" />
      <Employee id="2" boss-id="1" name="Dwight Schrute" title="Ass. Regional Mgr" />
      <Employee id="3" boss-id="1" name="Jim Halpert" title="Head of Sales" />
      <Employee id="4" boss-id="3" name="Andy Bernard" title="Sales Rep" />
      <Employee id="5" boss-id="3" name="Phyllis Lapin" title="Sales Rep" />
      <Employee id="6" boss-id="1" name="Pam Beesly" title="Office Administrator" />
    </Employees>
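To show that the two layouts carry the same information, here is a minimal Python sketch that rebuilds the nested form from the flat one. The `nest` helper is hypothetical, and it assumes every `boss-id` refers to an `id` that exists in the same document.

```python
import xml.etree.ElementTree as ET

FLAT = """
<Employees>
  <Employee id="1" name="Michael Scott" title="Regional Manager" />
  <Employee id="2" boss-id="1" name="Dwight Schrute" title="Ass. Regional Mgr" />
  <Employee id="3" boss-id="1" name="Jim Halpert" title="Head of Sales" />
  <Employee id="4" boss-id="3" name="Andy Bernard" title="Sales Rep" />
  <Employee id="5" boss-id="3" name="Phyllis Lapin" title="Sales Rep" />
  <Employee id="6" boss-id="1" name="Pam Beesly" title="Office Administrator" />
</Employees>
"""

def nest(flat_xml):
    """Rebuild the nested form from the flat id/boss-id form (hypothetical helper)."""
    flat = ET.fromstring(flat_xml)
    # First pass: one nested-style element per employee, keyed by id.
    nodes = {
        e.get("id"): ET.Element(
            "Employee", {"name": e.get("name"), "title": e.get("title")}
        )
        for e in flat
    }
    # Second pass: attach each element under its boss; no boss-id means a root.
    roots = []
    for e in flat:
        boss = e.get("boss-id")
        if boss is None:
            roots.append(nodes[e.get("id")])
        else:
            nodes[boss].append(nodes[e.get("id")])
    return roots

roots = nest(FLAT)
print(ET.tostring(roots[0], encoding="unicode"))
```

In the flat form the boss-employee relation is ordinary data that the application has to interpret. In the nested form it is carried by element containment.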


> JSON for lists, XML for trees

No. JSON is a recursive data structure that can represent trees as well.

It's easy to make a concise representation of DOM trees as JSON: http://m1el.github.io/jsonht.htm

I find it really sad that people don't know or consider s-expressions to represent trees.

    (Department {:name "Scranton Branch"}
      (Employee {:name "Michael Scott" :title "Regional Manager"}
        (Department {:name "Sales"}
          (Employee {:name "Dwight Schrute" :title "Ass. Regional Mgr"})
          (Employee {:name "Jim Halpert" :title "Head of Sales"}
            (Employee {:name "Andy Bernard" :title "Sales Rep"})
            (Employee {:name "Phyllis Lapin" :title "Sales Rep"})))
        (Employee {:name "Pam Beesly" :title "Office Administrator"})))


This reminds me of an old joke about binary XML:

   <byte> 
   <bit>1</bit> <bit>1</bit> <bit>0</bit> <bit>1</bit>
   <bit>0</bit> <bit>1</bit> <bit>1</bit> <bit>0</bit>
   </byte>

>It's easy to make a concise representation of DOM trees as JSON: http://m1el.github.io/jsonht.htm

It is not concise. Here is an example of a single div tag overhead:

   ["div",{"class":"some classes"}, ... ]

   <div class="some classes"> ... </div>

Also, it does not represent DOM trees as JSON. It represents DOM trees as s-expressions encoded via JSON. The s-expression above can be written directly as:

(div (class= "some classes") ... )

So why does it need to be wrapped in a format that's verbose, underspecified, doesn't have comments, doesn't have namespaces and doesn't have any notion of metadata?


> So why does it need to be wrapped in a format that's verbose, underspecified, doesn't have comments, doesn't have namespaces and doesn't have any notion of metadata?

Because the OP article doesn't even consider s-expressions, so I'm working with the tools I was given.

Personally, I find it frustrating that EDN (or some other form of s-exprs) aren't widely adopted.


Not specified enough, I think you mean:

    <ieee-754:byte xstdversion="2008" nstid="two's complement" _archext="x86-64">
       <ieee-754:bit arity="2">1</ieee-754:bit>
       <ieee-754:bit arity="2">0</ieee-754:bit>
       ...


There's a beautiful symmetry between s-expressions and XML which is why it's not really all that annoying to work with once you approach it as an s-expression.


Working with Clojurescript and Reagent (or Hiccup) has ruined JSX for me. It's so verbose and absurdly hard to work with relatively.


Why is JSX so much more verbose? The end tags?


It's primarily the end tags, yes, but there's also the className shorthand in reagent that's nice.

[div.btn.blue.p-4 {:disabled true} "Button Text"]


This reminds me a lot of Elm's syntax.


What XML is also very good at is: representing tagged text, what is called "mixed content" in XML terminology.

Just try to write the JSON equivalent of:

    <div>The <a href="https://www.json.org/">JSON format</a> was invented by <em>Douglas Crockford</em>.</div>

If your document consists mainly of loosely structured text with some annotation, you'd better use XML.
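To see how a parser exposes mixed content, here is a small Python sketch using the standard library's ElementTree. The `pieces` helper is made up for illustration and only walks one level of the tree.

```python
import xml.etree.ElementTree as ET

div = ET.fromstring(
    '<div>The <a href="https://www.json.org/">JSON format</a>'
    ' was invented by <em>Douglas Crockford</em>.</div>'
)

def pieces(elem):
    """Yield (kind, value) pairs in document order for one element (hypothetical helper)."""
    if elem.text:                 # text before the first child
        yield ("text", elem.text)
    for child in elem:
        yield ("element", child.tag)
        if child.tail:            # text that follows the child's end tag
            yield ("text", child.tail)

print(list(pieces(div)))
```

Text and elements interleave freely inside the `div`, which is exactly what a JSON object with fixed keys has no direct way to express.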


{ "type": "div", "children": [ "The ", { "type": "a", "attributes": { "href": "https://www.json.org/" }, "children": [ "JSON format" ] }, " was invented by ", { "type": "em", "children": [ "Douglas Crockford" ] }, "." ] }

I think I prefer the XML :-)

Edit: I agree that tagged text is a clear win for XML - for pretty much everything else I'd go for JSON.

Edit2: HN ate my formatting, but probably not a bad thing ;-)


["div", "The ", ["a", {"href": "https://www.json.org/"}, "JSON format"], " was invented by ", ["em", "Douglas Crockford"], "."]


Quite readable but it's hell to parse. The first argument is the name of the element, but what is the second one? If it is a string or an array, it's the first child, but if it's an object, it's the attribute set?


Maybe something more like:

    [
        "div",
        [
            "The ",
            [
                "a",
                { "href": "https://www.json.org/" },
                [ "JSON format" ]
            ],
            " was invented by ",
            [
                "em",
                "Douglas Crockford"
            ],
            "."
        ]
    ]

With all children being encapsulated in an array...

edit: though it again gets confusing when the 0 index is sometimes the element type and sometimes the straight text... so never mind I guess the problem persists.


That's roughly how it's done in Reagent[1]/re-frame[2]:

  [:div "The " [:a {:href "https://www.json.org/"} "JSON format"]
   " was invented by " [:em "Douglas Crockford"] "."]

The fact that EDN[3] supports keywords makes it a bit easier to parse. Representing HTML in EDN this way was first done in a library called Hiccup[4], so it’s usually called “Hiccup” even when encountered outside of the original library.

1: https://holmsand.github.io/reagent/

2: https://github.com/Day8/re-frame

3: https://github.com/edn-format/edn

4: https://github.com/weavejester/hiccup


How would you ever differentiate "div" from an element or text? Is "The " text or the <The > tag?


Why not just:

[ "div", "The ", [ "a", { "href": "https://www.json.org/" }, "JSON format" ], " was invented by ", [ "em", "Douglas Crockford" ], "." ]

First thing in array is element, if there is an object at the second location then that's the attributes?


Yeah that seems to make the most sense. I had to actually work it out before I saw it. I was looking for some further separation beyond indices but I just ended up over-thinking it.


> The first argument is the name of the element, but what is the second one?

If the second argument is an object/dictionary, it's attributes. Otherwise, it's the first child. Alternatively, the ambiguity can be resolved by using null/0 instead of empty attributes.
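That rule is short enough to write down. Below is a minimal Python sketch, assuming the array convention used in this thread: first item is the tag name, an optional dict in second position is the attribute set, and everything after that is a child. The `render` function is hypothetical and not taken from any library.

```python
import json
from xml.sax.saxutils import escape, quoteattr

def render(node):
    """Render a ["tag", {attrs}?, child, ...] array (or a bare string) as markup."""
    if isinstance(node, str):
        return escape(node)           # a bare string is always a text node
    tag, rest = node[0], node[1:]     # position 0 of an array is always the tag name
    attrs = {}
    # The disambiguation rule: a dict in second position is the attribute set.
    if rest and isinstance(rest[0], dict):
        attrs, rest = rest[0], rest[1:]
    attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
    inner = "".join(render(child) for child in rest)
    return f"<{tag}{attr_text}>{inner}</{tag}>"

doc = json.loads(
    '["div", "The ", ["a", {"href": "https://www.json.org/"}, "JSON format"],'
    ' " was invented by ", ["em", "Douglas Crockford"], "."]'
)
print(render(doc))
```

Because a string is only read as a tag name when it sits at index 0 of an array, `["div", "div"]` renders as `<div>div</div>` without any ambiguity.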


What if text were quoted?



["div", "The", ["a[href=https://www.json.org/]", "JSON format"], "was invented by", ["em", "Douglas Crockford"], "."]


that looks like the JSON documents in AWS's CloudFormation


Ok, now try <div>div</div>


["div", "div"]


Interesting.


Check out hiccup


Amazing. Who would've thought a markup language would be appropriate for the task


Are you sure you want hierarchy though?

  [
    ["the ", {}],
    ["JSON format", {"link": "https://www.json.org/"}],
    ["was invented by ", {}],
    ["Douglas Crockford", {"bold": true}],
    [".", {}]
  ]


In this particular example you need hierarchy on at least two occasions.

1. The xml denotes that the whole thing should be wrapped in a div, and shouldn't just be considered a paragraph. With only lists you can't denote what kind of level you're at, unless you force each level to always carry the same meaning (which makes the format weaker than XML).

2. Setting "bold" to true isn't the same thing as emphasizing the text. Typically 'em' is italic, but more importantly a nested 'em' should no longer be italic. To achieve this with JSON you're forced to combine formatting and structure in a way that CSS was designed to prevent.


The information stored in this XML is essentially that:

    { "string": "The JSON format was invented by Douglas Crockford",
      "structure": [
        [ 0, 48, "div", {} ],
        [ 4, 14, "a", { "href": "https://www.json.org" } ],
        [ 32, 48, "em", {} ]
      ]
    }

(If we allow empty elements, we'd need another field in the structural entries to be able to reconstruct the hierarchy.) Now I'd say that the XML way to combine these three kinds of data is pretty elegant and natural. The JSON would be impossible to write by hand, although it's good for machine processing.

Upd: The three kinds of data: the string, the structures (div, a, em), and their position in the string.
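As a rough illustration of the "good for machine processing" part, here is a Python sketch that turns such a string plus range list back into markup. It assumes the end offsets are inclusive (so "JSON format" covers 4 to 14), that spans are non-empty and properly nested, and `to_markup` is a made-up name.

```python
from xml.sax.saxutils import escape, quoteattr

def to_markup(text, spans):
    """spans: [start, end_inclusive, tag, attrs]; assumed non-empty and properly nested."""
    opens, closes = {}, {}
    # Sorting by (start, -end) puts outer spans first, so they open first...
    for start, end, tag, attrs in sorted(spans, key=lambda s: (s[0], -s[1])):
        attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
        opens.setdefault(start, []).append(f"<{tag}{attr_text}>")
        # ...and inserting at the front makes inner spans close first.
        closes.setdefault(end + 1, []).insert(0, f"</{tag}>")
    out = []
    for i in range(len(text) + 1):
        out.extend(closes.get(i, []))   # close before opening a sibling
        out.extend(opens.get(i, []))
        if i < len(text):
            out.append(escape(text[i]))
    return "".join(out)

doc = {
    "string": "The JSON format was invented by Douglas Crockford",
    "structure": [
        [0, 48, "div", {}],
        [4, 14, "a", {"href": "https://www.json.org"}],
        [32, 48, "em", {}],
    ],
}
print(to_markup(doc["string"], doc["structure"]))
```

The offsets are the fragile part: edit one character of the string and every later range has to be recomputed, which is why this form is hard to maintain by hand.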


It's funny to me that even in an XML shining example of purpose you've got non-semantic "<div>" there likely because, "just float where I want, dammit!"

Just imagining a JSON that starts with

"dummy": "don't remove this or the API will mis-align this across two structs!!!"


You jest, but this is not unheard of in hand-maintained JSON:

  "comment": "The foo widget needs blah blah ...",


The semantic of these things is what the code does with them. It's the only semantic there is. E.g. the <i> tag has perfect semantic, while <article> not that much.


So Jupyter Notebooks should be XML instead of JSON. Right ?


AFAIK notebooks are a list of cells, each cell consisting of text, but there is no meta information within the cells, so JSON looks OK there.


Honestly, they should be markdown. This would also solve the versioning problem.


And where would you store the output?

R uses markdown + a separate file with the output cached and while the non-rendered file is easier to version the whole thing is a lot more difficult to handle.


Well yep, but that ship has long sailed.


It is ugly:

["div", 0, "The ", ["a", {"href": "https://www.json.org/"}, "JSON format"], " was invented by ", ["em", 0, "Douglas Crockford"], "."]


Interesting to see the history here: XML was created and solved a lot of problems. So on the bandwagon everybody jumps. Then it became bloated and fragmented, so the world switches mostly to JSON.

But on the XML side, the baby was thrown out with the bath water, while JSON started developing its own abuses. TOML and YAML come in and muddy the waters a bit more.

This is only 1 article, but maybe the world is ready to discuss the possibility that the pendulum swung too far?


The moment we added JSON schema (not to mention JSON transforms) and started to create a JSON version of XPath, we started down the same path that hurt XML.

Throw in parsers that all act differently (some parse comments, some don’t), and processing directives, and it becomes clear that yes, the pendulum has swung too far.


> The moment we added JSON schema

There is an amazing JSON schema syntax out there, 100% better than the official one. It is called TypeScript, and I am sad that the TS compiler can't be made to pop out JSON validators (their philosophy being nothing at all at runtime).

There is a project (I forget its name) that will insert itself into your build steps and use TS type definitions to validate JSON.

Seriously the TS syntax is great, why is anything else being used to write a schema?

The only thing I'd add to the TS syntax is the proposals to add field validation using regexs, or some type of field name validation format. There are times when TS can't model my JS objects, e.g. I have a bunch of GUID field names that map to objects of a certain type, and I have some other field names that are 100% not GUIDs, I promise, mapping to some other stuff.


Imagine you're Amazon and you receive data from vendors. It's not enough to authenticate the sender, you still need to check that the payload is at least syntactically valid before starting to process it. You can write custom code, of course, but after writing lots of it you're likely to start naturally splitting it into a declarative part and a universal processor that checks incoming data against the declaration. It's harder to come up with a declarative way from the start, but it's a natural tendency, because declarative form is much simpler than procedural: declarations are data, and data are easier to understand than a process.

This is what schema, transforms, and paths are: they're a declarative way to address these tasks in XML. It's natural there will be attempts to reinvent them for JSON.
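A toy version of that split, sketched in Python: the declaration is plain data, and one generic function checks any payload against it. The field names and the `validate` function are invented for illustration, and real schema languages cover far more (nesting, ranges, patterns).

```python
# The declarative part: plain data describing what a valid payload looks like.
# Field names are hypothetical.
SCHEMA = {"vendor": str, "sku": str, "quantity": int}

def validate(payload, schema):
    """The universal processor: check any payload against any declaration."""
    errors = []
    for field, expected in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    for field in payload:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors

good = {"vendor": "Acme", "sku": "A-1", "quantity": 3}
bad = {"vendor": "Acme", "quantity": "3"}
print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
```

Supporting a new vendor format then means writing another declaration, not another validation routine.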


I’ve been around long enough to use them all. I’m pretty happy with JSON as a debuggable serialization format for machine-to-machine or light machine-to-human use cases. YAML works well for human-to-machine (JSON could fill this use case as well if it would support comments and multi line strings). I don’t use TOML much, and good riddance to XML.


TOML is simply the old INI format except with a single concrete specification instead of many ad hoc ones. YAML is the "kitchen sink included" of serialization formats, so much so that most only use a small subset of its features. In that case I think TOML is usually the better choice unless you really do need some of the power and complexity offered by YAML.


I agree that less power is better, but I find TOML pretty cryptic. Like I said, JSON with comments and multi line strings is my ideal. :)


What do you feel is cryptic about TOML? I always felt it was pretty straightforward, especially if you're coming from INI (you could feed INI files into a TOML parser and most of the time end up with something somewhat sane, especially if you're using octothorpes instead of semicolons as comments).

That aside, I'm personally a big fan of using Tcl-like / shell-like syntax for configuration, since it enables some pretty rich and expressive config directives while not being insanely verbose like most attempts at using XML for this. It's one of the things I really like with OpenBSD's various subprojects like PF¹ and OpenSMTPD² and relayd/httpd³, and I'm semi-actively working on a way to do similar things in Erlang/OTP applications⁴. The only alternative that matches that level of expressiveness (besides just doing all configuration directly in the host language) is s-expressions.

¹: https://man.openbsd.org/pf.conf

²: https://man.openbsd.org/smtpd.conf

³: https://man.openbsd.org/relayd.conf.5 / https://man.openbsd.org/httpd.conf.5

⁴: https://otpcl.github.io


I'm not coming from INI files. :) I actually always felt they did a poor job of representing anything more than key value pairs and they weren't standardized, so I always managed to avoid learning the grammar(s). I'm sure if I put much effort into it, I could figure out TOML, but I don't want to ask my users to figure out yet another configuration language.

If I'm going to make them learn another configuration language, it's going to have something more to offer than just 'simpler than YAML'. Maybe Starlark for more powerful configuration applications (think infra-as-code where YAML/JSON/etc clearly isn't powerful enough) or if someone ever builds it, "JSON with comments and multiline strings" which offers simplicity without compromising familiarity. Actually, it would be really interesting to hear what people's ideal configuration languages would look like.

> That aside, I'm personally a big fan of using Tcl-like / shell-like syntax for configuration, since it enables some pretty rich and expressive config directives while not being insanely verbose like most attempts at using XML for this.

I'm not really familiar with this. I'll have to read up.


TOML is what I'd naturally write as notes if I wanted to document how something is working.

YAML is all sorts of strange.

JSON needs comments and support for trailing commas. Since that isn't going to happen.... TOML!


"Good riddance to XML" is a bit much. Did you read the article? XML is ideal for representing tree structures. Pick the tool for the job.

For example, XAML does an excellent job in declaring your UI. Its use of namespacing, and how you can use child nodes to declare properties of an object, makes it highly flexible albeit slightly verbose.


Using child nodes to declare properties of an object is the main reason I dislike XML. Because some people use children, and some use attributes, and some use both. Granted, that isn't XML's fault, but every time I have to get XML data into a usable format it feels like a struggle.


> For example, XAML does an excellent job in declaring your UI.

JSX does a better job.

I hate XAML, it doesn't add anything except verbosity over just using C#. The original goal of XAML was to allow bindings to multiple programming languages and to allow designers to use a suite of amazing tools to hand finished designs off to developers for implementation.

What ends up happening is developers write XAML, then they write C# to back up what XAML can't do, then they have a complex build stage to shove all this crap together, but instead of XAML just being transpiled to C# it has its own run time thing that hosts compiled XAML files.

JSX by comparison is a trivial transformation to JavaScript. It is easy to read, and it does not try to replace JS control structures.

In fact you could probably trivially transform a JSX syntax to ANY C type of language, I know there is a proposal (implementation?) for a JSX-like to Dart.

The sheer awesomeness of JSX is hard to describe if you haven't used it. It is like someone asked "what is the simplest, easiest to understand templating language we can possibly create?" and out popped JSX.

My favorite part of JSX is how it doesn't try to have control structures in it. Everyone else gets this wrong and adds some sort of stringly typed DSL to their templating language.

JSX doesn't do that. If you want to map over an array and spit out a bunch of elements, you just do so in JavaScript/TypeScript.


Interesting, I'll have to take a second look at JSX. So far I feel like in certain cases XML is better at declaring objects and their properties (along with their Types). XAML can declare any type and configure it as needed, it is a way of declaring components in a logical fashion. I wonder if JSX/alternatives are a true superset of XML-derived formats. FYI, XAML generates a binary format at compile-time. From my many experiences with desktop applications, the additional code for XAML/UI is required regardless of the languages/frameworks, to control things like animations, behaviors, markup extensions.


> XAML can declare any type and configure it as needed, it is a way of declaring components in a logical fashion.

XAML is a lot like Python object constructor syntax with all named parameters except one positional parameter that is a list of child objects (obviously, explicit closing tags rather than a closing square bracket and paren is a difference, but structurally it's almost identical). It's certainly a convenience for declaring certain kinds of object trees, especially if your main programming language doesn't have a similarly succinct object construction syntax. JSX is effectively a way of adding an equivalent object construction syntax to the host (JS) language, rather than requiring it to be in separate code files.
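To make the comparison concrete, here is a small Python sketch of that constructor shape. `Element` and `E` are invented for this example, and `StackPanel` and `Button` are just names borrowed from WPF: these are plain Python objects, not a binding to any UI framework.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    tag: str
    props: dict
    children: list = field(default_factory=list)

def E(tag):
    """Return a constructor: one positional list of children, named parameters as properties."""
    def construct(children=(), **props):
        return Element(tag, props, list(children))
    return construct

# Names borrowed from WPF purely for illustration.
StackPanel, Button = E("StackPanel"), E("Button")

# Roughly the shape of:
#   <StackPanel Orientation="Horizontal">
#     <Button Content="OK" />
#     <Button Content="Cancel" IsEnabled="False" />
#   </StackPanel>
tree = StackPanel(
    [
        Button(Content="OK"),
        Button(Content="Cancel", IsEnabled=False),
    ],
    Orientation="Horizontal",
)
print(tree)
```

Attributes map to keyword arguments and nested elements map to the child list, so the two notations line up almost one to one.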


Years of experience > "the article"

JSON is a tree structure. It is dictionaries and arrays mixed together in a hierarchy.


Your opinion != Everyone's opinion

Don't get me wrong, I prefer JSON in most cases. But XML does have many valid use cases. Perhaps you haven't had an opportunity to see its beauty. You can't blindly say JSON > XML or JSON < XML, or "XML sucks" -- this shows ignorance.


I didn't say any of that. JSON is a tree structure, that isn't an opinion.


My bad, I think I was responding to another comment.


You’re welcome to your opinion. I disagree that XML is ideal for representing tree structures or anything else for that matter. I’ve used XML a lot (including for UIs and other tree-like data structures), but ever since JSON became popular I haven’t found myself missing it.


YAML predated / was concurrent with JSON. Both were in reaction to the bloat of XML.


It's funny that YAML spec is three times as long as XML 1.0.


The XML bloat is on the processing side.


Or rather, was. At the time one could either:

- write a sax event based parser that scaled but was low level

- use a DOM parser with a fairly convoluted API. The combination of attributes and children makes for wordy accessors.

Both YAML and JSON provide formats that more readily deserialize into native map / list / string / number types which at the time was quite convenient.
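For a feel of the difference, here is the same small lookup done three ways with Python's standard library. The document contents are made up, and the JSON layout is just one plausible encoding of the same data.

```python
import json
import xml.sax
from xml.dom import minidom

XML = ('<User first="Michael"><Movie>Diehard</Movie>'
       '<Movie>Threat Level Midnight</Movie></User>')

# 1. SAX: event callbacks; scales well, but you track state yourself.
class MovieHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.movies, self._in_movie = [], False
    def startElement(self, name, attrs):
        if name == "Movie":
            self._in_movie = True
            self.movies.append("")
    def characters(self, content):
        if self._in_movie:
            self.movies[-1] += content   # text may arrive in several chunks
    def endElement(self, name):
        if name == "Movie":
            self._in_movie = False

handler = MovieHandler()
xml.sax.parseString(XML.encode("utf-8"), handler)
sax_movies = handler.movies

# 2. DOM: whole tree in memory, reached through fairly wordy accessors.
dom = minidom.parseString(XML)
dom_movies = [n.firstChild.data for n in dom.getElementsByTagName("Movie")]

# 3. JSON: deserializes straight into native dicts, lists and strings.
data = json.loads('{"first": "Michael", "movies": ["Diehard", "Threat Level Midnight"]}')
json_movies = data["movies"]

print(sax_movies, dom_movies, json_movies)
```

All three produce the same list. The difference is how much of the code is about the data and how much is about the parser.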


XML is so awesome for UI layouts. I had a thesis that I worked on that was basically a subset of HTML (check the SMIL specification for something similar, albeit a bit more complicated, SMIL used to be a W3C recommendation).

It was so easy to understand that every person who was able to use a computer would pick it up quickly. This was not the case for HTML, my guess is because there are a ton of HTML tags and that can be overwhelming for a certain type of audience, whereas here there are at most 10 tags to learn.

The framework I'm talking about still exists, it's called XIMPEL [1] (sadly SMIL kind of died). It may seem like a weird thing, but IMO it's the only open-source framework that has an intuitive way to make non-linear storytelling on the web easy. People can kind of hack a choose-your-own-adventure media story with YouTube, but XIMPEL really is so much better at it because it's actually made for that purpose. And the biggest reason why XIMPEL is simple is that it uses XML as its template language.

People just see the tags as lego blocks and build non-linear media essays, stories or even small (media-focused) games with it.

[1] http://www.ximpel.net/


As someone who has only used html (as in, no other xml-based UI), I'm really curious why it's necessary to even define the tags at all.

Why not allow authors to define their own names? Like

  <SomeWrapper>
    <SomeChild />
  </SomeWrapper>

Then internal functionality can be attached to each element via attributes, like how aria roles work.



Yep, to some degree. Though what I would love in an xml-based UI description language is the ability to opt into default system behaviors, while maintaining the ability to use whatever semantic naming convention that fits my application.

I wrote this code in another comment, but imagine if you could hook functionality to an element via defined attributes. With custom elements, you have to add the functionality via a script, and they aren't system functions (from what I can remember, I may be wrong).

  <LabelList list>
    <LabelItem label listItem>
    </LabelItem>
  </LabelList>

In the above code:

- the LabelList element hooks into the system's "list" behavior

- the LabelItem element hooks into the system's "label" behavior

- the LabelItem element hooks into the system's "listItem" behavior


Maybe the is="" attribute gets you partway there?

    <ul is="label-list">
        <li is="list-item">foo</li>
    </ul>

EDIT: Google's documentation: https://developers.google.com/web/fundamentals/web-component...


The tags are defined so that there is reasonable default behavior. For example <label> is used to label a form field. Clicking the label focuses the form field.


Would be interesting though to opt in to those behaviors. So something like

  <MySemanticLabel label>
  </MySemanticLabel>

Then you could mix and match functionalities. So if you have a list of labels, you could do the following:

  <LabelList>
    <LabelItem label listItem>
    </LabelItem>
  </LabelList>


You do opt into them, by using label. If you want an element with no behavior, that's what div and span are for.


Right, but this offers two benefits: (1) semantic naming, and (2) composed functionality. So for example, if you wanted an element that exposes label behavior and list-item behavior, you could do that. With html, you have to use 2 elements:

  <li>
    <label></label>
  </li>


Yes, this is a major flaw of HTML, it is very specific about where certain elements can and cannot be, so you can't really even design a solution like the above.


Congrats on re-inventing frontend web development, I heard Facebook's hiring. This is one of the reasons people use frontend frameworks (besides state management).


Why the hostility? Should we not be open to exploring alternate ideas?


I should add that I'm not literally saying that we should do this. I assume smarter people than me have thought this through, and that there are good reasons why my proposal would be a bad idea. I'd be interested to know what those reasons are.


I am not intentionally being hostile, just pointing out that your stated deficiency is one of the reasons why people migrated to component-style frameworks. It's an affirmation of your comment's validity, not a criticism.


> It was so easy to understand that every person who was able to use a computer would pick it up quickly.

I find this hard to believe. I know many people who use computers but are not IT professionals or even hobbyists. Without an easy to use GUI they can't accomplish much. I couldn't imagine them picking up anything to do with XML.

If you scope it down to UI developers you're talking about less than 1% of people. Far from everyone. Even if you scope it down to IT people it's less than 1% of people.


There’s the always forgotten type of computer users: those who use computers for productivity. People who need to achieve some business goal and they have the energy to learn.

Take for instance insane Excel spreadsheets glued with VBA powering entire businesses, built by accountants and other professionals.


It’s not that they can’t. They’ve just been told they can’t.

I’ve been told that clerks at Bell Labs were using troff because they were told they weren’t programming, and I think GUIs have the opposite effect.


I don't disagree that JSON is better than XML for their "list" example, and that XML is better than JSON for their "UI layout" example. BUT:

> This means there’s no officially supported way to represent the list of movies in an element attribute. We can hack this by encoding the list into an attribute using a comma delimiter:

      <Users>
         <User
          first="Michael"
          last="Scott"
          favoriteMovies="Diehard, Threat Level Midnight" />
        
So, that's because you're doing it wrong. XML has NO PROBLEM with hierarchy, REALLY. Of COURSE there's a supported way to encode hierarchy.

      <Users>
        <User>
          <first>Michael</first>
          <last>Scott</last>
          <favoriteMovies>
            <title>Diehard</title>
            <title>Threat Level Midnight</title>
          </favoriteMovies>
        </User>


You should have read the next paragraph. They do use that as an example

    <Users>
      <User>
        <FirstName>Michael</FirstName>
        <LastName>Scott</LastName>
        <FavoriteMovies>
          <Movie>Diehard</Movie>
          <Movie>Threat Level Midnight</Movie>
        </FavoriteMovies>
      </User>


Fair! Why the heck did the author first try to tell us XML had no "officially supported way" to do this then?


Because it was about encoding a list of things in an _attribute_. Attribute! Using nested elements is not exactly a solution for that requirement.
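The distinction shows up as soon as you read the data back. Here is a Python sketch using the element and attribute names from the examples in this thread; the comma-splitting step is the application's own convention, not something the XML parser provides.

```python
import xml.etree.ElementTree as ET

attr_form = ET.fromstring(
    '<User first="Michael" last="Scott" '
    'favoriteMovies="Diehard, Threat Level Midnight" />'
)
# To the parser the attribute is one opaque string; splitting on commas is
# an application-level convention (and breaks on titles containing a comma).
from_attr = [m.strip() for m in attr_form.get("favoriteMovies").split(",")]

elem_form = ET.fromstring(
    "<User><FirstName>Michael</FirstName><LastName>Scott</LastName>"
    "<FavoriteMovies><Movie>Diehard</Movie>"
    "<Movie>Threat Level Midnight</Movie></FavoriteMovies></User>"
)
# With nested elements the parser itself hands back the list structure.
from_elems = [m.text for m in elem_form.iter("Movie")]

print(from_attr, from_elems)
```

Both give the same two titles here, but only the element form keeps working if a title happens to contain the delimiter.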


And if you want, you can still keep some of the attributes:

      <Users>
        <User first="Michael" last="Scott">
          <favoriteMovies>
            <title>Diehard</title>
            <title>Threat Level Midnight</title>
          </favoriteMovies>
        </User>


If XML is better at representing trees, and JSON is a tree, does that mean that XML is better at representing JSON than JSON?


Yes.

Consider a comparison of examples using JSON schema:

  {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "description": "Modified JSON Schema draft v4 that includes the optional '$ref' and 'format'",
    "definitions": {
        "schemaArray": {
            "type": "array",
            "minItems": 1,
            "items": { "$ref": "#" }
        }
    }
  ...


  <schema schema="http://json-schema.org/draft-04/schema#">
    <description>Modified JSON Schema draft v4 that includes the optional '$ref' and 'format'</description>
    <definitions>
        <schemaArray>
            <type>array</type>
            <minItems>1</minItems>
            <items ref="#" />
        </schemaArray>
  ...

$schema and $ref are differentiated by convention in JSON, and require repeated consideration in every parsing scenario to treat them as exceptions relative to actual values within the JSON document. In the XML representation, the distinction is implicit and handled automatically for you by every parser.
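
To make the attribute-versus-convention point concrete, here is a minimal Python sketch using only the standard library. The two documents are trimmed copies of the examples above; the variable names are mine. It only illustrates where the metadata/child split happens, and is not meant as a JSON Schema implementation.

```python
import json
import xml.etree.ElementTree as ET

xml_text = """
<schema schema="http://json-schema.org/draft-04/schema#">
  <definitions>
    <schemaArray>
      <type>array</type>
      <minItems>1</minItems>
      <items ref="#" />
    </schemaArray>
  </definitions>
</schema>
"""

json_text = """
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "definitions": {
    "schemaArray": {
      "type": "array",
      "minItems": 1,
      "items": { "$ref": "#" }
    }
  }
}
"""

root = ET.fromstring(xml_text)
# The XML parser hands back node metadata (attributes) and children separately.
xml_meta = dict(root.attrib)
xml_children = [child.tag for child in root]

doc = json.loads(json_text)
# In JSON both live in the same object; the "$" prefix is only a convention
# that the application has to apply itself.
json_meta = {key: value for key, value in doc.items() if key.startswith("$")}
json_children = [key for key in doc if not key.startswith("$")]

print(xml_meta, xml_children)
print(json_meta, json_children)
```

Both halves end up with the same split, but only the JSON half needed application code that knows about the "$" convention.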


>$schema and $ref are differentiated by convention in JSON, and require repeated consideration in every parsing scenario to treat them as exceptions relative to actual values within the JSON document.

This is a good thing. The main issue with XML is how ridiculously overcomplicated and overengineered its design was. This isn't just a matter of usability, it has security implications because it increases the attack surface of anything that uses it (e.g. the XML billion laughs attack).


> This is a good thing. The main issue with XML is how ridiculously overcomplicated and overengineered its design was.

Citing element attributes as an example of such complexity is hardly reasonable. The JSON example above contains the same complexity, just moved to the application/implementation space, instead of the parser (so your app is more complex to handle it).

> e.g. the XML billion laughs attack

Also not a great example. This is a simple DOS attack, of which there are many other examples: zip bombs, yaml bombs, etc. None of these invalidate yaml nor zip themselves.

There are also far worse attacks on XML than the billion laughs attack (XXE attacks are an entire category in the OWASP top 10).

What these attacks have in common is that they use references. They aren't unique to XML because any format wanting to handle references (like JSON schema!) will have to account for them.

The difference is, since references aren't built into the JSON spec, you have to do it yourself (and protect against attack like this in your own code). Since XML handles this at spec. level, common XML parsers can account for & mitigate for this for you (which fyi, modern ones do).

---

Side note (and a vote in favour of JSON):

By your own metric, JSON is actually "more complex" (in a good way) than XML in one area: value types. XML values are strings. Having value types is one massive advantage of JSON imo.

In that sense, it would be nice to see a language that combines both of these "complexities", to form something better.


I would disagree about having more value types. JSON has more syntax for value types, but it only has four: string, integer, float, and boolean. XML has less syntax for value types (everything is serialized as a string), but it has a way to define types of attributes with XML Schema and there you get much more primitive types already (decimal, date & time, binary). XML approach is more uniform: it uses schema to look up all the parsing rules. JSON uses syntax for some types and (ad-hoc) schema-like logic for others.


>Citing element attributes as an example of such complexity is hardly reasonable. The JSON example above contains the same complexity, just moved to the application/implementation space

Where it belongs...

>so your app is more complex to handle it

There is a trade off between more simple markup and more complex code and vice versa. I would argue it is almost always better that way around because it helps enforce clear separation of concerns. I'm similarly allergic to putting complex logic in template code because that also violates a clear separation of concerns.

>Also not a great example. This is a simple DOS attack, of which there are many other examples: zip bombs, yaml bombs, etc.

This is precisely why it's a great example. YAML has the same problem with overcomplexity XML does. JSON has no such issues.

>There are also far worse attacks on XML than the billion laughs attack (XXE attacks are an entire category in the OWASP top 10).

Right, billion laughs is part of a class of security vulnerabilities that are enabled by XML's bloated design.

>The difference is, since references aren't built into the JSON spec, you have to do it yourself

Except you don't. I'm not sure if I've ever implemented references in any JSON schema I've ever used. It's an entirely pointless feature as far as I'm concerned. I've seen them used in YAML (where it's equally yucky) and every time it's been used it's been as a band aid over deficient schema design that also inadvertently made the markup harder to understand.

>By your own metric, JSON is actually "more complex" (in a good way) than XML in one area: value types. XML values are strings. Having value types is one massive advantage of JSON imo.

I hardly think having 4/5 scalar types counts as spec overcomplication. If you want a measure of how complex each markup language is simply look at the length of the respective specifications. XML is ridiculous.


This is reminiscent of the (whimsical) RFC 3252: Binary Lexical Octet Ad-hoc Transport

https://www.ietf.org/rfc/rfc3252.txt


> JSONx is an IBM® standard format to represent JSON as XML.

https://www.ibm.com/support/knowledgecenter/SS9H2Y_7.5.0/com...


Representing is easy task. You need to parse that representation. And XML here is worse than JSON for typical programming graphs. You can't express number or array with XML. XML Schema helps to add more structure, but JSON does not need that, array and numbers are built-in. And I would say, that XML Schema is too powerful and that's usually not needed by developers. JSON is just a fine medium, simple enough and powerful enough.

But when you deal with inherently hierarchical data, trees, XML is good.


> You can't express number or array with XML

Numbers can be expressed, they just have to be parsed on the client. For array, child elements are ordered and form an array naturally.


> Numbers can be expressed, they just have to be parsed on the client.

But client has to know that they're numbers. So you can't just parse XML into JavaScript object without any knowledge about its structure.

> For array, child elements are ordered and form an array naturally.

But you can't express array of 0-length this way without explicit knowledge about structure. And you can't distinguish array of 1-length from an ordinary value without explicit knowledge about structure.
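
A small Python sketch of that ambiguity, assuming a deliberately naive, schema-less converter (`naive_xml_to_python` is a made-up helper, not a library function) and made-up `<user>` documents:

```python
import json
import xml.etree.ElementTree as ET

def naive_xml_to_python(elem):
    """Schema-less conversion: a leaf becomes its text, anything else a dict."""
    children = list(elem)
    if not children:
        return elem.text  # always a string (or None), never a number
    result = {}
    for child in children:
        value = naive_xml_to_python(child)
        if child.tag in result:
            # A repeated tag is the only hint that this is a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result

two = naive_xml_to_python(ET.fromstring(
    "<user><age>42</age><movie>A</movie><movie>B</movie></user>"))
one = naive_xml_to_python(ET.fromstring(
    "<user><age>42</age><movie>A</movie></user>"))
none = naive_xml_to_python(ET.fromstring("<user><age>42</age></user>"))

print(two)   # "movie" is a list, "age" is the string "42"
print(one)   # "movie" is now a plain string
print(none)  # "movie" is missing entirely; an empty list is not expressible

# The JSON equivalents keep the number and the list shape without a schema.
print(json.loads('{"age": 42, "movie": ["A"]}'))
print(json.loads('{"age": 42, "movie": []}'))
```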


> without explicit knowledge about structure

That's why XML Schema exists. So, everything you listed is possible in XML if you don't put aside some specifications on purpose.

And JSON is so self-describing that someone came with... JSON Schema.


Yes. And that's my point: you don't need schemas to work with JSON. JSON Schema exists, of course, it would be strange if someone did not invent it, as it's an obvious idea. But I've yet to see anyone using it.


> But client has to know that they're numbers. So you can't just parse XML into JavaScript object without any knowledge about its structure.

<Number value="123" type="i32" />


This is tag with name Number and with two attributes. Are you trying to invent XML sublanguage? It's possible. But with JSON it's already invented and there are multiple libraries for every programming language. What should I use to parse your format?


> But with JSON it's already invented

It's invented and optimized for certain language types, specifically those that are loose with their numeric types. This includes most scripting languages.

In languages that require specific intrinsic types to be defined for a number, there are usually a lot to choose from and using the wrong one can be a real problem when converting from XML or JSON to a native format.

On a very simple level, what container do we use to encode the following data structures:

  [100,4,10,156]

  <array>
    <num>100</num>
    <num>4</num>
    <num>10</num>
    <num>156</num>
  </array>

In most dynamic languages, the number type used is just the included numeric container, which generally includes some sort of complex decision between floating point and bignums. For something like C, Java or Rust, generally we would want to choose an appropriate type. In this case, it looks like an unsigned 8-bit integer will suffice for most, and a 16 bit signed value for Java (which doesn't support unsigned values).

But what if the next number is much larger? Should we really need to parse all the values to determine the correct data type to use? That seems very inefficient, and we can't even be sure that we'll encounter values that accurately illustrate the range of values in one parsing. What if the next message or file we parse has large values?

For these languages the inherent data type definition of JSON is a poor match, since its looseness does not transfer easily to a language which does not inherently support it. If your target language supports dynamically resizing untyped arrays, untyped key-value maps, and generic number types that support both very big and floating number types automatically, then JSON is an almost perfect representation format for you. If your target language works best when those items are broken into smaller more explicit components, there's a lot of extra work in parsing JSON, and I can see how that makes XML not look much worse in comparison (especially since your parsed data structure will likely be leaner because there isn't the overhead inherent in those convenient magical types, references are rarely as efficient as pointers).


Odd that you would pick numbers. Those are not exactly bug free in JSON. And dates are as likely to actually be a problem.


> And dates are as likely to actually be a problem.

I'm currently working against a REST API whose JSON format has three different date formats as sub-elements of the same root object...


As opposed to json: `{"number": 12345678901234567890}`

What does that look like when you parse it?


Mapping from string to number. Probably BigInteger for Java, whatever other standard integer for other languages.
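
For what it's worth, a quick Python sketch of both outcomes for the `{"number": 12345678901234567890}` question above. Python's `json` module keeps the integer exact; passing `parse_int=float` is my way of imitating a parser that only has IEEE 754 doubles (as JavaScript's Number does), not a claim about any specific other library.

```python
import json

text = '{"number": 12345678901234567890}'

# Python's json module maps JSON integers to arbitrary-precision int.
exact = json.loads(text)["number"]

# Forcing every integer through a 64-bit float imitates parsers that only
# have doubles available; the low digits are rounded away.
as_double = json.loads(text, parse_int=float)["number"]

print(exact)
print(int(as_double))
print(int(as_double) == exact)
```

So the answer depends on the parser and the target type, which is the point being argued in both directions here.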


I stopped reading at the example describing how to provide an org chart relation as XML. The argument here failed for me, and made the notion of what is "clean" code or "clean" structure even more subjective than ever.

The JSON version of the employee org chart is much more semantic and human understandable (to me) as it "cleanly" describes the relationship for reports as a property called "reports" (for those who manage people - they have that property).

But the org chart as XML was anything but obvious to my human eyes, as it was just nested employee elements and at first glance, looked like a single list of employees. Visually interpreting it to know who's a manager, and who reports to whom seems far more difficult in the XML.

But, again, it's in the eyes of the beholder, and one person's clean solution is another person's confusing tangle of brackets.


My problem with the nested XML org chart is that relationships are described in the documentation instead of the code. The JSON version has more text but encodes more information. That's one less thing I need to write code for.


They both looked fine to me but human readability shouldn't be the primary goal of a data format. I think the example is bad for both formats because it conflates tree data structures and logical hierarchy. The memory model of the application consuming the data would likely look like that, but that doesn't mean that's how it should be stored/transmitted, that should probably be flattened, with each employee having a "manager" property.
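
A rough Python sketch of the flattened form, with the tree rebuilt on the consuming side. The records, ids, and names are hypothetical, as are the helpers `build_reports` and `chain_of_command`.

```python
from collections import defaultdict

# Hypothetical flat records: each employee points at a manager by id.
employees = [
    {"id": 1, "name": "Michael", "manager": None},
    {"id": 2, "name": "Employee B", "manager": 1},
    {"id": 3, "name": "Employee C", "manager": 1},
    {"id": 4, "name": "Employee D", "manager": 3},
]

def build_reports(records):
    """Group employee ids by manager id; the None key holds the root(s)."""
    reports = defaultdict(list)
    for rec in records:
        reports[rec["manager"]].append(rec["id"])
    return dict(reports)

def chain_of_command(records, employee_id):
    """Walk the manager links upward and return the managers' names."""
    by_id = {rec["id"]: rec for rec in records}
    chain = []
    current = by_id[employee_id]["manager"]
    while current is not None:
        chain.append(by_id[current]["name"])
        current = by_id[current]["manager"]
    return chain

print(build_reports(employees))
print(chain_of_command(employees, 4))
```

The stored form is the same in JSON, XML, or a database table; the hierarchy only exists once the consumer builds it.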


Yeah, that is a good point. It does seem to come up a lot in regards to data formats, that of how human readable they are. So I don't think it's without value, but shouldn't be what they are optimized for since it's usually software doing the reading most of the time, and not human eyes.


Who would've thunk that different tools excel at different tasks?

We need more articles like this, to remind us that we should consider what's best for the task at hand and not what's trendy.


Yup, use object notation for passing around objects, use a markup language for markup. It's almost like reading the names gives a hint as to what it was designed for and probably best at.

My sniff test is that if you're editing it by hand, JSON is a poor format because you'll want the benefit of a schema or at least a user-friendly format. If you're not, the syntax doesn't really matter so we should evaluate it on technical merits (verbosity, computational complexity for serialization/deserialization, memory footprint, etc).

I use:

- config file formats for config files (usually TOML or INI format)

- data formats for data (JSON, protocol buffers, etc)

- markup formats (Markdown, XML/HTML) or code for markup

I really don't get why people try to force all use cases onto the same format. Use whatever is well suited for the task, preferring familiarity over unnecessary technical benefits.


XML also has a LONG history of being used for UI in stuff like QT, which is great.

The one problem I see is that as developers, we shouldn't give a rat's ass about the output format's readability when compared to the API ergonomics. If the API for describing the UI is good enough (Can I do stuff like "App.Transition(App.findOne(Sidebar), 500, {width:"1%"}, {width:"30%"}, Ease.EASE_IN_OUT)" or is it a 300loc tutorial?) then the storage format at the end of the day is not THAT important, at least for me, a guy who does a lot of GUI work every day.


it isn't that important for me either, and imo shouldn’t be...BUT

when you are in a formal setting, people like to code review, and in that case, in the absence of some special diff tool, people will complain they can’t review your ui changes in github and suddenly your format matters a lot more than before


Good thing we've had XML defining layouts for many years, despite many of the technologies that used the pattern being dead. XML as a layout definition vs XML as data were topics devs discussed heavily 15 years ago, too. So many Java frameworks used XML to describe user interfaces (like JavaServer Faces, early Tapestry, etc.). One I worked with a lot was Adobe Flex, which used MXML to define user interfaces [1], and don't forget about Microsoft's XAML.

It's actually amazing to me how many of Flex's great ideas live on in other frameworks not dependent on the Flash Player, and looking at some modern day React often reminds me of it.

[1] http://flex.apache.org/doc-getstarted.html


Good article, but for some reason he doesn't mention one of the best XML UI systems: WPF.



Having worked with SOAP/XML extensively at my past job I can say for web services I much prefer REST/JSON. JSON is much easier to work with and key value pairs make it much easier to get what you need. Parsing an XML tree can become a nightmare very quickly.

> However, we’ve created another problem in the form of inconsistency: some user properties are represented as element attributes, others as child elements.

Exactly. I’ve seen XML with element attributes and child elements all over the place. No rhyme or reason for any of it. It’s especially bad in older systems.

I pushed hard at my last job for the SOA team to build REST/JSON web services. Oracle sold them a product to bridge the gap and they got End Of Life announcements 6 months later! Glad I’m gone.


1. Why is parsing a tree which is encoded in JSON easier?

2. Why does it matter if an element is in an attribute or in child elements? You need some kind of schema anyway, right?


It depends on the API of the system. I’ve worked on middleware that would not be able to get an attribute, it lacked the capability. Other systems used JavaScript and it was much easier with JSON

See this link on stack exchange[1].

[1] https://stackoverflow.com/questions/17604071/parse-xml-using...

> //Gets Street name
> xmlDoc.getElementsByTagName("street")[0].childNodes[0].nodeValue;

I’d much prefer something like jsonObj.address[0].street;. Personally I like to work with objects over parsing documents trees.


The author makes a case for why it would matter:

> XML, on the other hand, optimizes for document tree structures, by cleanly separating node data (attributes) from child data (elements).

Unfortunately, there appear to be some implementation issues here. You have to create a string version of your data to store them in attributes.

So, you lose a bit of context in the conversion.

Such as:

`<Document text="true" />`

Is the text attribute the word _true_ or a boolean _true_. Without referencing some other piece of code or definition there is no way to know.

Whereas in JSON, this wouldn't be necessary, as you can simply remove the quotes and infer that it is not a string, but a boolean.


When they did work, WSDLs were pretty slick.

Technically, there are REST/JSON equivalents, but I don't see them actually used much in the wild - so much of the time with REST APIs you are stuck with whatever half-baked, out-of-date documentation that somebody may or may not have written, or just randomly poking at it through trial and error to discover how it works.


Agreed WSDLs are the only thing I love about SOAP and they are missing from REST. Most tools can import a WSDL and generate the code and structure for you.

Sometimes even getting a sample of the expected JSON format can be hard. If you’re not versioning your APIs you can break everything if the schema changes.

I used to debate this with the SOA manager. Swagger looks ok but never dealt with it in production or saw it in the wild with our customers.


I still work with SOAP occasionally. I'd never recommend anyone start a new project with it, but I've never had to manually parse anything. Did the framework you had to work with not provide decent XSD/Class code generation and serialization/deserialization tools?


Yes and no. I worked with several middleware systems, a few disparate systems, and a couple very old systems. Right now I’m working on a few web service integrations with ServiceNow for Microsoft Teams, ticketing, and CMDB stuff, it has decent tools.

When you are hacking systems together for large/old enterprises the water gets muddy fast.


One thing I really miss when working on JSON vs XML data are comments. A workaround is to make the comment a valid string in the data, but still not as good as having the ability to comment an arbitrary line.


XML has comments. They are the same as HTML comments. <!-- comment goes here -->


That was indeed my point - whereas one can't in JSON.


I must have read your post three times and still thought you said "and XML". I'm sorry.


What do JSON comments look like?


The main reason that I use XML (occasionally) is because of XML Schema. It's a very precise data description that can be semantically verified.

Otherwise, I try to use JSON, where possible, because I am not a masochist.


JSON Schema is pretty useful and is well supported by tools like VS Code


I actually found JSON Schema to be FAR FAR FAR easier to use than XML Schema.

https://json-schema.org/


You are correct.

I LOATHE XML Schema. I have used it for years, and have never memorized it. I still need to look it up like a n00b, every time I use it.

But the simple fact of the matter is, is that so many people and projects have inculcated it into hundreds (if not thousands) of toolsets and specs (Like -Ick- WSDL), that it really is the only viable game.

People tend to like JSON precisely because it eschews the kind of overhead that is the definition of XML Schema.


Huh. I just realized that JSON Schema is STILL not a published spec. It is currently in draft, and has been, for a long time.


Does https://json-schema.org/ not work for you?


It would, if anyone used it.

The thing about XML Schema, is that it is completely "baked into" pretty much every tool and library out there, so you can do "on-the-fly" runtime validation.

JSON is really about minimizing text. It works really well for this.

Another issue, is that JSON doesn't have comments, so it can be difficult to annotate a JSON file.


I don't know JSON schema really well; Can you say you want a `price` value to be a non-negative decimal value with at most two digits after the comma, and you want the `currency` value be exactly one of the following : "EUR", "USD", "GBP", the default if not specified being "EUR"?


Yes:

    {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "price": {
                "type": "number",
                "multipleOf": 0.01,
                "minimum": 0
            },
            "currency": {
                "type": "string",
                "enum": ["EUR", "USD", "GBP"],
                "default": "EUR"
            }
        }
    }
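
To show what those keywords amount to, here is a hand-rolled Python check of the same two constraints. It does not use a JSON Schema library; `check_offer` is a made-up helper. Two assumptions are worth flagging: numbers are parsed as `Decimal`, because a `multipleOf` 0.01 test can be fragile with binary floats, and the code fills in "EUR" itself, because `default` in JSON Schema is an annotation that validators are not required to apply.

```python
import json
from decimal import Decimal

CURRENCIES = ("EUR", "USD", "GBP")

def check_offer(text):
    """Return (errors, doc) for the price/currency constraints sketched above."""
    # Parse every JSON number as Decimal so 19.99 stays exactly 19.99.
    doc = json.loads(text, parse_float=Decimal, parse_int=Decimal)
    errors = []

    if "price" in doc:
        price = doc["price"]
        if not isinstance(price, Decimal):
            errors.append("price must be a number")
        elif price < 0:
            errors.append("price must be >= 0")
        elif price % Decimal("0.01") != 0:
            errors.append("price must be a multiple of 0.01")

    # Applying the default is the application's job in this sketch.
    currency = doc.setdefault("currency", "EUR")
    if currency not in CURRENCIES:
        errors.append("currency must be one of EUR, USD, GBP")
    return errors, doc

print(check_offer('{"price": 19.99}'))
print(check_offer('{"price": 19.999, "currency": "USD"}'))
print(check_offer('{"price": -1, "currency": "JPY"}'))
```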


I've seen a lot of XML vs. JSON discussion that misses the point why developers like me love JSON so much, at least until things get really big, or you want to add another 9 to your reliability [1].

JSON is so easy to get started with. Some of this is precisely because of the lack of schema.

In Java with the GSON library, encoding/decoding is one function call and for the things I've been doing this just works at least 90% of the time. I've managed to add a JSON option to several endpoints in an application in something like 1-2 hours, including writing tests.

Back when I did Java as a full-time job and we were working with XML a lot, there were whole processes and workflows and validations and annotations to get the thing working properly. I think you also had to add a few extra stages to the maven compile process back then to get the schemas and all properly set up and loaded. It's not so much a fault of XML as a format, as of the whole "enterprise" tooling that we were asked to use around it. I really hope that this has got easier now.

It's almost as if the XML library people were thinking in the waterfall model where you specify everything up front, and the GSON people were thinking in the agile model.

[1] https://rachelbythebay.com/w/2019/07/21/reliability/


Using GSON is great if you're encoding/decoding straightforward data structures. However if you're using it for serialization for more complicated objects (especially with polymorphism!) it gets zany really quickly.


Author here. The meta-point of my post is that different formats have their advantages and disadvantages, and it's good to understand those when deciding on what's appropriate for a given use case. So I'm glad to see a discussion weighing the pros and cons in the thread!

I see some comments saying that JSON can represent trees just as well as XML, which is technically true. However, XML natively supports a distinction between node metadata (via properties) and relationships (via child elements) that has to be implied in JSON. For pure-data, this may not matter. But in cases like UI layouts, it's helpful to have the format distinguish between node properties and child components. As others have mentioned, the document use case is also a natural fit for XML by clearly separating the content from markup.


I'd expand on his premise by saying that XML beats JSON for structured documents, which also happens to be a practical way to represent a UI layout.


Seems XML is having a revival on HN so I'll just repost my XML editor:

I would love to know what people think of my XML node-graph/tree editor I made before JSON became mainstream (my excuse): http://rupy.se/logic.jar

- You link/unlink nodes (I called them entities! Xo) by right-click-dragging between them.

- You copy stuff by right-click-dragging to an empty space.

- You delete by grabbing something by left-click-holding and pressing the delete key.

- Oh, and nodes are completely tree structure expandable, just drag-drop attributes on nodes and nodes inside nodes.

(I know, not super intuitive; but very handy once you know about these.)

The editor uses lightweight rendering so you can have a ton of elements with good performance.


No one should really be writing UI code by modifying a file by hand because UIs are often complex and deep, so the file structure used to store the UI data doesn't need to be human-readable.

This is true for any data - once you hit a moderate level of complexity neither JSON or XML really work - you need to build a tool to manipulate the underlying data rather than changing it directly. If you do that then it doesn't matter how readable or 'ugly' the data is because a human being shouldn't need to access it directly. This leads to other benefits you can optimize for like size or parsing speed or redundancy instead.


> No one should really be writing UI code by modifying a file by hand because UIs are often complex and deep, so the file structure used to store the UI data doesn't need to be human-readable.

I strongly disagree. SwiftUI is fucking amazing.

https://developer.apple.com/tutorials/swiftui/

Even with your other comment about "until you hit a moderate level of complexity." People are already creating some complex things with it. In fact, they can be more complex than what you can achieve with traditional frameworks when you consider the time it takes; what takes me a day in UIKit etc. takes me a minute in SwiftUI, so I can quickly move on to the next bit.

With declarative UI code I could literally have a usable app in the same time it takes me to struggle with getting separate visual designers + logic editors to agree over a single moderately complex component.

Though, it is several layers of abstraction over what really happens behind the scenes, so I guess a part of your point stands.


The other replies cover most of my reaction, but there’s one more thing I want to add:

If you have a binary format for a tree structure, you almost always end up accruing garbage as you edit it. That always happens in rich HTML editors, for example.

For something like an image editor, the bitmap is OK because it’s not a tree structure and doesn’t balloon out of control as you edit.

Where your image is tree-structured, eg Photoshop layers, the editing tool will generally show you the exact layer tree in a sidebar, so you can make sure it stays clean and tidy.

It’s easiest to keep it tidy when the canonical source is human-readable text. (Although it’s worth watching out for formatting differences. Something like gofmt helps there.)


Following what you say we should only have binary format.

I think you underestimate the convenience of a text format.

It does not only provide a simple way to generate it, but also provide a way to modify it by hand.


And have readable and understandable diffs!

One of the biggest issues with the most popular 'tool-only language' (Excel) is that there is no reasonable way to audit changes. Same goes for UIs that are backed by inscrutable file formats.


Readable and understandable diffs is not a property of text files. It is very easy to design tools and formats that use text files, yet do not give you diffs that are readable or understandable.

Readability and understandability depend on more strict requirements that a text file format may or may not have. Many do, but far from all do.

For instance, JSON files do not have requirements on the ordering of keys. When making a small change to a JSON file, it is entirely legal to reorder every single key, completely destroying the diff, and an automatic tool may very well do this.

And as a counter-example, we could easily use a binary representation of JSON data, but use a diffing tool that normalises key ordering to provide diffs that are more readable and understandable than a text diff of a JSON file, even without malicious reordering of lines.
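
A minimal Python sketch of that kind of normalising diff, using only the standard library. The two documents are invented; `canonical_lines` is a hypothetical helper that re-serialises with sorted keys before diffing.

```python
import difflib
import json

def canonical_lines(text):
    """Re-serialize JSON with sorted keys and fixed indentation."""
    return json.dumps(json.loads(text), sort_keys=True, indent=2).splitlines()

before = '{"name": "sales", "type": "department", "size": 3}'
# Same data with every key reordered and one value changed.
after = '{"size": 4, "type": "department", "name": "sales"}'

diff = list(difflib.unified_diff(
    canonical_lines(before), canonical_lines(after), lineterm="", n=0))

# Keep only the changed lines, skipping the "---" / "+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

After normalisation the reordering disappears and only the `size` change is left in the diff.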


But this means you need diff tools, version control systems and editors (edit: and merge tools) for every single format, which while nice in theory (hey plugins!), in practice doesn't work well. Text is nice because it is sort of a minimum common denominator.

Of course semantic diffs would be nice; even, or especially, for code for example but while they do exist they haven't seen much uptake because they do not integrate well with existing tooling.


And it's also easier to debug human readable text than it is to debug raw binary. Text is such a good abstraction layer that we almost never have to deal with character encodings, etc. Only problem is that it has to be parsed twice, first from binary to text, then from text to whatever. But computers are good at such tasks, while humans are not.


> Following what you say we should only have binary format.

Only if you ignore the bit where I said "once you hit a moderate level of complexity". Text file formats work really well for simple things.


Yeah, I had to hand-edit some iOS UI stuff when a version diff caused problems. Human readability, even if difficult, is beneficial. Useful for generating diffs as well.


so we shouldn't write HTML by hand, but instead use WYSIWYG editors?


HTML isn't a UI description. It's a document description. HTML + CSS + [browser behavior|Javascript] is the UI description, and arguably some more complex web apps are getting to the point where they've gone past the point where HTML, CSS, and JS are the best formats for describing the way they work.


They never were the best way. Other application GUI systems are less painful because they were designed for that from the beginning. HTML remains OK for documents, though.


> No one should really be writing UI code by modifying a file by hand because UIs are often complex and deep

Yet people do this everyday. So it's not that complex.


This reminds me of the line of thought from the React community. Make everything as complex as possible (vdoms, etc.). But there are other ways, frameworks, styles to achieve what people want without that level of complexity. A "moderate" level of complexity absolutely does not require special tooling to manipulate underlying data.


Another reason why XML might be better for UI Layouts (in addition to typed nodes) is that it allows mixed content, making it more concise in the case where marked up text is included.


While not really JSON, HyperScript[1] is much easier to grok than XML to me. It allows you to define layouts really naturally using standard JS objects. Mithril[2] specifically implements it really well.

1 - https://github.com/hyperhype/hyperscript

2 - https://mithril.js.org/index.html#dom-elements


> XML, really? It’s bloated and outdated. Why not use JSON? It’s the future.

Who says this? XML is a great general markup language that everyone already knows, it's a perfectly valid choice for most things, please use it.

Just don't make yet another bloated outdated markup language like YAML, any "optimisations" turn out to be opinion that doesn't justify the time spent learning a new language.

Do you really want to be known as the guy that made YAML? No you don't.


Reminds me of jasonette (https://jasonette.com/) ...

Whether it is JSON or XML, what I particularly like about both approaches is that both are fundamentally HATEOAS (https://en.wikipedia.org/wiki/HATEOAS)


Implementations of layout in source code tend to be hard to diff from a source control point of view.

Every time you move an existing component into a wrapper, the indentation change breaks how the diff shows up.

Perhaps we need an XML formatter that, instead of pretty-printing, sets the indentation to 0 when we commit code changes. However, that makes the source hard to read.

So instead of a diff that reads like

   - <panel>
   -  <textblock>foo</textblock>
   - </panel>
   + <vpanel>
   +   <panel>
   +    <textblock>foo</textblock>
   +   </panel>
   + </vpanel>

We get

   + <vpanel>
     <panel>
     <textblock>foo</textblock>
     </panel>
   + </vpanel>
   
Another problem is with the end tags, it is difficult as in the previous example to tell which opening tag an end tag is for.

A pretty formatter could solve this problem too.

    <vpanel id="vertical-wrapper">
    <panel id="heading">
    <textblock>foo</textblock>
    </panel><!-- heading -->
    </vpanel><!-- vertical-wrapper -->
Perhaps the level of indentation could be added visually too

    <vpanel id="vertical-wrapper">   <!-- vertical-wrapper  -->
    <panel id="heading">             <!--   heading         -->
    <textblock>foo</textblock>       <!--     textblock     -->
    </panel>                         <!--                   -->
    </vpanel>                        <!--                   -->
but then we bring back the problem of the ugly diffs, gah!


You can pass -b to git diff to get exactly your first result. I would lean toward editor tooling for the latter case, myself.
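
For anyone curious what an indentation-insensitive diff boils down to, here is a rough Python sketch with `difflib`. It is not how git implements `-b`; it only ignores leading whitespace, and the function name `indent_insensitive_diff` is mine. The input is the panel example from the parent comment.

```python
import difflib

before = """<panel>
 <textblock>foo</textblock>
</panel>""".splitlines()

after = """<vpanel>
  <panel>
   <textblock>foo</textblock>
  </panel>
</vpanel>""".splitlines()


def indent_insensitive_diff(old, new):
    """Diff two lists of lines, comparing them with leading whitespace removed."""
    matcher = difflib.SequenceMatcher(
        a=[line.lstrip() for line in old],
        b=[line.lstrip() for line in new],
        autojunk=False,
    )
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Same content apart from indentation: show it as context.
            out += ["  " + line for line in new[j1:j2]]
        else:
            out += ["- " + line for line in old[i1:i2]]
            out += ["+ " + line for line in new[j1:j2]]
    return out


diff = indent_insensitive_diff(before, after)
```

With this input only the two `vpanel` lines come out marked as additions; the re-indented `panel` block is treated as unchanged context.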


This is just a simple case of serialization. Having worked on 10+ game engines and other scene graph based applications I would prefer a more flexible reference based approach.

Instead of hard-coding your hierarchy into the structure of XML or JSON, define some basic type information (unique id and node type).

If you have situations where an object can be referenced before it is defined, simply make a proxy loading type that listens for it to load.

After that it is a cakewalk:

[
 {
  id: "UUID OR Rolling Number Per File OR DB Index OR ...",
  type: "business",
  name: "Dunder Mifflin Paper Company, Inc.",
  description: "look at me I can be a root type for the graph OR one of many!",
  offices: [
   list of ids of objects of type "office" or object that wraps this with other metadata (relation object)
  ]
 },
 ... potentially other business entries,
 {
  id: see above...
  type: "office",
  name: "Scranton Branch",
  description: "This can describe each branch",
  departments: [
   list of ids of objects of type "department" or object that wraps this with other metadata
  ]
 },
 ... other offices,
 {
  id: see above...,
  type: "department",
  name: "sales",
  members: [
   list of ids of objects of type "employee" or object that wraps this with other metadata
  ]
 },
 ... other departments,
 {
  id: see above...,
  type: "employee",
  title: "manager",
  name: "Michael Scott",
  reports: [
   list of ids of objects of type "employee" or object that wraps this with other metadata
  ]
 },
 {
  id: see above...,
  type: "employee",
  title: "sales man",
  name: "Dwight Schrute",
  reports: [
   id of self since Dwight needs someone he can trust,
   {
    id: see above...,
    type: "report_meta",
    reportee: id of Pam,
    visibility: "secret"
   }
  ]
 },
 ... other employees,
 etc.,
 etc.
]
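
A runnable take on the flat, id-referenced layout above, as a Python sketch. The concrete ids, the `REFERENCE_FIELDS` tuple and the `link` helper are all hypothetical. Note that this swaps the proxy/listener idea for a simpler two-pass approach (index everything, then resolve), which is enough when the whole file is available up front.

```python
import json

# Hypothetical flat node list; ids and relationships are illustrative only.
flat = json.loads("""
[
  {"id": 1, "type": "business", "name": "Dunder Mifflin Paper Company, Inc.", "offices": [2]},
  {"id": 2, "type": "office", "name": "Scranton Branch", "departments": [3]},
  {"id": 3, "type": "department", "name": "sales", "members": [4, 5]},
  {"id": 4, "type": "employee", "name": "Michael Scott", "reports": [5]},
  {"id": 5, "type": "employee", "name": "Dwight Schrute", "reports": [5]}
]
""")

# Fields whose values are lists of ids pointing at other nodes.
REFERENCE_FIELDS = ("offices", "departments", "members", "reports")


def link(nodes):
    """Replace id references with the node objects they point at."""
    # First pass: index every node by id, so definition order does not matter.
    by_id = {node["id"]: node for node in nodes}
    # Second pass: resolve references, including forward and self references.
    for node in nodes:
        for field in REFERENCE_FIELDS:
            if field in node:
                node[field] = [by_id[ref] for ref in node[field]]
    return by_id


graph = link(flat)
```

Because references are resolved to shared objects rather than copies, cycles such as Dwight reporting to himself work without special handling.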


The JSON examples are not even valid: the "$reports" properties on department are written as comma-separated lists without enclosing array brackets.

I guess XML might be perfect for when you can't decide between singular and plural and want to leave the problem to others...

Because that's the biggest problem, besides charset encoding, that JSON helps solve: in JSON, you can be unmistakably clear that a given property is to be interpreted as a list of values, even when the current number is less than two. In XML you can only vaguely communicate that intent with an intermediary "plural element", unless you go full schema.

  <A><Bs><B/></Bs></A>
isn't half as clear as

  {"bs":[{}]}
(attributes unrelated to this difference omitted)
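
Here is a small Python sketch of how the singular/plural problem bites in practice. The converter `naive_xml_to_dict` is a deliberately naive, schema-less one written for this example; it is not the behavior of any particular library.

```python
import json
import xml.etree.ElementTree as ET


def naive_xml_to_dict(element):
    """Schema-less conversion: a repeated child tag becomes a list,
    a child tag that appears once stays a single value."""
    result = {}
    for child in element:
        value = naive_xml_to_dict(child)
        if child.tag in result:
            existing = result[child.tag]
            if not isinstance(existing, list):
                # Second occurrence: promote the single value to a list.
                result[child.tag] = [existing]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


one = naive_xml_to_dict(ET.fromstring("<A><B/></A>"))
two = naive_xml_to_dict(ET.fromstring("<A><B/><B/></A>"))

# The JSON form states "this is a list" even with a single entry.
from_json = json.loads('{"bs": [{}]}')
```

Without a schema the converter cannot know that a lone `<B/>` is a one-element list, so `one["B"]` is a dict while `two["B"]` is a list, and consumers have to handle both shapes.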


If you are going to make the case for one being better than the other then show the same data in 2 different formats side by side.

The article only shows 1 example in 1 format and then moves onto another point.

The reader is left to imagine what the difference is in their head.


I really like the idea of a constrained language for UI declarations, like the angularJS/vue templates and XML instead of UI in code. It makes generating test code for the UIs tractable.

Here's my take on what's possible for generating support code for UI tests when the UI is written in a declarative, easily-parseable language: https://samsieber.tech/posts/2019/06/type-safe-e2e-testing-d...


JSON is a data structure. XML is a text markup. Don't mix them up.


XML is also a transform specification with XSLT (https://en.wikipedia.org/wiki/XSLT).

XML is also a query language with XQuery (https://en.wikipedia.org/wiki/XQuery).

XML is also a nested structure path specifier with XPath (https://en.wikipedia.org/wiki/XPath).

XML is also a file format with SVG (https://en.wikipedia.org/wiki/Scalable_Vector_Graphics), OpenOffice XML (https://en.wikipedia.org/wiki/OpenOffice.org_XML), OpenDocument (https://en.wikipedia.org/wiki/OpenDocument), ePUB (https://en.wikipedia.org/wiki/EPUB), DocBook (https://en.wikipedia.org/wiki/DocBook), and many others (https://en.wikipedia.org/wiki/List_of_XML_markup_languages).

XML is also a data/object serialization format with libraries like Jackson (https://github.com/FasterXML/jackson), YAXLib (https://github.com/sinairv/YAXLib), Boost (https://www.boost.org/doc/libs/1_38_0/libs/serialization/exa...), Pyxser (https://github.com/dmw/pyxser), and many others.

XML may not be as elegant as JSON, but it has a wide variety of uses and wide adoption across many programming languages (libraries), and disciplines.


I was going to make the same post, upvoted you instead!

What's nice about JSON is that there's very little ambiguity in how to serialize / deserialize data structures. That's what's so powerful about it. As a result, using "magic" serializers with language-native data structures often works well and has few gotchas.
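
As a quick illustration in Python (the record and its field names are made up), a plain dict of strings, lists, booleans and nulls survives a round trip through the standard `json` module unchanged:

```python
import json

# Hypothetical record built only from JSON-native types.
record = {
    "name": "Michael Scott",
    "title": "manager",
    "favorite_movies": ["Diehard", "Threat Level Midnight"],
    "reports": [],       # an empty list stays a list
    "active": True,
    "nickname": None,    # maps to JSON null and back
}

round_tripped = json.loads(json.dumps(record))

# The gotchas that remain are small: tuples come back as lists and
# non-string dict keys come back as strings.
tuple_back = json.loads(json.dumps((1, 2)))
int_key_back = json.loads(json.dumps({1: "a"}))
```

No mapping configuration is involved: the structure of the data and the structure of the document are the same thing.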

In general, I think a lot of people hopped onto the XML bandwagon assuming that it would always sanely translate back and forth between XML and native data structures. This is not the case. When XML is the (cough) "best" format, the underlying data structures of different programs operating on the XML will look extremely different. Furthermore, XML terminology will end up leaking into business logic, because the semantics of the document are inseparable from handling it.

In general, I think the biggest sign that XML is the wrong choice is when someone's using a "magic" serializer, or when the data structures very closely follow the document structure. In those cases, JSON was probably a better choice.


I have yet to read anything that convinces me that XML is good for anything. Just because XML in certain cases is less bad than some other cherry-picked technology, doesn't mean that there aren't other better options.

See also Erik Naggum's legendary rant: https://www.schnada.de/grapt/eriknaggum-xmlrant.html


Well, for instance, let's take the HTML code of your comment (which happens to be a valid XML document):

    <span class="commtext c00">I have yet to read anything that convinces me that XML is good for <i>anything</i>. Just because XML in certain cases is less bad than some other cherry-picked technology, doesn&#x27;t mean that there aren&#x27;t other better options.<p>See also  Erik Naggum&#x27;s legendary rant: <a href="https:&#x2F;&#x2F;www.schnada.de&#x2F;grapt&#x2F;eriknaggum-xmlrant.html" rel="nofollow">https:&#x2F;&#x2F;www.schnada.de&#x2F;grapt&#x2F;eriknaggum-xmlrant.html</a></span>
What would be the JSON/YAML/TOML equivalent?


In order for me to answer that, you must be more specific about the purpose of the representation.

For the purpose of authoring comments on Hacker News (HN), I for one am grateful that HN doesn't make us type out HTML. I also doubt very much that HN stores our comments in a different representation (e.g. HTML) than the one they were authored in.


That's a very good point. Thing is, markdown (or a subset of it, as on HN) is very limited. Great for short comments, bad when you want to do something a little more complex, like a blog post (I always end up adding some HTML here and there in my markdown blog posts, when I want to include a video for instance).

The other problem is that HN comments and all those markdown-like formats are very ad hoc. If I copy/paste my reddit comments on HN, or the other way around, it won't always work as expected. In both cases, it has to be translated to HTML to be displayed by the browser, too.

So, the context would be: an export format for complex, multi-paragraph textual documents with meta-information.


> So, the context would be: an export format for complex, multi-paragraph textual documents with meta-information.

If you add the condition that it be human-readable, I admit I am not able to point to an existing format that is obviously better suited for this than XML.


The HTML doesn't represent the classes properly. The LaTeX would take a simpler approach:

    \begin{spanned}
       I have yet to read \textit{anything}...
    \end{spanned}
Since there's no Javascript, probably no need for the class variables. And, of course, you can just write your own macros as you need them.


Side note: You've overdone the character entities a bit more than necessary.


I know, not my fault, I just copy/pasted the actual source code of the page while I was commenting.


Before reading this please note that I am not advocating for or against XML.

A well reasoned XML instance can replace HTML for most things from presentation, accessibility, interaction, content negotiation, and so forth. You can apply JavaScript and CSS immediately to an XML instance. You don't get any of that capability with JSON, YAML, or anything similar.

I really think the people who hate XML the most are those who lack the imagination to move beyond raw data and query relationships. I agree XML isn't good at this, which is why it completely offloads that capability to a sibling technology: DOM.

I have also never seen anybody with a comfortable understanding of the DOM believe JSON, YAML, and the like fill that void, but then these other technologies do not primarily exist to structure human consumable content directly for human consumption. The most important part of the article you linked to describes markup relevance as a percentage of syntax overhead. The more the described content becomes an extraction of computer oriented data the more that percentage goes up thus making XML progressively a bad decision, but the opposite is also true.

The article really nails its bias in this regard by fixating on data as a facet opposed to information as a structure. When the goal is to provide primarily a structure, as opposed to fundamentally offering a syntax, the cost to scale grows inversely to the quantity of content provided. That is largely thanks to lexical scope, which allows a richer interpretation of context without any additional or specified syntax. That is the nature of information versus data. Conversely, data conveyance schemes that exist to primarily offer a syntax scale proportionally to the content provided because the ratio of syntax to content is static without any additional meaning.

For some clarity on the difference between data, information, and knowledge I suggest the DIKW model: https://en.wikipedia.org/wiki/DIKW_pyramid


How about simple DSL layouts?


Do you mean a simple DSL for layouts?


Android UI is XML-based, and it works relatively well there.


From their documentation:

> HXML does not use CSS.

I wonder why? XML is perfectly stylable with CSS. Why not let the user reuse their knowledge?

Somewhat canonical: https://www.w3.org/Style/styling-XML.en.html

Also: https://duckduckgo.com/?q=xml+css


If it's purely a choice between XML and JSON, I'd agree that XML is a bit better. I'd prefer to use neither.

Most of these UI layout formats are essentially complex domain specific languages, paired with IDE support. I don't see any particular need to base them on XML, which was intended for marking up documents.


The XML example seemed odd to me as written. In the JSON case, yes there was a prefixed key "$reports", but it told me what the relationship was. Looking at the XML, what does it mean for an Employee to be nested inside another Employee?


Sounds like the list of employees should have been nested in a <reports> and not left directly inside the employee.


Where Tk beats everything else. "pack" and "grid" are still way ahead of the established layout stuff. And Tcl is the perfect language for Tk. Sadly people are put off by its seemingly strange syntax and semantics...


This doesn't seem to solve any problem related to UI, because a tree can be converted to a list and vice versa. The only added advantage is that a tree structure is natural.

But the article is put forward as if we cannot achieve a tree structure using JSON.


All this talk about representing UI elements in JSON, XML and/or code, but what about state? How the hell is it neatly encapsulated next to the UI in XML or JSON? Dynamic URLs? Dynamic IDs, dynamic image URLs?


XML is old hat. We should obviously be defining configurations using JSX now.


XML vs JSON (vs code) devolves into bike-shedding because it positions the discussion in the wrong place. Developing layout in plaintext, rather than a WYSIWYG editor, is a priori impedance-mismatched.


I find JSON better for random lookups, and XML good for sequential reads. UI falls into the latter basket: start at the root, read the children, etc.


XML also supports id references, which are built into the parser - to my knowledge, there's no JSON equivalent (nor, do I think, could there be).


No mention of XAML.


I am not familiar with Xcode: is there an easy way to get the exploded 3D view of DOM in a browser? Maybe a browser extension?


Firefox used to do this, but looks like they haven't for a while. Guess it's been longer than I thought since I used it.

https://developer.mozilla.org/en-US/docs/Tools/3D_View


Yes, XML flavor beats JSON in UI layouts. We call that flavor HTML.


Everybody knows that Yaml is the one true file format!


The ability to extend an existing element with attributes is super useful in XML. To do an equivalent in JSON is quite ugly.
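
A small Python sketch of what that looks like. The `lang` attribute, the `value` key and the `title_v1` helper are hypothetical, chosen only to show the shape change.

```python
import json
import xml.etree.ElementTree as ET

# XML: start with a plain text element, then attach metadata as an attribute.
movie = ET.fromstring("<Movie>Threat Level Midnight</Movie>")
text_before = movie.text
movie.set("lang", "en")  # hypothetical attribute, for illustration
text_after = movie.text  # the element's content is untouched
xml_out = ET.tostring(movie, encoding="unicode")

# JSON: a bare string has nowhere to hang metadata, so the value has to be
# wrapped in an object, which changes its type for every consumer.
doc_v1 = json.loads('{"movie": "Threat Level Midnight"}')
doc_v2 = json.loads('{"movie": {"value": "Threat Level Midnight", "lang": "en"}}')


def title_v1(doc):
    """A consumer written against the original (string) shape."""
    return doc["movie"]
```

Code that reads `movie.text` keeps working after the attribute is added, while `title_v1` now gets a dict instead of a string when handed the extended JSON document.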


XML in most cases is a pain in the ass to deal with relative to JSON and it isn't new or the rage anymore. A lot of hot technologies initially have this in common. They're so cool and flashy and new that everyone's willing to overlook the actual mechanics of working with them -- for a while. JSON is just more convenient to work with 90-95% of the time, so everyone ends up preferring it and so it tends to win out.


One thing that's contrary to how XML is historically used, but helps a lot: do not use child elements where an attribute suffices.

E.g., don't do this, from the article:

  <Users>
    <User>
      <FirstName>Michael</FirstName>
      <LastName>Scott</LastName>
      <FavoriteMovies>
        <Movie>Diehard</Movie>
        <Movie>Threat Level Midnight</Movie>
      </FavoriteMovies>
    </User>
    ...
  </Users>

but write it like this

  <Users>
    <User FirstName="Michael" LastName="Scott">
      <FavoriteMovies>
        <Movie Title="Diehard" />
        <Movie Title="Threat Level Midnight" />
      </FavoriteMovies>
    </User>
    ...
  </Users>  
Much easier to query for a specific element (eg, try to XPath or CSS Selector query for all of Michael Scott's favourite movies with the first syntax), and the schema is a lot easier to write and reason about

(I don't have to wonder if multiple LastNames might be allowed, if I can set an xml:lang attribute on any of them, if the ordering of FirstName and LastName elements is constrained, etc.)
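
For comparison, here are both queries in Python, using the XPath subset that `xml.etree.ElementTree` supports. This is only a sketch against the two snippets above; a full XPath engine would offer more options for the element style.

```python
import xml.etree.ElementTree as ET

element_style = ET.fromstring("""
<Users>
  <User>
    <FirstName>Michael</FirstName>
    <LastName>Scott</LastName>
    <FavoriteMovies>
      <Movie>Diehard</Movie>
      <Movie>Threat Level Midnight</Movie>
    </FavoriteMovies>
  </User>
</Users>
""")

attribute_style = ET.fromstring("""
<Users>
  <User FirstName="Michael" LastName="Scott">
    <FavoriteMovies>
      <Movie Title="Diehard" />
      <Movie Title="Threat Level Midnight" />
    </FavoriteMovies>
  </User>
</Users>
""")

# Element style: the predicates match on the text of child elements.
from_elements = [
    movie.text
    for movie in element_style.findall(
        "./User[FirstName='Michael'][LastName='Scott']/FavoriteMovies/Movie"
    )
]

# Attribute style: the predicates match on attributes of the User element.
from_attributes = [
    movie.get("Title")
    for movie in attribute_style.findall(
        "./User[@FirstName='Michael'][@LastName='Scott']/FavoriteMovies/Movie"
    )
]
```

Both work in XPath, but only the attribute form carries over to CSS selectors, which can match on attributes and not on text content.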



