Why is JSON so popular? Developers want out of the syntax business (stereolambda.wordpress.com)
94 points by fogus on March 19, 2010 | 113 comments



JSON is popular because there is a straightforward mapping from JSON to native types in many programming languages.


More fundamentally, JSON embodies some fundamental metaphors we use in data types - numbers, strings, lists, and maps. Those are a lot more intuitive/familiar than whatever metaphors XML comes up with.
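For instance, the round trip in Python is a one-liner (a sketch using the standard json module; the record is made up):

  import json

  doc = json.loads('{"name": "John Smith", "age": 40, "tags": ["a", "b"]}')
  print(type(doc))       # <class 'dict'>  -- a JSON object becomes a plain dict
  print(doc["tags"])     # ['a', 'b']      -- a JSON array becomes a plain list
  print(doc["age"] + 1)  # 41              -- a JSON number becomes a plain int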


Language designers take note: there is tremendous utility in making entities in your language isomorphic to entities expressed in previous languages.

I am actually using such isomorphisms in my current porting project. I'm literally getting a couple of orders of magnitude more productivity out of this over porting by hand.

Also note: less syntax makes this easier.


I recently wrote a program to do Threefish encryption which I deliberately wrote to look as much like the mathematical notation in the spec as possible. That made things so much easier that it was almost ridiculous, even if it made the resulting program very peculiar-looking.


You'll love what Alan Kay has been working on: http://vpri.org/pdf/tr2007008_steps.pdf

You'll be especially enamored of their 200-line TCP/IP implementation, introduced on p17 and reproduced with documentation starting on p44. It's implemented as a grammar in their meta-language that parses the ASCII-art diagrams in the RFCs and executes them.


Observing the industry over the last couple of decades, I'm left with the feeling that sometime in the last ten years or so we all went XML crazy. The popularity of JSON is just the pendulum starting to swing the other way.

So count me in. I'll use JSON over XML for server-webclient data exchange whenever I can. It's just much easier.


The funny thing is, we all knew we were going XML crazy while it was happening. I can remember anti-XML bloat articles going way back. The thing with XML was, it DID solve some problems, it just tried too hard.

The good thing about XML was it was a standard interchange format that every language had at least a couple parsers for. That said, I'm not sure if it was worth the annoyance and irritation of using stuff like XML Schema, DTDs, SOAP, or XSLT.


As one of the earlier commenters pointed out, lisp welcomes you to the 1960s.


Sorry we were late to the party, guys, we just got so wrapped up in writing software that people actually use.


You are using some lisp software right now.


The Lisp programmers were writing airline reservation systems... the other programmers were writing board-game generators.


Ah, that one example of a successful Lisp project.


... while the lispers were still counting the closing parens needed to finally get their code running

SCNR


This thread has been yet another fruitful contribution to the language wars.

There seems to be something inherent in us that feels the need to turn our tools into our religion. It's not terribly productive.


no. but fun.

SCNR is an old usenet acronym and stands for "sorry, could not resist". I was a) being sarcastic and b) just making fun for the lisp crowd. I totally agree that if the parens issue is a non-issue for you and you feel comfortable and productive writing lisp code, then go ahead.

In fact, I have the utmost respect for people able to wrap their heads around that syntax.


If you really think "wrapping your head around" the almost non-existent syntax of Lisp is a reason to respect someone, then I suggest you might delve a little further in your CS/programming studies and find many more wonderful and interesting things.

For the true rockstar coder "wrapping your head around" any reasonable syntax should be a triviality.


For the true rockstar coder "wrapping your head around" any reasonable syntax should be a triviality.

Never had the dubious pleasure of trying to decipher another person's APL code, eh?

Then again, you did say "reasonable"...


You don't have to tell me about APL syntax. For our senior year CS project, we implemented APL with unlimited precision arithmetic.

Wrapping your head around APL wins my respect. Lisp? You get a pat on the head for that, maybe.


Only the ones too clueless to use Emacs or another editor that does it for you.


Maybe it goes back to my frustrations with Dreamweaver, but I always felt like an IDE only served two purposes: to prevent developers from really learning a language (autocomplete, intellisense, etc.) and to make up for poorly designed, overly verbose syntaxes.

Here we are in 2010 and most programming languages still contain tons of semicolons, parens, and curly braces, and the only reason I can find for this is:

* Language Y looks like language X because everyone's comfortable with language X's syntax already

which is contradicted by: * Having trouble with curly braces and semicolons? Your IDE will do that for you.

We need IDEs to deal with ugly syntax, we have ugly syntax because we've always had ugly syntax, and then we use IDEs to write the ugly syntax.

And then I hear people deride VB, Ruby, and Python for their lack of curly braces and parens.

So, I don't think IDEs are the answer. I think better syntaxes are the answer. For example, I used Textmate's easy HTML completion shortcuts for a long time. Then I found HAML, and I didn't need to close tags anymore. I use Sass, and now my CSS doesn't need ; or {} anymore. Same with CoffeeScript.

C#, Java, Lisp, Perl, JavaScript are all great powerful languages, but I just get really uncomfortable hearing people defend the syntax with "just use your IDE". We're programmers. How come we're not interested in saying "hey, let's fix these things!"? If the argument is "IDEs make you more productive in language X" then doesn't that say something about language X?

Just some thoughts.


Well, that's quite an extrapolation from my recommendation to use automation to match brackets/parens.

Python is almost there in terms of language simplicity, but I'd much rather count parens than count spaces/tabs. The former are visible. The latter are only visible through secondary effect.

Any tool can be abused. Not using automation for truly mindless tasks like counting parens is often just a knee-jerk reaction that precludes real thinking about the cost/risk/benefit of the tool.

Programming has always been full of idiots who turn off their brains. Maybe modern tools make this a bit too easy. I'd posit that one way to detect programmers with such proclivities is to discuss tools with them, and see if they give you back unsupported prejudice or reasoned analysis. (Really be wary of the ones who try to pawn off prejudice as reasoned analysis!)


For some reason, every time I let an editor automatically insert a closing parenthesis, quotation, or brace, I almost always end up with extras at the end, which is just as frustrating as, if not harder to debug than, not having enough. Maybe it's the way I tend to backstep as I think my way through the code flow, but it never ceases to happen whenever I set foot in Eclipse or Visual Studio...


For some reason, every time I let an editor automatically insert a closing parenthesis, quotation, or brace, I almost always end up with extras at the end

User issue here. I never have the problem you refer to. Don't have the editor/IDE insert the parens/brackets. Have it check them for you. Emacs can do a momentary highlight of the opening paren/bracket whenever you type the close of the pair. I used to use that to make sure I'm writing exactly what I thought I was writing. If my code is too convoluted for me to know that I'm correct in a split second when seeing the open paren highlight, then I know it's time for me to refactor/rewrite the method.

In my present project, I use a similar facility in the Smalltalk browser.


"Don't have the editor/IDE insert the parens/brackets. Have it check them for you. Emacs can do a momentary highlight of the opening paren/bracket whenever you type the close of the pair."

NO. Sorry but you're doing it wrong. Relying on the editor to highlight parentheses is a highly inefficient and error-prone way to work. It's distracting, it wastes mental cycles and inhibits flow.

The correct way is to configure emacs so that when you press a certain key (in my case, right shift because I configured my keyboard so I can activate left shift with my thumb), emacs inserts "()" and puts the cursor in the parentheses. No need to remember or count closing parentheses this way.

It's been a really long while since I last counted parentheses for ANYTHING. Like, let's say I write a let with one variable, and the value is a really deeply nested expression, and now I'm done with it and I want to write the body of the let. What do I do? I DON'T COUNT PARENTHESES. I go back to the opening parenthesis for the variable declarations of the let, then I skip over that form (the variable declarations) with C-M-f (forward-sexp), right to the body of the let. Then I use C-j (newline-and-indent) and then I can write the body of the let. (If the value form is really long I might skip a few parentheses and use C-M-b to get back at or near the opening parenthesis of the variable declarations.)

So, executive summary: you walk the structure but you NEVER count parentheses. Like, once every four months I might screw up and delete a parenthesis and then the editor and I are both pretty confused, but it's such a total non-issue. And even then, I don't count parentheses much, because I try to walk the structure by skipping over forms and see what works and what doesn't, and I might tell emacs to reindent some of the code and I immediately see what's wrong. With a proper structure editor, unbalancing the parentheses couldn't even happen.


NO. Sorry but you're doing it wrong. Relying on the editor to highlight parentheses is a highly inefficient and error-prone way to work. It's distracting, it wastes mental cycles and inhibits flow.

The correct way is to configure emacs so that when you press a certain key (in my case, right shift because I configured my keyboard so I can activate left shift with my thumb), emacs inserts "()" and puts the cursor in the parentheses. No need to remember or count closing parentheses this way.

No, sorry you are doing it "wrong!" :) I also use the insert () technique. I basically use whichever technique is optimal in context. Please note that your assertions about flow and distractions are highly subjective.

http://www.cyberbore.com/puzzle/klok.html

Incidentally, I no longer use emacs, and I don't write Lisp.


If there's backing to the assertion that I'm doing it wrong in there, I didn't find it.

Incidentally, I use emacs and Lisp every day.


Yes, so your coding context is very different from mine. I don't think you can make a very cogent claim of "wrongness."


Put differently: if the goal is making money, lisp is frequently not the best choice.


Yeah, sure. S-expressions are literals in Lisp, and can be "parsed" by calling eval. But I think that misses the larger point made by the OP. JSON is really useful even in languages other than Javascript. JSON wins over XML because XML is too complex and thus ambiguous. But JSON wins over s-expressions too, because s-expressions are too simple. How do you represent John Smith as an s-expression? Sure, it can be done. But as with XML, software that consumes s-expressions has to know how to interpret their structure, even if it doesn't have to parse the surface syntax.


It's true that s-expressions don't give you a literal syntax for hashmaps or arrays, but note that some lisps (e.g. Clojure) do support these natively … and in a syntax that's better than JSON since you don't have to type all those annoying commas ;-)


You know, I never quite got that comma thing. For instance in Python, the only reason it is necessary is that {"foo": "bar" "baz", 'k2':'v2'} is the same as {'foo':'barbaz', 'k2':'v2'}. Getting rid of auto-concatenation allows for {'foo':'barbaz' 'k2':'v2'}, which is pretty easy to parse based on tokens and ':'. It also eliminates a pretty common bug, in which there is one kv pair per line and each line ends with ',', except for the last line, which does not have the ',' (the following line has a '}'). The common bug being adding a new kv pair and missing the ','.
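To make that concrete (a Python sketch; the dict is just an example), the auto-concatenation behaviour means a missing comma silently merges two values instead of being a syntax error:

  # adjacent string literals concatenate, so the missing comma goes unnoticed
  d = {"foo": "bar" "baz", 'k2': 'v2'}
  print(d)   # {'foo': 'barbaz', 'k2': 'v2'}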


Python lets you put the commas before the items, like so:

  { 'foo':'barbaz'
  , 'k2':'v2'
  }
Ruby doesn't like that though. Both Python and Ruby will ignore an extra trailing comma on the last item, which gives you another way of avoiding the common bug you describe.


I've never written any Ruby, but I refuse to believe that Ruby cares about which line has the comma. Am I misunderstanding something?


> It's true that s-expressions don't give you a literal syntax for hashmaps or arrays, but note that some lisps (e.g. Clojure

CL's reader is especially rich, and it's programmable....


> Yeah, sure. S-expressions are literals in Lisp, and can be "parsed" by calling eval.

No on both counts.

(1) S-expressions aren't literals.

(2) s-expressions are not "parsed" by calling lisp eval. Lisp eval's argument is an s-expression - it doesn't parse anything. (It also doesn't read anything.) Lisp read turns character sequences into s-expressions.

> But JSON wins over s-expressions too, because s-expressions are too simple. How do you represent John Smith as an s-expression?

If you'd like, exactly the same way you'd represent it in JSON, because lisp's read handles a superset of the JSON datatypes.

> software that consumes s-expressions has to know how to interpret their structure

As does JSON.

Take your example John Smith. What JSON datatype do you expect to get? (JSON just has numbers, strings, arrays, booleans, and hashes, where the keys have a restricted format.)


Not enough data types in CL or Scheme's reader (Clojure wins here) and no standard between the lisps.

But, there was something Lisp did right that no other mainstream scripting language has - it separated the reader and the evaluator. You can eval Perl or Ruby or Python, but not securely read it. This goes for Javascript too - you can eval a JSON, but you'd be an idiot.


> Not enough data types in CL [reader]

Huh? CL's reader handles structs, hashes, and vectors in addition to strings, lists, atoms (with packages), lots of number formats (including Roman), and the ability to express DAGs, not just trees (yes, cyclic too). Plus some other things that I forget. (Arrays?)

What else do you want to read?

No matter - CL's reader is programmable....


You can do hashes in the CL reader? How? I don't recall being able to, from back when I was trying out CL.

It's probably bad you can express cyclic graphs. That means you'll have to watch for denial-of-service attacks phrased as cyclic data.

Programmable reader isn't the point. That gets you back into defining your own syntax. The point is to have a standard.


> It's probably bad you can express cyclic graphs. That means you'll have to watch for denial-of-service attacks phrased as cyclic data.

You can turn it off. However, if cyclic graphs (or general DAGs) are important, supporting them means that you don't have to roll your own.

> Programmable reader isn't the point. That gets you back into defining your own syntax. The point is to have a standard.

The writer and reader have to agree no matter what you do. A programmable reader means that you don't have to roll your own in more cases. And, it makes it easier to test the third-party writers. (The reader folks can just publish the read-table.)


On a similar note, I'm so happy REST won over WSDL.


In what industry is that? It sure isn't in those of my clients :/


That's a false dichotomy. JSON-RPC FTW!


The "XML requires you to build your own parse tree" argument isn't valid; XML libraries are fully capable of handing you a DOM-style tree, and of allowing you to pull things out of the tree without writing your own traversal code.

JSON just assumes messages are going to trivially fit into naive data structures, and so provides fewer options.


I'm not sure if you are complaining about this aspect, but I would observe that "fewer options" is actually the feature here, not the bug.

A generic XML DOM is still complicated to deal with. Even if you do the "right thing" and use XPath, you still have to deal with XPath because you can't get around the fact that you have an underlying representation that has at least two dimensions (attributes vs. CDATA). That is, just as the article says, you have more degrees of freedom in how you represent your data, and what is a "degree of freedom" but a near synonym of "dimensionality"? You can't abstract around dimensionality very effectively without losing fundamental capabilities in the underlying component (in fact a staggering number of abstraction failures in general can be shown to come from exactly this problem if you really learn to think this way), and the complexity comes poking out in the XPath. It's still better than groveling over the DOM yourself, but it's probably also the absolute peak of concision that is obtainable; there will be nothing better.

JSON is indeed simpler in that you don't really have 3 or 4 feasible choices per attribute; {"first-name": "John", "last-name": "Smith"} is pretty much your choice, full stop. That leaves the underlying library fundamentally, not accidentally, simpler. This can get you into some trouble in some cases, for instance XML is a better choice for HTML-type tagged text as the JSON for tagged-text is just hideous (and, interestingly, reopens the dimensionality problem as there is no one obvious solution), but many things are fundamentally simpler than tagged-text.
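To make the dimensionality point concrete (a hedged Python sketch using only the standard library; the record is invented): the same data needs different access code depending on whether the XML author chose attributes or child elements, while the JSON form has essentially one shape.

  import json, xml.etree.ElementTree as ET

  as_attrs    = ET.fromstring('<person first-name="John" last-name="Smith"/>')
  as_children = ET.fromstring('<person><first-name>John</first-name>'
                              '<last-name>Smith</last-name></person>')
  print(as_attrs.get('first-name'))           # attribute access
  print(as_children.findtext('first-name'))   # child-element access

  person = json.loads('{"first-name": "John", "last-name": "Smith"}')
  print(person['first-name'])                 # the one obvious access path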

If you want to pick up a defined serialization format, my gut would be to say to default to JSON and back to XML if you really need it for something... but be aware that you may, and it's no better to try to jam JSON on top of a fundamentally XML problem. (Besides, your JSON can carry bits of XML in it without much pain, so "best of both worlds" is perfectly feasible.)


Whoah wait hold on a sec. I'm not sticking up for XML. I'm just saying that one argument isn't valid. I'd use JSON.

(Although I like it when my target web apps use XML; better tools support for attacking them.)


Righty-o, like I said I wasn't sure. :) But I figured it was still worth posting. I don't see much level-headed analysis of the issues. Too many devs got burned by XML then can't help but get a little fanboy-ish over JSON, which has distorted the dialog a bit, I think. And I like getting the idea of API dimensionality out there.


The only thing I miss when using JSON is CSS-style selectors for finding data in a nested structure.

With json I have to resort to imperative means and care about the middle layers. That means refactorings are more likely to break software.

Fundamentally there is no reason why JSON cannot have its own selector which traverses the tree. There just isn't one that I know of, though.


Actually, it's really easy to write, if you want to. I haven't had any need in my big JSON project (everything ends up pretty hierarchical, so any query like "give me all the object attributes that are 'xyz'" isn't useful), but if I needed one it would be easy to write. A good learning exercise for recursion, if you're not familiar with it.
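For example, a minimal sketch of that kind of selector in Python (the function name and sample document are made up): recursively collect every value stored under a given key anywhere in a nested structure.

  def select(node, key):
      found = []
      if isinstance(node, dict):
          for k, v in node.items():
              if k == key:
                  found.append(v)
              found.extend(select(v, key))
      elif isinstance(node, list):
          for item in node:
              found.extend(select(item, key))
      return found

  doc = {"person": {"name": "John", "friends": [{"name": "Jane"}]}}
  print(select(doc, "name"))   # ['John', 'Jane']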


What is the advantage of XML over JSON for nontrivial, non-naive data structures? Do you mean things like XLink/XPointer? (Or where can I go to read more about this?)

I've gotten the informal impression that incidental complexity (you know, complexity arising not from the problem but the solution) is a factor behind the hugeness of the XML ecosystem, but I'd be happy to learn otherwise...


> I've gotten the informal impression that incidental complexity (you know, complexity arising not from the problem but the solution) is a factor behind the hugeness of the XML ecosystem

You are absolutely correct.

As Phil Wadler put it: "The essence of XML is this: the problem it solves is not hard, and it does not solve the problem well."

It makes a trivial issue into a byzantine enterprise.

But, it has created a whole industry of 'experts', standard committees and other busybodies, so it must be good for the economy at least!


Yeah, but the downside to doing that with XML is it's frickin crazy. I have to parse this whole thing into memory and then pull out the part I want?

The CPU is spending more time dealing with XML than it is doing useful work. If it's a big file, this is very significant.

In JSON, you get the same expressiveness without the hassle.


I'm not sure what you think your JSON parser is doing.


Come on. Any one of us could write a JSON parser in a page of code. An XML parser is significantly more work, due to the overcomplexity of XML; more work to write, more work for the CPU to run, and the resulting parser is larger. In fact I'm struggling to think of something XML has that is good. (And please no one respond with "it's extensible" or I may explode).

An XML parser has to track open tag names; with JSON it doesn't matter. XML has all these stupid entities like &amp; which look ugly and need to be parsed.

I think "time to write a parser" should be a good metric on how sane a data format is. The fact that writing an XML parser that covers all bases/eventualities is a major undertaking says alot about the data format.


XML brings a lot of other baggage to the party. Some of it can be useful. I particularly like XML namespaces, because when used properly there's hardly an equivalent in any other standardized serialization format I know. (And part of the problem is that you really need it standardized at the serialization format level for it to work; you can hack things into any other format but by definition you're not doing it in a standard way.)

However, that statement should be understood through the filter of the fact that I've only seen one thing that uses XML namespaces properly, and that's XHTML. Everything else I've seen gets it wrong, and that includes most things trying to deal with XHTML....

You also get a "free" and modestly powerful validation system, a serialization format that has seriously thought through encoding issues and has answer for them (JSON does too, but a lot of other fly-by-night stuff doesn't), a fairly powerful format for tagged text (JSON-tagged text is a hack no matter how you slice it). You also get XSL, which floats some people's boats, though I wouldn't be caught dead working in it.

If you don't need any of that, don't use it. I don't very often. But when you need it, do. Also:

"An XML parser has to track open tag names, with JSON it doesn't matter."

This is equivalent to JSON needing to track {, [, ', and ", among other things. That's just parsing; both JSON and XML need to be parsed. That's not an advantage.

"XML has all stupid entities like & which look ugly and need to be parsed."

This is equivalent to the escape sequences in JSON (http://json.org/string.gif). They also need to be parsed; they do not magically turn into bytes without that.


>> This is equivalent to JSON needing to track {, [, ', and ", among other things. That's just parsing; both JSON and XML need to be parsed. That's not an advantage.

JSON can simply count brackets. That makes for a very simple parser indeed. XML needs to cope with invalid nesting, end tag names not matching start tag names etc. End tag names are just wasted space.

>> This is equivalent to the escape sequences in JSON: http://json.org/string.gif They also need to be parsed, they do not magically turn into bytes without that.

But those are sane. We all escape double quotes and backslashes in pretty much every programming language. They make sense in a very simple encoding.

Why should I need to replace & with &amp;? They seem arbitrary, and the replacements aren't simple. &quot;?? Seriously? You're naming characters with odd abbreviations, and then expecting people to remember those? Why not just escape them with a prefix such as erm.... "\"


> XML needs to cope with invalid nesting, end tag names not matching start tag names etc.

I think you read a different spec ;) http://www.w3.org/TR/REC-xml/#sec-logical-struct :

    Well-formedness constraint: Element Type Match

    The Name in an element's end-tag MUST match the element type in the start-tag.
An XML parser rejects the document as soon as this occurs. No different than with `{]` in JSON.

Changing " into " allows you to go to the next " without worrying about the contents before. The next " is the ending quote. Then you can resolve all the internals lazily... saving on processing time compared to JSON. For example "\"\\\"" will go through many branches and conditions. ""\"" is a simple jump over to the next " character.


You know, I know we're just geeking out here so please don't read too much into me taking the devil's advocate position, but: yes, XML is harder to parse, but the underlying data that gets encoded in JSON and XML isn't substantially different.

In both cases --- and let's take the C implementation case --- you're still building a poorly specified buggy implementation of Tcl to actually hold the data and answer questions about it.

That's the point I'm trying to make.


Dunno why you got downmodded there..

Anyways, if they're equivalent, and JSON is both easier to parse and has a far higher data density.. why would anyone ever use XML for anything?


Three-legged races are fun.


> why would anyone ever use XML for anything?

I ask myself that every day.

XML is a huge scam perpetrated on the software industry, but now it is too late because a huge parasitic 'industry' has built up around it, and too many people (especially too many PHBs) have invested their reputations in XML being the ultimate standard for representing data.


Can you explain what is lacking in JSON? What is poorly specified or buggy?


Well, in XML you're prone to files like (let's see how news.yc renders this):

  <thingies>
    <thingy>
      <name>thingy1</name>
    </thingy>
    <thingy>
    ...
  </thingies>

I've seen that a billion times, if you're doing DOM, you have to pull that whole thing into memory and then run over it again. Most JSON-based storage systems I've seen recently are more record based so you can stream it through.

So I guess my beef on that one isn't specifically with XML, you could split the above snippet into separate entities.. but I'll note that it still involves about 90% markup and 10% data. Not exactly the most efficient thing possible.


I have to parse this whole thing into memory and then pull out the part I want?

You do that with JSON too, don't you? Are there SAX-like parsers for JSON?

If it's a big file it'll be big in JSON too.


If the file is big, chances are that it is basically a sequence of relatively small chunks. You don't need SAX parsing for this. Just read the file one chunk at a time.
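For instance (a sketch; it assumes the big file is a sequence of newline-delimited JSON records, which the parent comment doesn't specify, and the file name is made up):

  import json

  with open('records.jsonl') as f:
      for line in f:
          record = json.loads(line)    # one small chunk at a time
          # ... process record; the whole file is never held in memory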


There are a few event-based JSON parsers; yajl is the most famous, though its interpretation of the JSON spec is a bit liberal for my tastes.


The point here (in the orig article) was that in some cases you do (i.e. if all you have is a SAX parser - Obj-C iPhone SDK being singled out as an example where only a SAX parser is supplied by the SDK)


You still have to parse whatever your parser gives you into something you can actually use.


He nailed it:

XML should be used for markup, JSON or YAML for structured data.


Isn't YAML a subset of JSON? or vice versa?


YAML has a fuckton of sugar, where JSON has as little as possible, but the biggest differentiator is that YAML has references: you can express cyclic graphs. That's why it's a natural fit for fixtures and seed data in Rails: it natively handles relational data.


My understanding is that YAML is a superset of JSON.


I can't really understand all the XML-bashing. Of course it isn't the right tool for every job, but I like it for its schema and validation features.


Mostly it is a reaction to the overuse of XML, and to how and why in many situations using something other than XML is beneficial. If a large swath of people start using JSON without thinking about it, you will probably start seeing a similar reaction extolling the virtues of another format over JSON.


Yes, those are good features.

The trouble was that it was the all-purpose data exchange tool for a long time, and everyone has seen XML put to all sorts of ungodly purposes. If it were invented today, nobody would be bashing it, because it would be less widely used -- the few people who did decide to use it would be putting it to good use, so they would love it. Everyone else would be using something else that was simpler.


The argument that XML can have various different structures for storing a person's name, while JSON provides one simple solution, doesn't fly. You could run into something like { "Person": { "property": { "type": "first-name", "value": "John" }, "property": { "type": "last-name", "value": "Smith" } } }

This begs the question, "Why would you do something that convoluted?" Well, you can ask the same of the XML examples, and the answer probably boils down to requirements (or incompetence?).


My gut feeling on this is that almost anyone can recognize that writing JSON like that is wrong. However, trying to determine if

  <person first-name="John" last-name="Smith" />
is better than

  <person>
      <first-name>John</first-name>
      <last-name>Smith</last-name>
  </person>
is often very difficult. In the wild, I've seen both strategies used depending on the situation.


Actually it's very simple: if you don't need children, don't make them. I measured it once, and storing stuff in attributes made parsing ~10 times faster in MSXML.


Please don't write "begs the question" when you mean "raises the question". http://begthequestion.info/


But if we know what he meant why does this matter? Language exists to convey meaning, and I think we all understood what he meant, so I don't see the problem here.


The problem lies in dilution: if people recognize a new meaning in an old phrase, it becomes more difficult to convey the old meaning. (Because you can't use that phrase any more.)

There is also the case where you think you have conveyed the new meaning while it hasn't actually caught on yet (meaning you made a mistake). So better to stay safe and stick to the old meaning while we can.


The dilution has already happened, and this cause is pretty much lost.

English is my second language. I've known the "new" meaning since I was a kid.

I've to date maybe seen the "old" meaning used a handful of times other than in examples given by people trying to correct someone using the new meaning.

Outside of academia I'd be surprised to see it at all. I suspect it would be confusing to more people than would recognize it.


The dilution has already happened, and this cause is pretty much lost.

Don't give up too easily.

Outside of academia I'd be surprised to see it at all.

That's because it's originally an academic term for a logical fallacy. This is like the abuse of 'eigenvalues' by all kinds of crackpots, and we should never stop fighting that kind of language abuse. We can't keep inventing new terms just because others have hijacked the previous one.


OK for this particular cause. I was making more of a general point: trying to slow down the rate of change in languages. The slower the change, the less confusing the language.


"I know this sounds like a semantic quibble, but words mean things." -- unknown


"But if we know what he meant why does this matter? Language exists to convey meaning, and I think we all understood what he meant, so I don't see the problem here."

I often don't know what the user meant. It makes for extra work to stop and think what possible meaning was intended. Same thing when people use "less" for "fewer". I have to think, did they mean "lesser" or "fewer"?


I respect the difference, but I'm afraid there's not much point in trying to educate people. This has already been an utterly lost cause for decades (at least).


Whoops. You're right. I'm embarrassed because I do know the difference.


The Oxford Dictionary lists "raises the question" as a valid interpretation of the phrase.


Or library defaults. XStream, for instance, will by default use subtags rather than attributes for primitive properties.


I prefer JSON over XML for most applications, but one advantage of XML is that its strict structure aids parsers. With XML it's easy to see if you've closed all of your tags, etc.; it's all about the parser.

Conversions between XML and JSON can be a challenge (defining namespaces, etc.). The Google Data API handles this very well. Check out their nice side-by-side example: http://code.google.com/apis/gdata/docs/json.html


Well JSON also lets you get around browsers' same-domain policy without having to set up a proxy, which I think might be more important.


If you are talking about JSONP: JSON is by no means required to do that. You could in theory easily pass a string containing XML to the callback function.


Yes, good point.


There are so many cool things you can do with JSON. I just created a boolean logical statement builder on a web page, using free-form open and close parentheses and and/or radio buttons. A little regex to change the parentheses into square brackets, then an eval(), and I can walk the whole statement recursively.


eval() strikes me as a dangerous thing, security-wise.


How? I'm evaluating strings that are built with my javascript code or from my server code, not arbitrary user input from other users. Yes, the user is able to enter parentheses into a textbox, and those become part of the evaluated string, but I regex replace out everything but the actual parentheses.


Every security assessor's favorite answer to a threat: "but I regex out everything unsafe".


How is building an intermediate portion of a nested logical statement using eval() dangerous?

Here is the regex for what I allow (only '[' & ']'):

  value.replace(/[^(]/g, '').replace(/\(/g, '[')
  value.replace(/[^)]/g, '').replace(/\)/g, '],')
I guess the point you all are trying to make is that some javascript text could have been maliciously inserted into the page somehow, and accidentally get eval'd simply because eval is in use. But the page below says that one of the only times eval should be used is to build up complex mathematical expressions. Is there a safe way to build up such expressions? Send it to the server, and invoke a JS engine there? The reason I made my original comment was because I found the eval function to be so helpful in this scenario, b/c I didn't need to use any type of syntax parsing.

http://blogs.msdn.com/ericlippert/archive/2003/11/01/53329.a...


If you are confident about charsets and you whitelist down to known-good characters ([A-Za-z0-9_ \t]) I have nothing snarky to say about the design. Otherwise, try reading this very short thread:

http://www.webappsec.org/lists/websecurity/archive/2010-03/m...

(It's not exactly your problem but you'll get the flavor.)


Yes, but the security of that part of your system could be pretty implicit, hence less than robust in the face of maintenance coding. The idea that "this is safe because everything but the parens are filtered out" won't necessarily jump out of the code at whoever maintains it.

If the code is structured to reveal this intention explicitly, then job well done.

(Note: It's not always a good idea to rest the future security of your system on a comment in the code!)


Rather than "regex replace out" problematic elements, you should immediately toss the data back to the user for correction. Trying to "correct" a potential hacker's string is often a losing proposition.


It seems to me like it's only dangerous in the context of XSS or bad server-side validation. With Firebug I can run any arbitrary JS on any website I like, even change what's already there. But that just affects me, unless the site uses what I can modify on the server and doesn't validate it there.


Isn't it dangerous to eval()? Suppose someone put some malicious code in the JSON data?


The question to ask is: is the JSON coming from a trusted source (e.g. generated by your server, and you can be reasonably sure that you have no funny XSS holes)? If the answer is yes, eval()-ing JSON is perfectly safe. If the answer is no, you need to parse it without the help of the JS interpreter.
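The same distinction holds outside Javascript; for instance, in Python (a sketch) parsing and evaluating are separate operations:

  import json

  untrusted = '{"user": "john", "admin": false}'
  print(json.loads(untrusted))   # parses data, executes nothing

  # eval(untrusted_text) would happily run whatever code it is handed, which
  # is why untrusted input should go through a parser, never the evaluator.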


And you should probably err on the safe side.



jQuery includes a proper parser


I think it was even being incorporated in browsers natively for maximum performance. It might be available already.



Where?


It's easy, it works.


The argument is always the same, and the JSON crowd always assert "superiority" derived from sheer simplicity. XML is too hard for them, and they're part of some bizarre quasi-political movement that trashes object-oriented principles. Thus 'simple' trumps validation via types, version control (and hence interoperability over time without breakage and maintenance), and robustness.

The weakness in the Javascript "eco-system" is its quasi-support of objects (the 'quasi' qualifier applies frequently in this eco-system). XSD/XML is powerful when used in an object paradigm, and object paradigms have proven (not just via community claims) to be very effective.

The question is, when will the JS community step up to the plate (i.e. mature)? So much energy is wasted now on making JSON work -- just in order to make Javascript easy. Wrong priorities.



