Hjson, the Human JSON

simonw · on April 14, 2016

JSON plus comments and python-style multi-line strings is great.

The thing where you can leave quotes off strings makes me nervous, especially the example where the value is HTML with its own embedded double quotes for attribute values.

Not requiring quotes on strings like that looks like an obvious vector for injection attacks. I guess Hjson isn't designed to be generated automatically, but I'd prefer a format that is easy to generate safely.

What I really want is JSON plus comments plus multi-line strings plus relaxed rules on trailing commas... While maintaining as simple and unambiguous a parsing model as possible.

david-given · on April 14, 2016

If trailing commas and quotation marks are optional, then does this:

    foo: 4,

...produce the value 4, "4", or "4,"?

alanh · on April 14, 2016

Excellent example.

While I have been hoping for "JSON plus comments" to be a real and common thing for quite a while now, one of the strengths of JSON is right there on json.org. See it? A set of five simple syntax diagrams that entirely and virtually unambiguously define the json syntax.

It’s tough to know when to stop when simplifying a syntax. (For an extreme example, see Stylus, which was like Sass but so extreme that mixins and properties became ambiguous for each other.) I, too, would like to see the return of quotes for string values, for increased clarity.

PuerkitoBio · on April 14, 2016

From the docs:

  > When you omit quotes the string ends at the newline.
  > Preceding and trailing whitespace is ignored as are escapes.
  >
  > A value that is a number, true, false or null in JSON is 
  > parsed as a value. E.g. 3 is a valid number while 3 times is 
  > a string.

(edit: formatting)
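
A minimal Python sketch of the rule quoted above, for anyone who wants to poke at it. It implements only those two quoted sentences and is not the Hjson reference parser. The name `parse_quoteless` is made up for illustration, and Python's `json` module stands in for "a value in JSON". The quote does not say what a real Hjson parser does with a trailing comma, so this sketch simply keeps it as part of the string.

```python
import json


def _reject_constant(name):
    # Python's json module accepts NaN/Infinity by default; plain JSON does not.
    raise ValueError(name)


def parse_quoteless(raw):
    """Interpret one unquoted value (hypothetical helper, not an Hjson API)."""
    text = raw.strip()  # preceding and trailing whitespace is ignored
    try:
        value = json.loads(text, parse_constant=_reject_constant)
    except ValueError:
        return text  # not a JSON value, so the whole line is a string
    # Only numbers, true, false and null are "parsed as a value".
    if value is None or isinstance(value, (bool, int, float)):
        return value
    return text


print(repr(parse_quoteless("3")))        # a number
print(repr(parse_quoteless("3 times")))  # a string
print(repr(parse_quoteless("True")))     # a string, because JSON only knows lowercase true
print(repr(parse_quoteless("4,")))       # a string under this literal reading of the quote
```

Under this literal reading, case matters for the keywords and any extra character on the line turns a number into a string.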

lumpypua · on April 14, 2016

Right but the point is that you had to consult the docs. It's a problem of ambiguity on the part of the format.

PuerkitoBio · on April 14, 2016

It is ambiguous only if you think it can be a completely useless format. If you decide to use it, chances are you considered it useful, and the only useful way this can parse is as a number, otherwise it wouldn't be able to represent numbers, booleans and null. I'd say it parses pretty much as expected, and as such, contrary to other comments, you don't have to go back to the docs every so often.

chris-at · on April 14, 2016

Of course you have to get the information from somewhere the first time you learn something. What's the problem with that?

dandelany · on April 14, 2016

The problem is that this is supposed to be a more-easily-readable form of JSON, but it requires consulting the docs to understand the meaning of something which is completely unambiguous in regular JSON.

ssalazar · on April 14, 2016

> Of course you have to get the information from somewhere the first time you learn something. What's the problem with that?

The first time, and the second time if you haven't looked at it in a month, and a third time a month after that. And so on.

bobbyi_settv · on April 14, 2016

So if you put "true" (without quotes) as a value, you get a boolean, but if you put "True", you get a string?

cortesoft · on April 14, 2016

So would that be "4," that you end up with, then?

jcoffland · on April 14, 2016

I came here to make the same comment. The ability to leave off quotes on strings is a misfeature which overly complicates the language. Also I see no need for three different comment styles.

PhiLho · on April 15, 2016

Yes! There is the same ambiguity in Yaml, so syntax highlighters of the format don't agree where a string ends...

The site even has an ambiguous example: "three: 4 # oops". Well, is that "4 # oops" or 4? I saw no rule about ending comments, and they say that "three: 3 times" is a string.

I have seen no formal specification of the grammar, so we already have lot of ambiguity. Good luck to implementers...

Seriously, removing the "no quotes needed" rule would improve greatly the format. If you want to include HTML with double quotes literally, just use the multiline string format and be done.

nix0n · on April 14, 2016

Use what you want, ignore the rest. Like C++.

creshal · on April 14, 2016

That might work for programming languages (C++'s bad rep notwithstanding), but for data exchange formats you cannot cherrypick features, you have to support the whole spec, and nothing else.

kbenson · on April 14, 2016

I think it's more nuanced than that. You must be strict in what you emit, but can be liberal in what you accept. That liberalness can go too far though, and should not make parsing brittle, or encourage misuse of the spec. It's there more to allow someone's unambiguous mistakes to still parse.

- Accidentally left in a final comma on a list? That's okay, that only means one thing, we understand.

- Allow non-quoted keys on objects? Well, we understand JavaScript generally allows this, so we'll let it slide. This time.

- Make newline significant and define new items? Okay, are we just ignoring space efficient payloads now? Should making it space efficient mean changing formats from Hjson to json?

- Considering all terms in place of an object value a string until a newline? Are you just trolling me now? How is that more human readable? Does your spoken language not use quotes to distinguish distinct chunks of communication or something, and if so, does it use a Latin alphabet so it's off-putting when you see them?

Needless to say, I'm really confused by the reason this even exists.

ZenPsycho · on April 14, 2016

I think the problem it is solving is that JSON is designed and best used as a data exchange format, but it also gets used for configuration files, which it does okay with but is not really so good. INI files don't have a clear standard. YAML is too complicated, and using turing complete javascript for configuration seems like you've just gone too far.

we just need JSON, but with a couple things fixed up to make it nicer to use for configuration files.

kbenson · on April 14, 2016

> we just need JSON, but with a couple things fixed up to make it nicer to use for configuration files.

Using JSON for configuration is just the whole situation of using XML for data exchange redux. One of the major points for JSON over XML for data exchange was that it was so much better because it was optimized for data, not markup. Why are we ignoring this argument now that JSON is on the other side? JSON is used for configuration because it's ubiquitous, not because it fits the problem domain well. Let's just choose a more appropriate format.

Choosing the most common set of rules for INI files (what is proposed by Wikipedia[1] is probably sufficient) would serve us MUCH better than coaxing a data interchange format into that role.

1: https://en.wikipedia.org/wiki/INI_file

lomnakkus · on April 15, 2016

If you're using a strongly typed language, XML even has one (IMO) massive advantage over JSON: You can use XSD to define a schema declaratively. This means you get

(1) lots of general tooling support, in particular you get at least decent editor support for your config file, and

(2) you can autogenerate the code needed to read your configuration into structured data without having to do any unnecessary duplication of "key names" (tags) as you have to with e.g. INI or JSON. You also get the data structures themselves for "free" (based on the XSD).

Alright, it's not the end of the world to not have these things, but they're both very nice to have.

ZenPsycho · on April 15, 2016

have you seen http://json-schema.org ?

lomnakkus · on April 15, 2016

Yes indeed I have, and I have several problems with it, one of which is "Expires: August 3, 2013" with no new version in sight. My other major problem is that AFAICT it doesn't support one of my pet favorite features namely "Algebraic Data Types"[0, 1]. If you extend json-schema like Swagger has done it might support ADTs properly AFAICT, but I don't have any actual practical experience with Swagger.

[0] https://en.wikipedia.org/wiki/Algebraic_data_type

[1] I should note that support for ADTs in XSD is sometimes sketchy in the various code generators, but at least the XSD specification supports it. (It could also be argued that XSD supports something even more general... which it probably shouldn't since ADTs basically cover the whole data structure space unless you go to higher kinds, inheritance and such.)

ZenPsycho · on April 15, 2016

well basically virtually anything other than JSON that's actually designed to be configuration would be better for configuration. There's dozens of them. that is the problem: JSON parsers and generators are ubiquitous in the way that no single configuration format is. Right now, I can just use json in any language and get data between any language and any other language. If I use it for config I get the advantage of even being able to config across multiple languages that might be getting used in a single system if I have to. No proper configuration format has that level of mindshare and interoperation.

thenonameguy · on April 15, 2016

Something like edn[1] perhaps? Worked really well in my projects.

[1]: https://github.com/edn-format/edn

daurnimator · on April 15, 2016

https://tools.ietf.org/html/draft-thomson-postel-was-wrong-0...

kbenson · on April 15, 2016

While I agree on some points, I do not agree in general. I think too much prescriptivism in protocol implementation is a naive approach, and assumes that we can always get things right initially. Sometimes real-world concerns and needs drive changes, not just sloppiness.

talles · on April 14, 2016

I prefer to openly discuss the features while giving the authors feedback instead of just ignoring what I don't want, but that's just me.

bpicolo · on April 14, 2016

Features that cause ambiguity make things harder to do right.

d33 · on April 14, 2016

Well, the first thing I thought of is what a nightmare it would be to safely implement a parser for this in C. I filed a Github issue for this one: https://github.com/laktak/hjson/issues/37

jaquers · on April 14, 2016

Have you looked at JSON5? http://json5.org/

paulannesley · on April 15, 2016

JSON5 looks really good. I'm not sure about this one part though, from a point of view of simplicity and interoperability when deserializing the data:

> Numbers can include Infinity, -Infinity, NaN, and -NaN.

echlebek · on April 15, 2016

This is a good thing. It means there is a 1:1 mapping between numbers in JSON5 and IEEE floats.
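
To make the interoperability worry concrete, here is a small Python sketch. Python's standard `json` module is used only as an example of a JSON library (it is not a JSON5 implementation): by default it writes these non-standard tokens, and it refuses to when asked for strict output.

```python
import json
import math

values = [math.inf, -math.inf, math.nan]

# Default behaviour: Python emits tokens that are not part of the JSON grammar.
lenient = json.dumps(values)
print(lenient)

# Strict behaviour: allow_nan=False refuses to serialise these values.
try:
    json.dumps(values, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc
print("strict mode raised:", strict_error)

# Reading the lenient output back only works in parsers that share the extension.
round_tripped = json.loads(lenient)
```

So the 1:1 mapping to IEEE floats is real, but every consumer has to agree on the extra tokens.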

novaleaf · on April 14, 2016

i've used json5 since the beginning. it's great! just a couple of weeks ago they merged in descriptive error messages too. No more wondering where the bug is!

zyxley · on April 14, 2016

Strings without quotes leads to all kinds of trouble in YAML. You end up just quoting everything anyway the first time you need to use "false" or "[some words here]" as a string.

crdoconnor · on April 14, 2016

This ought to be fixed by making YAML explicitly rather than implicitly typed.

E.g.

   x: yes

   >>> str(yaml['x'])
   yes

   >>> bool(yaml['x'])
   True
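
A rough Python sketch of what such explicit typing could look like. This is not how any existing YAML library behaves; `Scalar` and `load_flat` are hypothetical names, the loader handles only flat "key: value" lines, and the set of boolean words is an assumption.

```python
class Scalar(str):
    """A value that stays a string until the caller asks for another type."""

    _TRUE = {"yes", "true", "on"}    # assumed boolean words, for illustration
    _FALSE = {"no", "false", "off"}

    def __bool__(self):
        word = self.lower()
        if word in self._TRUE:
            return True
        if word in self._FALSE:
            return False
        raise ValueError(f"{str(self)!r} is not a boolean word")


def load_flat(text):
    """Parse flat 'key: value' lines; every value is kept as a Scalar."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        result[key.strip()] = Scalar(value.strip())
    return result


config = load_flat("x: yes\nretries: 3")
print(str(config["x"]))        # the text as written
print(bool(config["x"]))       # converted only because the caller asked
print(int(config["retries"]))  # same idea for numbers
```

One caveat with overriding `__bool__` this way: any plain truthiness check on a non-boolean value, such as `if config["retries"]:`, raises instead of silently guessing.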

kbenson · on April 14, 2016

> The thing where you can leave quotes off strings makes me nervous, especially the example where the value is HTML with its own embedded double quotes for attribute values.

Learn from Perl. The quote operator[1] is your friend (and I frequently lament its omission in Bash). You could simplify it by not using the matching enclosures ({ and }, [ and ], etc). It's easy to parse, and if you keep the quoting character somewhat rare, it's not hard to read.

E.g.

    {
      "string" : "A string without inner quotes",
      "quotes1" : q!A string "with" inner quotes!,
      "quotes2" : q|A string "with" inner quotes|,
      "quotes3" : q@A string "with" inner quotes@,
      "quotes4" : qTA string "with" inner quotesT,
    }

Edit: To be clear, I wish JavaScript had a quote operator, and JSON started with it. :/

1: http://perldoc.perl.org/perlop.html#Quote-and-Quote-like-Ope...
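
On "easy to parse", a short Python sketch of the simplified form (no matching enclosures, no escapes). `read_q_literal` is a made-up name, and this is neither Perl's actual behaviour nor part of any JSON dialect.

```python
def read_q_literal(source, start=0):
    """Read q<delim>text<delim> at `start`; return (text, index after literal).

    Hypothetical helper: the character right after `q` is the delimiter and
    the literal runs, unescaped, until that character is seen again.
    """
    if len(source) < start + 2 or source[start] != "q":
        raise ValueError("expected 'q' followed by a delimiter")
    delimiter = source[start + 1]
    end = source.find(delimiter, start + 2)
    if end == -1:
        raise ValueError(f"unterminated literal, missing closing {delimiter!r}")
    return source[start + 2:end], end + 1


line = 'q!A string "with" inner quotes!,'
text, next_index = read_q_literal(line)
print(text)              # the inner double quotes need no escaping
print(line[next_index])  # parsing resumes at the comma
```

The whole rule is one `find` call, and a delimiter that also appears inside the text shows up as an early end of the literal rather than as silent corruption.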

fredley · on April 14, 2016

I haven't used Perl in quite some time, and this sort of thing is why it was bad. Quotes are quotes are quotes in almost every language. It's completely unambiguous, the downside is that you sometimes need to escape them.

This, on the other hand, is a 'solution' to escaping quotes that is completely mad. Using non-standard quotes, especially mixing and matching them is a disaster for readability and maintainability (using a T in your string now? need to change the quotes!). Triple quotes are just fine if you want to avoid escapes, and hjson seems to support them.

__david__ · on April 15, 2016

> This, on the other hand, is a 'solution' to escaping quotes that is completely mad.

Meh, Perl's solution is fine. You can throw up your hands and say it's crazy, but as a person who worked with Perl for 20 years, I've never had the problem you describe. I tend not to use the qq() or q() style quotes, but I've used s@@@ and s,,, so many times I can't count. It's really quite nice (and perfectly readable unless you do something weird like 'sxxx').

kbenson · on April 14, 2016

> Quotes are quotes are quotes in almost every language. It's completely unambiguous

Oh, like in C and C++ where single quotes denote a char, and double quotes a string?

Or in Perl, PHP and Ruby where double quotes interpolate, and single quotes don't?

Or in JavaScript and Python, where there's no functional difference between single and double quotes?

Or C#'s string literals which only support double quotes, but you prefix the string with @ to denote it's verbatim?

Or systems that allow repeated double quotes within a double quote string literal to stand in for an escape ("foo ""bar"" baz"), as many SQL systems do?

Or what about systems that interpolate, and the differences between what they do and do not interpolate? Variables? Escape characters? Hexadecimal escapes?

You're fooling yourself if you think it unambiguous in anything except for the language you are dealing with, and if you're within that language, who cares what you use as long as it's consistent? You learn it, and then it's unambiguous (if implemented well).

This is no different than if your language supports hex numbers (usually done through prefixing it with 0x). Those are two different ways to specify the exact same thing (a binary number!). The benefit comes from using it in the circumstance where it's appropriate. That is, where it enhances readability, not where it detracts from it.

> This, on the other hand, is a 'solution' to escaping quotes that is completely mad. Using non-standard quotes, especially mixing and matching them is a disaster for readability and maintainability

Maybe you think

    "{\"foo\":\"bar\",\"baz\":\"it's that's they're we're\"}"

is fine, or you may prefer

    '{"foo":"bar","baz":"it\'s that\'s they\'re we\'re"}'

(if your language supports it).

I prefer

    q|{"foo":"bar","baz":"it's that's they're we're"}|

because I think it's clearer, and learning once that a literal q defines a new quote operator that is in effect until it's next seen is simple, easy to remember, and yields very useful readability gains.

> using a T in your string now? need to change the quotes!

I included the qTT example just to show how it worked, not to endorse its use. I thought that would have been obvious from my statement "You could simplify it by not using the matching enclosures ({ and }, [ and ], etc). It's easy to parse. and if you keep the quoting character somewhat rare, it's not hard to read."

In any case, I fail to see how that's a problem beyond any other quote character. Including that character in your string will result in a compile time error in all but the most esoteric of cases, making it easy to find.

__david__ · on April 15, 2016

And don't forget Tcl where { and } are actually quotes.

draegtun · on April 15, 2016

And similar with Rebol / Red where you have two distinct quoting literals...

  "string with no newlines"

  {A multi-line string
  that can run over many lines
  and {even be} nested}

For unbalanced {} string then you would need to escape it...

  {escape closing brace ^}  &  opening brace ^{ is all you need}

aleem · on April 15, 2016

Do check out http://json5.org/, based off the original JSON author Doug Crockford's own proposed parser extensions, primarily trailing commas and comments, both of which have been a point of contention pretty much since the day JSON landed. It's a much simpler and saner proposal.

Trailing commas and JSON comments are already supported in the newer browsers (try the Chrome console for instance).

Fortunately quoteless strings or optional-commas/newline-separator as proposed in Hjson will never fly. They are brittle and ambiguous. Who knows what this will get parsed as:

    {
        a: hello's and hi's have
            'misplaced' apostrophes
        b: ball: a round # and # bouncy object
        c: cakes and
            candy: both have sugar
            # but how do I include a hash at the start of a multiline-unquoted string?
    }

masklinn · on April 15, 2016

> Trailing commas and JSON comments are already supported in the newer browsers (try the Chrome console for instance).

Version 49.0.2623.112 (64-bit)

    > JSON.parse('{"foo": "bar",}')
    > VM124:1 Uncaught SyntaxError: Unexpected token }

Javascript object literals != JSON. JSON is a restricted subset of JS object literals (and not actually a strict subset: a JSON string can contain unescaped U+2028 "LINE SEPARATOR" and U+2029 "PARAGRAPH SEPARATOR" codepoints, a Javascript string can not)
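
The same checks can be made outside the browser. A small Python sketch, with Python's standard `json` module standing in for a strict JSON parser (`is_strict_json` is a made-up helper name):

```python
import json


def is_strict_json(text):
    """Return True if a strict JSON parser accepts the text (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_strict_json('{"foo": "bar"}'))          # plain JSON
print(is_strict_json('{"foo": "bar",}'))         # trailing comma
print(is_strict_json('{"foo": "bar"} // note'))  # comment
print(is_strict_json('"\u2028"'))                # raw U+2028 inside a JSON string
```

The first and last are accepted, the two relaxed forms are rejected, which matches the console output above.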

mamikonyana · on April 14, 2016

how about yaml?

http://www.yaml.org/start.html

alanh · on April 14, 2016

YAML is more complex than most people tend to realize. (This was brought up in a 2011 discussion about possibly standardizing a metadata section for Markdown documents which sadly went nowhere. [1])

Take a look at example 2.11 in the YAML spec [2], for example, and see if you can make heads or tails of it.

[1]: https://pairlist6.pair.net/pipermail/markdown-discuss/2011-A...

[2]: http://www.yaml.org/spec/1.2/spec.html#id2760395

detaro · on April 14, 2016

pyYAML has this collection of problems with the spec: http://pyyaml.org/wiki/BugsInTheYAMLSpecification

(and pyYAML itself can't always parse its own output correctly...)

crdoconnor · on April 14, 2016

You don't need most of those features. A pared down YAML with the cruft removed (implicit typing, flow style, tag tokens, node anchor & references) is actually pretty simple as well as less "gotcha-y".

__david__ · on April 15, 2016

Sure, but most language YAML parsers support all or most of the spec. That can be a problem if you aren't expecting it.

alanh · on April 15, 2016

I believe it has even created security issues. Didn’t Rails have at least one YAML-based vuln?

rurban · on April 17, 2016

You need to restrict YAML to SecureLoad, manually adding allowed types and classes.

At least perl doesn't support this, so it's inherently insecure there, but you can always use YAML::Syck which didn't go this way.

TranquilMarmot · on April 14, 2016

From hjson.org:

"YAML expresses structure through whitespace. Significant whitespace is a common source of mistakes that we shouldn't have to deal with."

jholman · on April 14, 2016

Okay, does anyone actually believe that causes a problem?

I can think of ONE time when that causes a problem, and that's with indentation with multi-line strings. Oh look, HJSON included that feature. That's like throwing the baby out and keeping all the bathwater.

gboone42 · on April 14, 2016

I can't find a specific example off the top of my head but I'll say I've been managing a Jekyll site for a while now and whitespace errors in frontmatter and data files cause all kinds of problems. I'm not sure I could explain the details but it's a legit criticism of YAML. IMO part of the problem is that YAML looks very straightforward and is until it suddenly isn't. Whitespace is part of that problem.

kaishiro · on April 15, 2016

As an anecdote on the flip side, I've been building Middleman sites for a while now and can't remember ever having an issue with whitespace in the front matter or local data.

BobTheCoder · on April 15, 2016

Yeah I'm with you. Developers should have no problem dealing with whitespace and the result is you get an easier-to-read format.

Although admittedly I haven't had to work with YAML a lot but I have liked it when I've touched it.

bpicolo · on April 14, 2016

yaml has any number of ambiguous cases

andrei_says_ · on April 14, 2016

Could you provide some examples?

bpicolo · on April 15, 2016

Unquoted strings are valid in yaml just like this format. There are at least 2 ways to specify a list of things. There are some super bizarre looking possible formats for lists of mapping types.

There are a number of others given the length of the spec. Yaml is a complicated beast that generally has more than one way to do any given thing

JoshTriplett · on April 14, 2016

I like that you can leave the quotes off of keys, which should always parse as identifiers. (And those that don't should require quotes.) Leaving the quotes off values seems like the problem.

rymohr · on April 14, 2016

Take a look at HONEY: https://github.com/honey/honey

Primary goals were to remove as much syntax as possible and make it play well with line-based diffs (with the hopes that someone who knows nothing about the language could resolve conflicts without getting tripped up by surrounding quotes, trailing comments, etc).

lomnakkus · on April 15, 2016

Unfortunately, conflicts in white-space based languages can get even worse than regular conflicts because you have very few visual structural "anchors" to start to gain an understanding of the conflict. (If you have to resolve manually, that is.)

Granted, if the number of conflicts which cannot be automatically resolved is reduced by enough, then it might not matter in the grand scheme of things. However, I'd be worried that this would make "accidental" automatic resolution of semantic conflicts more common. That may be an unfounded/irrational fear, I don't know.

joeld42 · on April 14, 2016

- Someone saves data as text with a simple format

- It works great, lots of people start using it

- People start adding features to fix annoying things with the format, add support for binary data, comments, schemas, add more metadata etc..

- Many versions proliferate, people start writing converters and verifiers

- A standards committee is formed and writes an 800 page spec and 80kloc reference implementation

- Eighteen different libraries wrap or reimplement the reference implementation

- Someone gets fed up with this nonsense and converts their app to save their data in a new simple text format.

- The circle of life continues.

I love this idea and wish json had comments, too, but if you start hitting the point where JSON is not expressive or fluid enough, that's a hint that it's probably not the right thing for what you're doing. This variant puts a lot of work into human-friendly json, but if you're doing a lot of hand-editing of a file, it should probably not be JSON.

streblo · on April 14, 2016

Obligatory: https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaG...

cwyers · on April 15, 2016

> Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.

And now you can't roundtrip the comments if for some reason your JSON parser needs to change something.
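
A small Python sketch of that failure mode. The comment stripper here is a deliberately naive stand-in for JSMin (it drops only whole-line `//` comments), and `strip_line_comments` is a made-up name.

```python
import json


def strip_line_comments(text):
    """Drop lines that contain only a // comment (naive stand-in for JSMin)."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("//")]
    return "\n".join(kept)


annotated = '{\n  // port the server listens on\n  "port": 8080\n}'

# Strip, parse, change one value, write the file back out.
config = json.loads(strip_line_comments(annotated))
config["port"] = 9090
rewritten = json.dumps(config, indent=2)

print(rewritten)  # the annotation is gone from the rewritten file
```

Nothing in the parsed data remembers where the comment was, so any tool that rewrites the file has to drop it.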

dclowd9901 · on April 15, 2016

I would prefer not to inspect json for comments. If something is ambiguous, you should have a model definition sent along with everything else.

Animats · on April 14, 2016

Agreed. Do not want "relaxed JSON". There are editors for valid JSON. Use one of those.[1]

[1] http://www.cleancss.com/json-editor/

bryanlarsen · on April 14, 2016

This spec repeats one of the problems with using YAML as a configuration spec. To quote: "if your key includes a JSON control character like {}[],: or space, use quotes" and "if your string starts with { or [, use quotes".

JSON and YAML are interchange formats, not configuration formats. Rather than hacking up an interchange format, it's probably better to use something designed for configuration formats, like TOML.
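
Read literally, the quoted rules amount to something like the following Python sketch. The function names are made up, and the character set is taken straight from the quote, which says "like" and so may not be exhaustive.

```python
# Characters named in the quoted rule; the real list may be longer.
CONTROL_CHARS = set("{}[],: ")


def key_needs_quotes(key):
    """True if the key contains any character named in the quoted rule."""
    return any(ch in CONTROL_CHARS for ch in key)


def string_needs_quotes(value):
    """True if the string starts with an opening brace or bracket."""
    return value.startswith(("{", "["))


print(key_needs_quotes("port"))          # no special characters
print(key_needs_quotes("listen port"))   # contains a space
print(string_needs_quotes("[draft] v2")) # starts with [
```

Whoever hand-edits the file has to keep both checks in their head, which is the complaint.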

ddevault · on April 14, 2016

I think it's reasonable to call YAML a configuration format.

stormbeta · on April 15, 2016

JSON is an interchange format, but YAML is pretty obviously meant for configuration, given the huge emphasis on human readability and editing.

As for TOML, it's a good replacement for mostly-flat INI-style files but the syntax is really awkward for the kinds of places you'd normally use YAML, especially nesting lists/maps.

bryanlarsen · on April 15, 2016

Top line on yaml.org:

What It Is: YAML is a human friendly data serialization standard for all programming languages.

Zardoz84 · on April 15, 2016

or SDLang

JamilD · on April 14, 2016

Rigidity and consistency are not always bad things. They can help prevent bugs, security vulnerabilities, and they drastically reduce complexity of implementation.

JSON might often be too rigid, but I think it's important to note that "easier" (in that you don't need to learn the syntax) isn't always better.

IvanK_net · on April 14, 2016

This "easier" format is actually more "complex". JSON can be described with just a couple of simple rules (http://json.org/ ), while this format adds many new rules (e.g. for doing same things in multiple ways). The more rules you have, the more things you have to remember, the harder it is to find the reason of the problem.

BTW. it is very simple to do comments in JSON :) You can just add "comment1" : "This is my comment", to any object, it will be ignored by software that processes your file.

mort96 · on April 14, 2016

I kind of like doing comments like this:

    "#": " my comment"

ZenPsycho · on April 14, 2016

the problem with that one is that some parsers have a meltdown if they encounter duplicate keys. sigh
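
For what it's worth, here is how that plays out in Python's standard `json` module, used only as one example of a parser: by default the last duplicate silently wins, and a strict mode can be built with `object_pairs_hook`. `reject_duplicates` is a made-up name.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that fails on repeated keys (hypothetical helper)."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


doc = '{"#": "first comment", "a": 1, "#": "second comment", "b": 2}'

# Default: no error, but only the last "#" survives.
lenient = json.loads(doc)
print(lenient)

# Strict: the second "#" is reported.
try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    strict_error = None
except ValueError as exc:
    strict_error = exc
print("strict mode raised:", strict_error)
```

So the "#" trick works for one comment per object, and anything beyond that depends on which parser reads the file.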

abraae · on April 14, 2016

Ha ha, I do exactly the same thing. Lack of comments is really the only beef I have with Json (well, no multi line text too I suppose).

A new spec that just addressed those alone would be great.

kstenerud · on April 14, 2016

Yes. There are two distinct use cases here: configuration files, and interchange formats.

For an interchange format, JSON does the job very well. Small, simple, human readable, easy to implement.

For a configuration format, JSON leaves a lot to be desired. It's almost there, but has enough warts to be annoying.

You're not going to get a one-size-fits-all format.

collyw · on April 14, 2016

This sounds more like Python dictionaries. I don't see more bugs when I am using those over JSON, and I find them a lot easier to work with.

nikolay · on April 14, 2016

JSON5 [0] is better as unlike Hjson, it doesn't include non-ECMAScript syntax.

[0]: http://json5.org/

emodendroket · on April 14, 2016

And yet again no date format. Am I the only person who is ever inconvenienced by this? It seems like it's so obviously the most glaring flaw with JSON that I'm surprised nobody wants to fix it.

rhinoceraptor · on April 14, 2016

Just use ISO 8601?

emodendroket · on April 14, 2016

Sure, that would be fine. They could also use a totally wacky format, I don't care. The problem is that without a standard the JSON tools in different languages don't all agree on how to serialize dates and you end up needing to deserialize/serialize manually to smooth over the differences. The problem isn't that you can't represent dates; it's that there is no standard way to do it.

_kst_ · on April 14, 2016

> The problem isn't that you can't represent dates; it's that there is no standard way to do it.

ISO 8601?

emodendroket · on April 14, 2016

That's not "standard." That's one of many ways that people represent dates in JSON, because the JSON specification itself does not mandate that you use any particular representation. If you use it there is no guarantee it will be recognized as a date on the other end.

Kapow · on April 14, 2016

What black hole are you sending JSON into where the only way they could know something is a date is if you use a date type? Why can't that be part of the data structure you agreed upon in order to communicate in the first place?

eknkc · on April 14, 2016

Even if there is an agreed on schema, parsers can generate native Date objects on the recipient if there is a date type. When you deserialize a nested graph of objects, it's hard to convert each date to an actual date if what you get is just a string. Makes it a lot easier during integration.
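
A Python sketch of the manual work being described, using the standard `json` module. ISO 8601 is used only because it came up above; `DATE_KEYS`, `encode_extra` and `revive_dates` are hypothetical names, and the key list stands for whatever the two sides agreed on out of band.

```python
import json
from datetime import datetime, timezone


def encode_extra(value):
    # json.dumps calls this for objects it cannot serialise on its own.
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"cannot serialise {type(value).__name__}")


DATE_KEYS = {"created"}  # hypothetical: keys both sides agreed hold dates


def revive_dates(obj):
    # object_hook runs for every decoded object, including nested ones.
    for key in obj.keys() & DATE_KEYS:
        obj[key] = datetime.fromisoformat(obj[key])
    return obj


stamp = datetime(2016, 4, 14, 12, 30, tzinfo=timezone.utc)
encoded = json.dumps({"post": {"created": stamp}}, default=encode_extra)

plain = json.loads(encoded)                              # date arrives as a string
revived = json.loads(encoded, object_hook=revive_dates)  # date restored by hand

print(type(plain["post"]["created"]).__name__)
print(type(revived["post"]["created"]).__name__)
```

The parser cannot tell a date from a string that merely looks like one, so the conversion lives in hand-written hooks on both ends.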

emodendroket · on April 15, 2016

Why am I hand-writing code to supplement the parser? I mean, hell, we don't need JSON at all, why don't I just invent my own serialization format and write parsers for it in each environment I want to use it in.

duaneb · on April 15, 2016

Well, it'd need to cover dates, times with time zones, date times with time zones, times without time zones, date times without time zones. Seems better pegged to the data source without thinking it out clearly.

emodendroket · on April 15, 2016

So? It's not like this is a complex problem nobody knows how to solve.

duaneb · on April 15, 2016

Agreed! But it's not so simple to just add a "date" type; it would add significant complexity to a relatively simple text format.

emodendroket · on April 15, 2016

Yeah, I guess, but everyone ends up implementing it anyway; the problem is they all do it in an idiosyncratic way (and one that is ambiguous... after all, maybe I really wanted that to be a string).

eknkc · on April 14, 2016

That is a string. Not date.

Zardoz84 · on April 15, 2016

SDLang has support for date formats.

emodendroket · on April 15, 2016

Cool. Never used D and that's the first I've heard about it but I have respect for any system that handles this problem. :)

kstenerud · on April 14, 2016

Why do you need ECMAScript syntax? It's not like you're going to directly feed a json file into a JS interpreter. Well, not unless you have a death wish.

lucideer · on April 14, 2016

Because it's a syntax that needs no extra explanation.

It's not a superset, so no extra add-on features need extra doc, and it's less of a subset than JSON so the rules of what's disallowed are much simpler.

ninjakeyboard · on April 14, 2016

I feel like HOCON fills the space pretty well, and has implementations in most languages now. https://github.com/typesafehub/config

But I'm a scala developer so I might be biased.

twic · on April 14, 2016

I'm not a Scala developer, and I don't think I'd ever heard of HOCON. Has it seen much adoption outside the Scala community?

talles · on April 14, 2016

I have a (big) .NET project that uses multiple libraries one of them being Akka.NET which uses HOCON. Being honest, it felt completely awkward to use it in this respect (a little corner in the project that uses that different HOCON thing).

creshal · on April 14, 2016

What's the difference to / advantage over YAML? When I want a loose syntax "JSON with comments", I can just use that instead.

burke · on April 14, 2016

> YAML expresses structure through whitespace. Significant whitespace is a common source of mistakes that we shouldn't have to deal with.

> Both HOCON and YAML make the mistake of implementing too many features (like anchors, substitutions or concatenation).

eterm · on April 14, 2016

But this too has significant whitespace, acting as a comma in separating properties. (It's not clear whether you can mix commas and whitespace here.)

Also this claims to not need escapes, but it's also not clear how this format handles a comma or a newline in strings without escaping, do they act as a comma to separate properties or do they act as natural commas/newlines?

PuerkitoBio · on April 14, 2016

Only quoteless strings have no escapes, and according to the docs the rule is:

  > quoteless strings include everything up to the end of the
  > line, excluding trailing whitespace.

(edit: formatting)
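
For illustration, here is a rough Python sketch of that rule, combined with the docs' other rule that a bare number, true, false or null is parsed as a value rather than a string. It is only a reading of the quoted documentation, not Hjson's actual parser: the name `parse_quoteless` is made up, and the sketch deliberately does nothing special with trailing commas or comments, which a real implementation has to handle.

```python
import re

# JSON number grammar: optional minus, integer part, optional fraction/exponent.
_NUMBER = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?")
_KEYWORDS = {"true": True, "false": False, "null": None}


def parse_quoteless(text):
    """Interpret one unquoted value (hypothetical helper, not Hjson's API)."""
    value = text.strip()  # leading and trailing whitespace is ignored
    if value in _KEYWORDS:
        return _KEYWORDS[value]
    if _NUMBER.fullmatch(value):
        # Keep plain integers as int, everything else as float.
        return int(value) if re.fullmatch(r"-?\d+", value) else float(value)
    return value  # anything else is a string running to the end of the line


print(repr(parse_quoteless("3")))        # 3
print(repr(parse_quoteless("3 times")))  # '3 times'
print(repr(parse_quoteless("false")))    # False
print(repr(parse_quoteless("one,")))     # 'one,'
```

Under this literal reading the comma in `one,` stays part of the string, because nothing terminates a quoteless string before the end of the line.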

eterm · on April 14, 2016

So this

{

  foo:one,

  bar:two

}

Parses to "one," because it is a quoteless string?

What about true and false? Is false the boolean constant or an unquoted string "false"?

PuerkitoBio · on April 15, 2016

I posted this in another thread here, but it's documented in the linked page:

  > A value that is a number, true, false or null in JSON is parsed as a value.
  > E.g. '3' is a valid number while '3 times' is a string.

crdoconnor · on April 14, 2016

>YAML make the mistake of implementing too many features (like anchors, sustitutions)

That's why I remove those features.

Significant whitespace is a normal complaint for beginners in python too, but most people prefer it in the end.

k__ · on April 14, 2016

A point against YAML, that I always read, is its specification, too big etc. I don't know much about this issue...

I like how nice config files can look with YAML and JSON being a subset of it makes it even more convenient

seagreen · on April 14, 2016

YAML spec: http://yaml.org/spec/1.2/spec.html

JSON spec: http://www.ecma-international.org/publications/files/ECMA-ST...

If you haven't read the JSON spec and you use JSON I recommend doing so. It takes five minutes. My personal favorite line: "Because it is so simple, it is not expected that the JSON grammar will ever change."

k__ · on April 15, 2016

I did, long time ago.

Also, they're probably right. I mean JSON isn't pretty, but it's easy. Why should I use YAML for prettier config files, when the people looking into those files are all technical anyway?

creshal · on April 14, 2016

HJSON isn't exactly a nicely compact spec I'd want to use in data exchange either. For local configuration files and the likes, where you want comments etc., it doesn't really matter how big the spec is.

al2o3cr · on April 14, 2016

"Both HOCON and YAML make the mistake of implementing too many features (like anchors, sustitutions or concatenation)"

YMMV, but if you're aiming for a format that's edited / maintained by humans things like YAML's anchors and substitution are exactly the features I'd want...

brandonbloom · on April 14, 2016

Just the other day I commented complaining about JSON config files without comments, but now here I am complaining about _three_ ways to write a comment. OK, I can see two ways: block and line comments. But why two ways to write line comments? Why start off a new grammar with that added complexity?

binarycrusader · on April 14, 2016

I also notice everyone seems to keep silent about the fact that comments were intentionally left out of the final JSON specification to avoid abuse by parsers/vendors.

drawkbox · on April 14, 2016

Comments would be nice but it is also nice to keep JSON pure and simple. There are some other json formats that use comments like jsoncpp but really not needed.

But, if comments really are needed, another easy way to have them is a file that rides alongside the json files or docs. Sometimes we use a markdown/text file next to it (file.json -> file.json.md / file.json.txt) to describe it overall, or a file.meta.json that has comments per key. This is only needed sometimes for physical files. If json is from the server, commenting can be done there or in docs if needed.

brandonbloom · on April 15, 2016

The notion that comments are neither simple nor really necessary is completely bonkers to me, or anyone else who has implemented a lexer or tried to debug an undocumented config file.

kleiba · on April 14, 2016

Added complexity for parser implementors, but arguably more freedom for the user. (?)

zeven7 · on April 14, 2016

But why?

kleiba · on April 15, 2016

You mean why bother having multiple ways to comment?

Well, I sometimes use both C-style (/* ... */) and C++-style (// ...) comments in the same file to sort of mark the relative importance of comments. But yeah, I guess that's not strictly necessary, I could do without.

jaybuff · on April 14, 2016

Ah, yes, the trailing comma in a list, which I like to refer to as the "Silicon Valley Comma"

[ "a", "b", "c", ] // the silicon valley comma

Retra · on April 15, 2016

Of course it looks ugly if you list things horizontally... :)

function_seven · on April 15, 2016

If that's the Silicon Valley Comma, what do you call this?

    [
       "an"
     , "array"
     , "of"
     , "things"
    ]

Because I have an irrational hatred of that style. (Yes, I know the purported benefits when diffing files, I don't care :P)

cheapsteak · on April 15, 2016

That's not as good as the SVC because you'll still get unnecessary diffs when prepending to the list

SVC allows you to both prepend and append without adding extra diffs

Mikhail_Edoshin · on April 15, 2016

It's a nice thing to have when you generate JSON programmatically :)
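
A small Python sketch of why that is, using only the standard `json` module (the variable names are just for illustration): with trailing commas allowed every element can be written the same way, while strict JSON needs the separator placed between elements.

```python
import json

items = ["a", "b", "c"]

# With trailing commas allowed, every element can be emitted the same way.
relaxed = "[\n" + "".join(f"  {json.dumps(x)},\n" for x in items) + "]"

# Strict JSON needs the separator *between* elements, hence the join.
strict = "[\n" + ",\n".join(f"  {json.dumps(x)}" for x in items) + "\n]"

print(json.loads(strict))  # ['a', 'b', 'c']

try:
    json.loads(relaxed)
    relaxed_accepted = True
except json.JSONDecodeError as exc:
    relaxed_accepted = False
    print("strict parser rejects the trailing comma:", exc.msg)
```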

o_____________o · on April 14, 2016

CSON?

- https://github.com/bevry/cson

- https://github.com/groupon/cson-parser

rhapsodic · on April 14, 2016

Looks like a solution in search of problem, to me. JSON is designed to be machine-readable, and to the extent that I actually need to human-read JSON, which is not that much, I don't find it all that difficult.

xaduha · on April 14, 2016

Standards are better when they are followed. Chicken and egg problem, these alternatives are DOA because they are not going to be popular.

The only reason JSON is popular is because of Javascript. And the only reason Javascript is popular is because of the browsers and their history.

wwwtyro · on April 14, 2016

I've used it with great results in my procedural planet generator[1]. It's very forgiving, so it made writing a "UI" with lots of complicated controls very easy for me.

[1] http://wwwtyro.github.io/planet-3d/

baq · on April 14, 2016

nice little project you've got there :)

partycoder · on April 14, 2016

JSON parsers are not really slow. JSON is simple enough to allow multiple parser implementations and easy adoption. But HJSON additions have some serialization cost overhead.

Because of this eventually you will need to convert your HJSON to JSON prior to deploying, and that would make things slower. You will be dealing with 2 formats instead of one.

Then, do you really believe that adding all these syntactic "features" (overhead) will make it less error prone? It will make it more error prone because it has more things to consider!

true_religion · on April 15, 2016

It's supposed to be used for config files like package.json and friends in the javascript world.

It's going to be parsed essentially once---startup.

partycoder · on April 15, 2016

In that regard I prefer HOCON.

zbjornson · on April 14, 2016

Oy, someone loves yaml...

I'm quite happy using a preprocessor like [0], which keeps the great simplicity of JSON and just allows comments.

[0] https://www.npmjs.com/package/strip-json-comments
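
A preprocessor of that kind is small. The sketch below is a hypothetical Python equivalent written for illustration; it is not the code of the strip-json-comments package, and the function name is an assumption. It removes `//` and `/* */` comments while leaving string contents alone, then hands the result to the ordinary JSON parser.

```python
import json


def strip_json_comments(text):
    """Remove // and /* */ comments that occur outside string literals."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])  # keep the escaped character verbatim
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip to end of line, keep the newline itself.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Block comment: skip past the closing */ (or to EOF if unterminated).
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


source = """
{
  // line comment
  "url": "http://example.com/*not a comment*/", /* block comment */
  "port": 8080
}
"""
config = json.loads(strip_json_comments(source))
print(config)  # {'url': 'http://example.com/*not a comment*/', 'port': 8080}
```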

creshal · on April 14, 2016

> Oy, someone loves yaml...

I don't, actually, I use preprocessors too¹, but since they're not always an option, I'd rather recommend yaml than have yet another pointless config file language needlessly fragment the market.

¹: Thankfully it's rather trivial in python: https://github.com/creshal/yspave/blob/master/yspave/pave.py...

DominoTree · on April 14, 2016

"Helps reduce errors" - you're really trading errors for other errors - your behavior is now more ambiguous with more edge cases, but look, you don't have to place quotes around strings! (except when you still do)

mtalantikite · on April 14, 2016

Can someone explain why people want to use a data-interchange format like JSON for configuration files, rather than using a configuration file format like TOML? I've never understood why people want to use JSON for config files.

stormbeta · on April 15, 2016

I agree about JSON, but I can't say I understand the logic behind TOML.

I know it's supposed to be a config format, but it only seems to make any sense for INI-like configs that are little more than a flat key-value map.

The places I see people using JSON/YAML/etc for config are much more likely to have nested structures that would be extremely awkward to represent in TOML. I think YAML was on the right track, and if you ignore the messier parts of the spec it works pretty well.

mtalantikite · on April 15, 2016

I was mainly using TOML as an example of a config format in contrast to JSON, I wouldn't say it's definitely the answer. Nesting is possible with TOML, but I'd agree that it could get pretty awkward depending on your needs: https://github.com/toml-lang/toml#array-of-tables for example.

I personally don't mind YAML all that much either, although the spec is pretty large.

querulous · on April 14, 2016

pretty much everything under the sun can encode and decode json, often as part of the stdlib

toml requires you track down a toml parser, at the very least

mtalantikite · on April 14, 2016

Yeah, that's fair. I've always found json as a config format cumbersome though (like in Packer) and there are plenty of TOML parsers out there (https://github.com/toml-lang/toml#v040-compliant).

Maybe that's an argument for languages to start adding some configuration format other than XML into their standard libs.

ebbv · on April 14, 2016

This abandons a lot of principles of JSON that are there to avoid ambiguous situations. The small benefits don't seem to outweigh the snake pit you're jumping into.

kennell · on April 14, 2016

It has always bothered me that the JSON standard does not allow for comments. Especially when you want to annotate some sample response/request.

nucleardog · on April 14, 2016

They were removed because people started trying to use them to hold additional parsing directives and other meta-information which would have destroyed interoperability and defeated the entire purpose of a simple interchange format. See: https://groups.yahoo.com/neo/groups/json/conversations/topic...

If you want to annotate JSON in documentation, I say "go ahead and just use //". Any programmer reading it will understand that those lines are taken to be comments and they shouldn't type them in their final request.

andreynering · on April 14, 2016

It's just a hack, but you can write a comment in a key that is supposed to be ignored:

  {
      "__comment": "The following config does...",
      "key": "value"
  }

But I agree that it is not very intuitive.
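
On the consuming side, a minimal Python sketch of the same hack could drop such keys right after parsing. The helper name and the `__` prefix convention are assumptions for illustration only.

```python
import json


def drop_comment_keys(node, prefix="__"):
    """Recursively remove object keys that start with the given prefix."""
    if isinstance(node, dict):
        return {k: drop_comment_keys(v, prefix)
                for k, v in node.items() if not k.startswith(prefix)}
    if isinstance(node, list):
        return [drop_comment_keys(item, prefix) for item in node]
    return node


raw = '{"__comment": "The following config does...", "key": "value"}'
print(drop_comment_keys(json.loads(raw)))  # {'key': 'value'}
```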

spriggan3 · on April 14, 2016

JSON isn't semantic so there is no need to put __ before comment, it's not like it's meta or something. It will still be parsed and processed by the JSON parser, which is a waste of computer cycles.

oconnor663 · on April 14, 2016

It's kind of a future-proofing thing. If you put a field called "comment" in a JSON blob, especially one in a format you don't control, you run the risk that future versions of the format will define the "comment" field and give it actual meaning. A crazy prefix makes this at least slightly less likely.

rrauenza · on April 14, 2016

...and the lack of trailing commas.

creshal · on April 14, 2016

It's an oversight, but does it really matter anywhere? YAML can be used for a JSON-y syntax that allows comments, which is good enough for most use cases.

wnevets · on April 14, 2016

IIRC douglas crockford purposely didn't include comments

daxelrod · on April 14, 2016

They were explicitly removed: https://plus.google.com/+DouglasCrockfordEsq/posts/RK8qyGVaG...

HN discussion: https://news.ycombinator.com/item?id=3912149

wnevets · on April 14, 2016

So definitely a thought out and purposeful decision, thanks for the links.

tacone · on April 14, 2016

To be honest: relaxed formats usually bring a lot of glitches to keep in mind. It's probably easier to use stricter specifications.

Take YAML, it looks pretty natural at first sight, but has a virtually infinite list of gotchas.

crdoconnor · on April 14, 2016

>has a virtually infinite list of gotchas.

That's why I wrote this:

https://github.com/crdoconnor/dumbyaml

YAML is far better with explicit typing and flow style, tag tokens and node anchors/references removed.

fibo · on April 14, 2016

YAML already exists. CSON is also worth a look.

rhinoceraptor · on April 14, 2016

I find YAML/cson very difficult to read.

jordache · on April 14, 2016

the json spec is pretty easy to follow, even for a human coder. this is overkill imo

jordache · on April 14, 2016

In fact, reading the doc for this takes longer than reading the JSON spec.

drewm1980 · on April 15, 2016

Shouldn't we keep our format specs simple and strict, and relegate aesthetic and typo-correction stuff to the editor?

i.e. something with aspects of clang-format (which tries hard not to change the meaning of your code even if it's broken), and the aggressive autocorrection necessary to make typing on a touchscreen work?

I suppose there are converters from this to json, though, so maybe this is just a better specified way of converting keypresses from monkeys into something with well defined structure...

overcast · on April 14, 2016

Honestly, I thought JSON was already very human readable/writable.

rymohr · on April 14, 2016

JSON is very readable/writable... to programmers. Most humans aren't programmers though.

If you showed JSON to someone on the street they could probably understand the gist of it (if pretty printed). Good luck asking them to write it.

orcasauce · on April 14, 2016

Your argument is a layman can't puzzle out JSON easily, but do you think hjson is any better? Syntactically its core seems to just be QOL improvements over writing JSON by hand. It isn't any more intuitive from my perspective. In fact, in many cases it seems less intuitive by offering a greater number of ways to do the same thing.

rymohr · on April 15, 2016

I agree completely. If you're going for human, you need to go a lot further (which is what I've tried to do with HONEY [1]).

[1]: https://github.com/honey/honey

overcast · on April 14, 2016

I'll agree with that, but what is the use case for giving this to non-developers? Everything on this page looks like it's directed towards making JSON files slightly simpler for programmers.

rymohr · on April 15, 2016

I was referring to your comment about JSON in general. See my earlier comment [1] for more background.

[1]: https://news.ycombinator.com/item?id=11501332

ant6n · on April 14, 2016

"Trailing commas are ignored." This is the most important :p

RangerScience · on April 14, 2016

Hi! I like what you've made. I have been working on something similar, although the Github is massively out of date and was never complete to begin with: https://github.com/narfanator/YAMLite (Also, I'm renaming it nowish on the up-to-date version).

This parser handles YAML, JSON and XML. Interestingly, many of the features HJSON has, this has, by virtue of it being easier to implement during the parsing stage.

The part I'd draw your attention to - and the part that I think warrants the most discussion - is the resulting data structure. I mostly can't tell what the structure is of the HJSON C# object - it looks like it does most of what I wanted to change about the existing C# JSON parsers, but maybe not all?

kbenson · on April 14, 2016

This feels like a project that had a core idea that was good and justifiable (we'll take some of the common JSON mistakes such as extra commas and most asked for features such as comments) and then felt the need to keep throwing in features to justify its existence, and now it's lost sight of its original goal.

This can't even be parsed natively by major JavaScript implementations, so is it really JSON at all? Actually, I think that's the root of my complaints, that it's associating itself with JSON while clearly diverging from what was important originally in JSON. At this point it's just some incompatible format leveraging the JSON name. I think most of my criticisms would be ameliorated if it was just some other JSON-similar format with a different name.

aeturnum · on April 14, 2016

This looks pretty great.

We've been looking for a replacement configuration format over our ancient ini files and had rejected JSON for TOML because TOML allows comments (and man, can comments be useful in configuration files). This looks like a nice medium-long term alternative.

Gedrovits · on April 14, 2016

Congratulations, you've created something YAML-like, non-safe data structure.

sickbeard · on April 14, 2016

I don't understand. Doesn't commenting ruin the whole point of a human readable format? If you have to add comments, it means you need to communicate something that can be done in a more concise way.

baq · on April 14, 2016

what?

comments are the most important factor of 'human-readability' by far. without them you can't e.g. explain what a particular key does, what its default value is (if any), or perhaps the most important thing - you can't even put a link to the documentation!

sickbeard · on April 14, 2016

what does 'human-readability' mean? JSON is data-transmission format that is human readable.. which is why it happens to be more popular than XML (or SOAP) because you can read the data and see the data.

It's not meant to transmit context (i.e. it's useless and that's what documentation is for).

erez · on April 16, 2016

Human-readable json would've been a great idea, if it wasn't for the fact that json was NOT MEANT TO BE HUMAN READABLE.

It is meant to be generated by a machine and not created by hand, nor should it be readable by humans, only parse-able by a computer.

Treating your data interchange/serialization/configuration/markup formats as languages that should be human readable/writable is a cardinal sin of any person or company that engages in such practices.

jbergstroem · on April 15, 2016

Just wanted to bring libucl into the game in case you are exploring json-like syntax: https://github.com/vstakhov/libucl#improvements-to-the-json-...

In my eyes pretty much the perfect configuration library and syntax. Nginx-alike, number suffixes (1min, 2gb, ..), macros, variables, includes with priority, etc. Boom! Problem solved.

Zardoz84 · on April 15, 2016

And I should mention SDLang, which has been around for a few years now and has reference implementations in Java, C# and Dlang. I don't see why we need to reinvent the wheel again and again...

https://github.com/Abscissa/SDLang-D/wiki/Language-Guide

paulddraper · on April 15, 2016

> Significant whitespace is a common source of mistakes that we shouldn't have to deal with.

...except you just made it significant.

This is JSON.

    {"a":1,"b":2,"c":3}

This is Hjson after applying the mods:

    {a:1b:2c:3}

....oh, it turns out Hjson actually does have significant whitespace.

roosterjm2k2 · on April 15, 2016

Where is this drive to create dumb languages coming from? I don't mean dumb in the opinionated way, I mean dumb in the way of "it's hard to write proper code that follows rules, quoting and commas are hard"... if quotes, commas and escaping are hard for you, you don't need to be an engineer....

herpityderp · on April 15, 2016

This looks cool, but their characterization of YAML is disingenuous

    YAML expresses structure through whitespace. Significant
    whitespace is a common source of mistakes that we
    shouldn't have to deal with.

since every code editor ever used will take care of this for you.

skybrian · on April 14, 2016

Nice idea. A nit: making trailing whitespace significant (rather than stripping it) seems like a bug.

blueadept111 · on April 14, 2016

The Jaunt JSON parser (Java) was my solution to this problem. It can handle arbitrarily dirty data, including missing quotes, semicolons used instead of quotes, etc.

http://jaunt-api.com

Ericson2314 · on April 15, 2016

Grammars people! If you don't provide it / promote it, I assume your "nice simple thing" is neither obvious nor thought-through. Sorry.

Help me with my list of trendy things that took or are taking way too long to get a grammar: docopt, semver...

dukoid · on April 14, 2016

There is also HUTN already: https://epsilonblog.wordpress.com/2008/09/15/new-in-hutn-071...

Murk · on April 15, 2016

Comments, readability plus typability were some of the main reasons I recently chose YAML for a configuration file. It seems YAML is a bit unloved these days, perhaps because it is more difficult to parse fully.

YAML references also proved useful in my use case.

joelthelion · on April 14, 2016

This is overkill, but someone PLEASE add comments to the next iteration of the json spec.

stock_toaster · on April 15, 2016

There is also libucl[1] which is kind of json like. FreeBSD is apparently starting to use it for a few things.

[1]: https://github.com/vstakhov/libucl

cyphar · on April 15, 2016

I thought it looked cool when they said "nginx-like". But then I remembered that nginx has annoying string semantics and converts things to arrays in odd circumstances.

anthay · on April 15, 2016

Off topic, but related: I made a simple data-serialisation file format based on S-expressions, which may be of interest to some: http://loonfile.info/

rymohr · on April 14, 2016

HONEY was my take at the same problem. Still brainstorming on this one and not used in production yet.

https://github.com/honey/honey

emodendroket · on April 14, 2016

I'm not using a nonstandard JSON extension unless it implements a standard freaking date format. I also don't think bare strings are necessarily a great idea since they lose implicit type information.

draegtun · on April 15, 2016

Here's a previous HN discussion on Hjson - https://news.ycombinator.com/item?id=8432678

dasmith91 · on April 15, 2016

So it's YAML with all the cruft of JSON thrown back in?

rurban · on April 17, 2016

Nope, just a readable JSON. YAML does much more, for many too much.

teen · on April 14, 2016

It's almost as if you're reinventing protobufs. XD

amelius · on April 14, 2016

Is "undefined" part of json? Imho it should be.

svachalek · on April 14, 2016

If the key is not present, the value is "undefined" in JS. "null" is a supported value in JSON, to explicitly mark something as nonexistent.

For me the biggest problem with JSON is lack of full floating point support, i.e. NaN, +-Infinity, -0.
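
Python's standard `json` module shows the gap for NaN and the infinities: by default it writes tokens that are not part of the JSON grammar, and only refuses them when asked. A short sketch (variable names are illustrative):

```python
import json
import math

# By default Python's json module emits tokens that are not part of JSON.
encoded = json.dumps([math.nan, math.inf, -math.inf])
print(encoded)  # [NaN, Infinity, -Infinity]

# With allow_nan=False the encoder refuses such values instead.
try:
    json.dumps(math.nan, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
print(strict_ok)  # False
```

Output produced with the default setting may be rejected by stricter parsers in other languages.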

amelius · on April 14, 2016

You can stringify an object containing a property with an undefined value. Imho, parsing that back should give exactly the same data structure.

curryhowardiso · on April 15, 2016

Soon: Situation: there are 15 competing standards.[1]

[1]: https://xkcd.com/927/

asb · on April 14, 2016

Also in this space, Jsonnet is well worth a look http://jsonnet.org/

wstrange · on April 14, 2016

+1

jsonnet gives all the benefits of hjson - but also provides more powerful templating features

pekk · on April 15, 2016

YAML's problem is not "significant whitespace" which really isn't a major cause of mistakes.

joejev · on April 15, 2016

so in this language these all do the same thing:

abc "abc" abc, "abc",

How does increasing the scope simplify things? Defining correct as "crashing less often" is a really bad idea, data formats _should_ be strict.

educar · on April 14, 2016

Possible bug: duplicate keys are not flagged as errors (in the demo at least)
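
For comparison, plain JSON parsers often behave the same way. Python's standard `json` module silently keeps the last value, and flagging duplicates takes an explicit hook. A minimal sketch, where `reject_duplicates` is a made-up helper name:

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises if a key repeats within one object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


doc = '{"a": 1, "a": 2}'
print(json.loads(doc))  # {'a': 2}  (last value wins silently)

try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    flagged = False
except ValueError as exc:
    flagged = True
    print(exc)  # duplicate key: 'a'
```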