Fastest JSON parser in the world is a D project? (dlang.org)
97 points by micaeloliveira on Oct 22, 2015 | 70 comments




I still don’t get it. ELI5?


The JSON looks like this:

    {"coordinates": [
       {"x": 0.65, "y": 0.23, "z": 0.91, "name": "fwgzd", "opts": {'1': [1, true]},
       {"x": 0.45, "y": 0.78, "z": 0.22, "name": "alfsj", "opts": {'1': [1, true]},
       ...
    ],
     "info": "some info"}
The benchmark code [1] (very readable) is reading an array of structs containing x,y,z from 'coordinates'.

[1] https://github.com/kostya/benchmarks/blob/master/json/test_f...

I haven't read the code, but the algorithm would roughly work like this:

   See `{` process? yes
   See `"coordinates"` process? yes
   See `[` process? yes
   See `{` process? yes
   See `"x"` process? yes => new Coord, Coord.x = 0.65
   See `"y"` process? yes => Coord.y = 0.23
   See `"z"` process? yes => Coord.z = 0.91
   See `"name"` process? no, next token is `"`, skip to `"`
   See `"opts"` process? no, next token is `{`, skip to `}`
The tradeoff is that he's completely ignoring the contents of `name` or `opts` or `info` and those values could potentially be invalid JSON but this processor doesn't care.

The code is also picking up efficiencies from being a static C-like language. The "new Coord" isn't actually doing anything, the alloc happened for the array as a whole so the assignments just write a known size value to a known offset from the start of the array. He's also using SIMD instructions to process multiple bytes at a time and some other tricks but the skipping is the main difference.
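
To make the "skipping" concrete, here's a minimal hand-rolled D sketch of the idea (not the actual fast/stdx.data.json code): find where an unwanted value ends by tracking only string state and brace depth, validating nothing inside it.

    // Hypothetical sketch, not the library's code: return the index just past
    // the JSON value starting at s[i], without validating its contents.
    size_t skipValue(const(char)[] s, size_t i)
    {
        int depth = 0;
        bool inString = false;
        for (; i < s.length; ++i)
        {
            immutable c = s[i];
            if (inString)
            {
                if (c == '\\') ++i;                // jump over the escaped character
                else if (c == '"') inString = false;
            }
            else if (c == '"') inString = true;
            else if (c == '{' || c == '[') ++depth;
            else if (depth > 0 && (c == '}' || c == ']'))
            {
                if (--depth == 0) return i + 1;    // matching close found
            }
            else if (depth == 0 && (c == ',' || c == '}' || c == ']'))
                return i;                          // scalar value ended
        }
        return i;
    }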

I think it's also interesting that the Rust code implemented value skipping in the benchmark file itself. The relative slowness there is likely that the library used (serde_json) is the JSON plugin for a generic serialization/deserialization lib and that Rust doesn't have a way to do SIMD yet.


I wrote serde_json, and wrote the rust benchmark here a few months ago. Interestingly when I wrote this benchmark, I had my implementation as equivalent to RapidJSON on my Mac, but for some reason Kostya couldn't replicate it:

https://github.com/kostya/benchmarks/pull/44

I'm guessing gcc just has some optimizations llvm doesn't.

Rust does have some experimental SIMD, but I'm not using it yet because I want the serde libraries to be safe to use on byte streams, and reading 16 bytes ahead could block if at the end of a socket stream. Hopefully we will get specialization soon, which would let me use SIMD when I know I have at least X bytes in a buffer.
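
The streaming concern boils down to something like this rough sketch (made-up names, not serde's code): take a 16-byte-wide path only while that many bytes are already buffered, and finish with a byte-at-a-time tail so you never read past what the socket has actually delivered.

    // Rough sketch with made-up names, not serde's code.
    size_t findQuote(const(ubyte)[] buf)
    {
        size_t i = 0;
        while (buf.length - i >= 16)
        {
            // wide path: a real implementation would do this with one SIMD compare
            foreach (j; 0 .. 16)
                if (buf[i + j] == '"') return i + j;
            i += 16;
        }
        // scalar tail: safe even right at the end of the currently buffered data
        for (; i < buf.length; ++i)
            if (buf[i] == '"') return i;
        return buf.length;   // not found in this buffer
    }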


I enjoyed your blog series on serde perf.

One thing I noticed in this example was that the D example worked pretty much exactly like I want serde to work in that it was able to deserialize a subset of the overall document and the Coord struct didn't need to exhaustively cover the individual json data objects. If there's a way to do this in serde, an example in the docs would be really helpful.
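
For comparison, here's roughly what that subset extraction looks like on the D side using nothing but the stock std.json DOM API (not the benchmark's pull parser): you index the keys you care about and never declare a wrapper struct for the outer object at all.

    // Rough illustration using D's std.json, not the fast parser from the benchmark.
    import std.json : parseJSON;

    struct Coord { double x, y, z; }

    Coord[] readCoords(string text)
    {
        Coord[] result;
        foreach (obj; parseJSON(text)["coordinates"].array)
            result ~= Coord(obj["x"].floating, obj["y"].floating, obj["z"].floating);
        return result;   // "name", "opts" and "info" are simply never looked at
    }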


Thanks! I need to get back into writing on it.

I just pushed up a rust pull parser version here: https://github.com/kostya/benchmarks/pull/54. Is that what you were thinking of?


My wishes are much more prosaic. It's not clear to me just from reading your docs how I can extract data from a JSON file using the pattern this benchmark shows (a top-level key containing the data, other keys containing metadata about the request) without having to create an otherwise useless struct to cover the outer wrapper object.

I see you have a reply to Gankro about a non-exhaustive flag, and that'd work. As for the default, the current behavior is what I'd expect from a Rust lib given the correctness-first mindset of the community, but I will always be opting for non-exhaustive because I think most people providing JSON APIs consider additional keys to be backwards compatible (they are in dynamic languages) and I'd prefer my apps not to break in production for no apparent reason.


I'm pretty sure this is the only way that Serde works?

If you tell it to deserialize to a `Point { x: u32, y: u32 }`, it will ignore any additional fields.


It doesn't quite yet. By default it errors on unknown fields, and my plan is to add an annotation to ignore it instead. Not quite sure if I got the default behavior correct though. I'm considering flipping it.


Huh. I tested it out on serde-toml; does it have a bug, then?


The parser finds the start/end of sub-structures and doesn't necessarily process them all. So for large structures of which you only need a subset, you're only doing the work you need to do. For a large structure in which you need all of the data, there is probably less of a gain.


That, plus the parsing is done with SIMD, so even in UTF-8 you're processing 4 Unicode code points per CPU cycle [1], instead of 1 as in most traditional parsers.

[1] This is a broad generalization, not necessarily true for all SIMD opcodes.


Hm. I guess when thinking "SSE parsing" I didn't go with 4/8-wide parsing. I was thinking that they'd be grabbing 16/32 bytes, doing a compare against a fixed constant of 16/32 copies of, say, '{' or '}', then extracting the index of the match.

Something like this:

    d := _mm_loadu_si128(<16 bytes of json>)
    b := _mm_loadu_si128(<needle set: "{">)
    n := _mm_cmpistri(b, d, _SIDD_CMP_EQUAL_ANY)   // index of first '{' in d
You'd have to be clever to skip around strings ("... \"sonuva ... "), but once that's handled, you'd have significant speed ups to scan for ',', '{', '}', etc.

I think the double-quote escape might look something like this:

    d := _mm_loadu_si128(<16 bytes of json>)
    q := _mm_loadu_si128(<needle: '"'>)
    e := _mm_loadu_si128(<needle: '\\'>)
    n := _mm_cmpistri(q, d, _SIDD_CMP_EQUAL_ANY)   // index of first '"'
    m := _mm_cmpistri(e, d, _SIDD_CMP_EQUAL_ANY)   // index of first '\'
    if m+1 == n:
        branch-to-top
    ... process ...
Looks like cmpistri has 1/2 reciprocal throughput. If you unrolled the loop 8 deep, you're probably looking at ~10 cycles per 16 bytes scanned.


Sounds to me like VW or Nvidia:

(Why so fast)

"On the downside I did not validate the unused side-structures. I think it is not necessary to validate data you are not using. So basically I only scan them so much as to find where they end. Granted it is a bit of optimization for a benchmark, but is actually handy in real-life as well."


Not really, because the only people who benefited from VW were VW managers, whereas as a user I am quite happy to trade off speed for not validating useless fields when processing terabytes of JSON. And he is quite open about the tradeoff, so it's a perfect analogy except for what's different!


What was the nvidia thing? I recall http://techreport.com/review/3089/how-ati-drivers-optimize-q... , but that was ATI.



So, how is it going with D? Last time I gave it a look I really liked it, but I sadly walked away when I hit the multiple standard libraries problem.

Anyone using D in real life, among hackers here?


I started working with it for DSP and game code recently. It hasn't posed any major issues yet. I am getting what I wanted out of the language - something that is more modern than C or C++, but retains much of the root lineage. There is a lot of room to configure things to your liking and disable things you don't want.

The standard library forking issue is far in the past now - which doesn't mean the library is as complete and comprehensive as it could be yet. As an example of a gap I ran into the other day, the "pure" annotation is absent from some math functions because they call out to C standard library routines that use global state for error codes.

But the things that are the focus now are mainly "nice-to-have" technologies that will be good for productivity - check out "std.experimental.allocator" for an idea of what's cooking. There are also multiple implementations of the compiler tech rolling around now, not just the Digital Mars one, which is a good sign for future quality.
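
To give a flavour of what's cooking there, the allocator package is shaping up along these lines (still experimental, so the details may shift):

    // A taste of std.experimental.allocator; the API is still experimental.
    import std.experimental.allocator : theAllocator, make, makeArray, dispose;

    void demo()
    {
        auto p  = theAllocator.make!int(42);          // a single object
        auto xs = theAllocator.makeArray!double(8);   // an array of 8 doubles
        scope (exit)
        {
            theAllocator.dispose(xs);
            theAllocator.dispose(p);
        }
        // ... use p and xs; swap theAllocator for a malloc-backed, region or
        // free-list allocator without touching this code.
    }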


I never use anything else anymore for personal projects, unless I absolutely have to due to missing a certain library. A well-known phenomenon in the D community is that it "spoils" you such that when you have to use another language all you can think about is how much easier it would be to do your task in D, and how you miss certain features from D (such as the phenomenally better template syntax compared to C++). I've experienced this many times, such as when writing a large-ish Python script, or when working on a mobile game in C# + Unity. I really wish I could use D at work.
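
For a small taste of the template-syntax difference: instantiation uses `!` instead of angle brackets, lambdas slot straight in as template arguments, and `auto` hides the resulting type names.

    // A taste of D template syntax: `!` for instantiation, lambdas as template args.
    import std.algorithm : filter, map;
    import std.range : iota;
    import std.stdio : writeln;

    void main()
    {
        auto squaresOfEvens = iota(1, 10)
            .filter!(n => n % 2 == 0)
            .map!(n => n * n);
        writeln(squaresOfEvens);   // [4, 16, 36, 64]
    }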


Using D for most of my personal projects, which are almost all pet game engines. It's a real joy to use. I get to write lightning-fast code with the productivity I'm used to in high-level languages. It's really the best of both worlds.

I couldn't dream of writing all of the features I now have in another systems language because of all the extra scaffolding it would require, and I couldn't hope to get anywhere near the performance I have in higher-level languages because I'd lose value types and manual memory management.

The multiple stdlib problem has been solved for years now.



Actually Facebook is backing away from D. Not because they dislike it, but simply to converge on C++.


You work for Facebook? They are rewriting w0rp in C++?


Sociomantic is using it for their systems, https://www.sociomantic.com/jobs/d-software-developer/


Last time I saw one of their ads, they emphasized that their code base is still D1. I don't know whether that has changed since then.


They have since moved to D2. One of their developers gave a talk about the big migration at DConf.

edit: They have a series of blog posts documenting the same, too - https://www.sociomantic.com/search/tag/dlang (6 so far).


Hi. I only joined Hacker News relatively recently, but I have been programming since 1983. My first 'open source' contribution was to Tom Jennings' bulletin board routing algorithm in 1989.

I'm in the hedge fund world, and my day job is investing, but I use technology to help me do that. Andy Smith gave a talk at DConf on using D at a 20bn+ hedge fund. My background is at similar large funds, but I am now using D to develop some tools to help the investment process at a smaller but decent-sized fund. A couple of D people will be helping me. So it's ready for real work, and the combination of high productivity with efficiency and correctness is a killer feature for my problem set. Fast compilation is also important as it's a dynamic environment and you want to iterate quickly.


That must have been years ago. Since version 2.0 there is only one standard library.

Using it for small personal projects. Happy. :)


I'm using D (gdc) for all my personal and professional projects. At the moment, the only reason I sometimes regret leaving C++ is Emscripten, which only accepts C and C++ as input (although that might change one day thanks to the LDC compiler).

Everything else works: calling C functions, ctags, syntax highlighting, automatic make dependency generation, integration in Visual Studio, step by step debugging, profiling (oprofile), valgrind, etc. As a bonus, D makes a fantastic "scripting" language (= no explicit compilation step): at work, we're progressively replacing all of our bash scripts with D scripts.
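
A minimal example of that "scripting" use, runnable directly thanks to a shebang (rdmd compiles and caches the binary behind the scenes):

    #!/usr/bin/env rdmd
    // Minimal "D as a shell script" example; rdmd compiles and caches it on first run.
    import std.process : executeShell;
    import std.stdio : writeln;
    import std.string : splitLines;

    void main()
    {
        auto ls = executeShell("ls");   // the kind of call a bash script would make
        foreach (name; ls.output.splitLines)
            writeln("found: ", name);
    }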


I use it for econometrics. I write some parts in R and as much as I want in D. I create a dynamic library of the D code and call those functions trivially from R.
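
The D half of that setup can be as small as a few extern(C) functions compiled into a shared library; the R-side call is omitted here, but this is the general shape of the exported code (illustrative names, not the poster's actual functions):

    // Illustrative only: a C-compatible export that R can load from a shared
    // library (built with something like `dmd -shared -fPIC stats.d`).
    extern (C) double mean_d(const(double)* x, int n)
    {
        double s = 0;
        foreach (i; 0 .. n)
            s += x[i];
        return n > 0 ? s / n : double.nan;
    }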


I don't read D, but here's the code anyway: https://github.com/mleise/fast/blob/master/source/fast/json.....


On another note, I see Scala is doing really poorly in the JSON benchmark test: https://github.com/kostya/benchmarks#json


Those aren't very fair to the dynamic languages and JIT compiled languages (this includes Scala).

* For the dynamic languages, execution time includes the time it takes to lex, parse and interpret the source code.

* For language implementations with a JIT, execution time includes the time the JIT takes to properly optimise hot code paths. Generally you start benchmarking after a warm-up period in such cases.

The only fair comparisons are those between ahead of time compiled languages.


The benchmark suite used the stdlib JSON parser, which is slow as hell. How slow is it? In 2011 there was a proposal to deprecate it, which never happened [1].

Why? The stdlib parser loads the file line by line, then copies each line into a new buffer that joins the strings together. Only then is the parser called. The Scala community normally circumvents this by using non-standard, community-developed solutions.

Furthermore, JIT vs. compiled language benchmarks are pretty unfair to JIT'd languages, especially on the JVM, which doesn't start compiling sections until >10,000 calls.

[1] https://groups.google.com/forum/m/#!msg/scala-user/P7-8PEUUj...


Well that makes the result pretty much irrelevant as nobody actually uses that lib for any serious JSON parsing.


So which is the go-to library for serious parsing in Scala? And how would you change the benchmark to make it fair?


I'd use Jackson.


No one uses the built-in Scala JSON parser. Check out jawn [0] or spray-json [1].

[0] https://github.com/non/jawn [1] https://github.com/spray/spray-json


Although D may have a fast JSON parser, this benchmark is a horrible comparison. We really need a better way to do cross-language comparisons.


The repo owner is accepting pull requests for the benchmarks so at least it's fixable.


It's a lot of work to fix this benchmark, which is fairly contrived to begin with. To start, you have to adjust every sample to accept a warmup period and then a measured run time (likely spanning multiple samples), recording both speed and memory use in that window. And you also have to be careful that the compiler isn't optimizing out the repetitions at runtime, while still allowing the optimizations that would produce the best performance.
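
As a sketch of what that would mean per sample (hypothetical harness, not the repo's code): warm up unmeasured, time many iterations, and keep a visible side effect so the compiler can't delete the work.

    // Hypothetical harness, not the repo's code: warmup, repeated timed runs,
    // and a visible side effect so the optimiser can't remove the work.
    import std.datetime.stopwatch : benchmark;
    import std.stdio : writeln;

    double sink = 0;   // written every run; keeps the computation "live"

    void work()
    {
        double s = 0;
        foreach (i; 0 .. 1_000_000)
            s += i * 0.5;
        sink += s;
    }

    void main()
    {
        foreach (_; 0 .. 5) work();        // warmup runs, not measured
        auto t = benchmark!work(100);      // 100 measured runs
        writeln(t[0].total!"usecs" / 100.0, " microseconds per run on average");
    }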


If I had a nickel for every time I've seen a language benchmark be a very specialized contrived problem (in this case specific JSON with specific access pattern) I'd have a lot of nickels.


> Yep, that's right. stdx.data.json's pull parser finally beats the dynamic languages with native efficiency. (I used the default options here that provide you with an Exception and line number on errors.)

Nice to see that it only took the D community a couple of years to beat the scripting languages in speed.


It's literally the next paragraph that extends this:

"A few days ago I decided to get some practical use out of my pet project 'fast' by implementing a JSON parser myself, that could rival even the by then fastest JSON parser, RapidJSON. The result can be seen in the benchmark results right now:

https://github.com/kostya/benchmarks#json

fast: 0.34s, 226.7Mb (GDC)
RapidJSON: 0.79s, 687.1Mb (GCC)"


I read it, so? What's wrong with the grandparent comment?


Supplant "it only took a couple of years to beat scripting languages" with "it only took a couple of years to be the fastest JSON library inclusive of scripting and compiled languages" and I imagine you'll understand.


Python's json parser is written in C.


The D people laugh at the dynamically typed languages, yet they have "auto" in front of all variables ;)


To give a little more detail on the difference between type inference and dynamic typing, the following code is valid in a dynamic language, but a static language rejects it:

    auto x = 15;
    if (some_condition) {
        x = "Hello world!";
    }
    print(x);
And this catches real errors. I believe there's even a function somewhere in the Python standard library that, if it finds one result, returns a string, and if it finds multiple results, returns a list of strings. Of course a string is itself iterable, so duck-typing goes horribly wrong.


I'm OK with dynamic types: they buy you something, cost you something. My big complaint is weakly typed languages (I'm looking at you, JavaScript). I just hit Shift+Ctrl+I and typed the line below in the JavaScript console:

    > (1 + "1") * 2
    22
At least it raises a TypeError in Python (unsupported operand type(s) for +: 'int' and 'str'). OK, probably better to fail at compile time, but I would rather fail at runtime than not fail at all.


In the early days of my JavaScripting I always used minus instead of plus when doing addition. But as the language and tools have matured, I hardly see any type errors any more. I do have a habit of parseInt() that won't easily go away though :P

Besides the concatenation of numbers to strings and vice versa, most other type errors in JavaScript will give you a syntax error or something like NaN (Not a Number).

So don't mix strings and numbers. Everything else is just "objects". =)

One thing that causes lots of bugs though is undefined object properties. But I'm not sure if inferred types would help in those cases.


Auto is just a keyword to invoke/assist type inference in the compiler -- the objects are still statically typed. C++11 uses the same keyword.

You were probably thinking of something like the "dynamic" type that C# has.
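
In other words, the type behind auto is pinned down at compile time; a couple of lines make it concrete:

    auto x = 15;                          // x is an int, decided at compile time
    static assert(is(typeof(x) == int));  // provable without running anything
    // x = "Hello world!";                // would not compile: a string is not an int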


Since Jaakko, my advisor, was one of the core authors for 'auto' in C++11: the use of 'auto' for type inference has been in C since the '70s. C++11 just generalized the feature.


Huh? Maybe it's a joke, but "auto" in C has nothing to do with auto in C++.

In C, auto just specifies the storage class (and is kinda useless), not the type (which you still need to write out).


Kind of serious, kind of joking:

    auto x = 3;
"Infers" 'int'. Everything else is an error.


And apparently, you have no idea what type inference is. ;)


Inferred types are different from dynamic types.


> Yep, that's right. stdx.data.json's pull parser finally beats the dynamic languages with native efficiency.

'dynamic' is the key word here. The title is misleading.


You may want to read past that part. The author was just pointing out that only around DMD 2.067 (that's March 2015 or later) did D finally get a JSON library that did genuinely better than those in dynamic language stdlibs, which hadn't been the case beforehand (DMD 2.067's stdlib rivaling but not beating Python's json on the author's system).

That's the setup for the reveal of "fast", which not only does better than the dynamic language stdlibs and the previous fastest D library, but better than any other JSON parser.

The point of the historical recap is how fast things improved for the D ecosystem: in 7 months the best-case option went from parsing JSON 2~3 times slower than Ruby to parsing it in half the time of RapidJSON, a 2 orders of magnitude improvement in speed.


This post is about the fast project, which is claimed to beat RapidJSON, a very fast C++ JSON parser.


If it were implemented in C++ using identical algorithms, it would be no slower.


That's hard to know. It's always possible that one compiler generates faster code than another one for the same algorithm.

I wouldn't be surprised if most of the speed difference was down to correctness checks though.


There are full threads on why this implementation is faster. It's cheating.



The fast Python json libraries are all C extensions, though. The only "dynamic" thing they have to do is create the actual Python objects when they're done.


Most Python code that takes time has C/asm extensions. It's part of the 'glue code' nature of Python. Right tool for the job and all that.


I don't know, but the benchmarks are awful, since Scala and Python are using their built-in JSON parsers rather than the fastest parsers available, while he compares some other super-fast libraries against D and says "hoho, it's so fast".

Benchmarks are for the poor people who can't live in the real world.


Benchmarks would be fine if they could somehow actually compare performance between languages. I have yet to see any cross-language benchmarks whose results come even close to what you would actually see in practice.


A strange way of putting it. He started out using D's standard library with the DMD compiler, and then people submitted pull requests pointing out better choices, which he incorporated.

Anyone else is free to do the same.

If you know of a faster parser in any language, I would love to hear about it, because getting the job done well matters more to me.



