Fast ECMAScript Parser – Alternative to Acorn / Esprima

capncode · on Sept 26, 2017

Twice as fast as acorn - impressive.

Looked at the parser code - it's well crafted. Works with integers instead of strings wherever possible.
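To illustrate what "integers instead of strings" can look like in a parser, here is a minimal sketch. It is not taken from cherow's source: the `Token` names, the bit layout, and the helper functions are all hypothetical. The idea is that a token type packed into a single number lets the parser answer questions like "is this a binary operator?" with a bit mask instead of a string comparison.

```typescript
// Hypothetical token encoding (an assumption for illustration, not cherow's actual layout).
// Bits 0-7: small token id, bits 8-11: binary precedence, higher bits: category flags.
const PRECEDENCE_SHIFT = 8;
const PRECEDENCE_MASK = 0xf << PRECEDENCE_SHIFT;
const IS_BINARY_OP = 1 << 12;
const IS_KEYWORD = 1 << 13;

const Token = {
  Identifier: 1,
  ForKeyword: 2 | IS_KEYWORD,
  Add: 3 | IS_BINARY_OP | (9 << PRECEDENCE_SHIFT),
  Multiply: 4 | IS_BINARY_OP | (10 << PRECEDENCE_SHIFT),
} as const;

// One bitwise AND replaces a chain of string comparisons.
function isBinaryOperator(token: number): boolean {
  return (token & IS_BINARY_OP) !== 0;
}

// The precedence is stored inside the token value itself, so no lookup table is needed.
function precedenceOf(token: number): number {
  return (token & PRECEDENCE_MASK) >> PRECEDENCE_SHIFT;
}
```

With this layout, `isBinaryOperator(Token.Add)` is a single mask test, and `precedenceOf(Token.Multiply)` is a mask and a shift.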

The API is dead simple. Produces standard ESTree AST. Should take 10 minutes to port projects from `acorn` and `esprima` to `cherow`.

I wonder why he had TypeScript target ES2015 only to use Bublé to convert to ES5. Perhaps Bublé produces more performant ES5 code than TypeScript?

jhurliman · on Sept 26, 2017

One reason to use this workflow (not sure if it applies here) is to do tree shaking on the ES2015 output.

z3t4 · on Sept 26, 2017

Using values is faster though, as the engine will not optimize on the first run. So while you get good results on micro-benchmarks that run the same code over and over again, you will not get good results in the real world, as you usually don't parse the same code thousands of times :P

rattray · on Sept 26, 2017

I'd be very curious to read a blog post summarizing the key differences which make `cherow` more performant.

hayyouguys · on Sept 26, 2017

I didn't know do expressions were a thing until I read this project's README.md

https://babeljs.io/docs/plugins/transform-do-expressions/

tdumitrescu · on Sept 26, 2017

NB this is still a Stage 1 proposal: https://github.com/tc39/proposals/blob/master/README.md

Still a ways away from being "a thing" in the language spec.

vjeux · on Sept 26, 2017

If someone is interested, it would be awesome to try plugging this in as a parser for Prettier.

IvanK_net · on Sept 26, 2017

I am using Acorn in https://www.Photopea.com for Scripts (File - Script). User writes a script (with JS syntax) to process a PSD document (layers etc.). Then, the script is executed in my own tiny sandboxed JS interpreter.

Cherow seems to be faster than Acorn, but it is also 30% bigger :(

alangpierce · on Sept 26, 2017

Really cool to see this type of thing. I wonder if it could possibly be compiled using AssemblyScript ( https://github.com/AssemblyScript/assemblyscript ) and executed as wasm for even more performance gains?

A similar idea that I've thought of is a much lighter-weight version of Babel for super-fast development builds when you're targeting latest Chrome or Node.js. Disabling nearly all Babel transforms doesn't speed things up much since you still need to go through the parse-transform-format process, but it seems like the small number of remaining transforms (JSX, maybe typescript/flow, maybe imports) could potentially be done without a full AST parse and format, especially if it's just for development and doesn't need to be stable enough for production builds.

userbinator · on Sept 26, 2017

Looks like recursive-descent + precedence climbing. Pretty run-of-the-mill design, but works very well.
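For readers who haven't seen the technique, here is a minimal sketch of recursive descent with precedence climbing. It is a toy for `+ - * /` and parentheses over integers, not cherow's code; the `parseExpression` function, the `PRECEDENCE` table, and the simplified node shapes are assumptions made for illustration.

```typescript
// Toy AST, loosely modeled on ESTree node names (simplified for illustration).
type Expr =
  | { type: "Literal"; value: number }
  | { type: "BinaryExpression"; operator: string; left: Expr; right: Expr };

// Higher number binds tighter.
const PRECEDENCE: Record<string, number> = { "+": 1, "-": 1, "*": 2, "/": 2 };

function parseExpression(source: string): Expr {
  // Crude tokenizer: integers, operators, and parentheses; anything else is skipped.
  const tokens: string[] = source.match(/\d+|[-+*/()]/g) ?? [];
  let pos = 0;

  // Recursive descent: a primary is a number or a parenthesized expression.
  function parsePrimary(): Expr {
    const token = tokens[pos++];
    if (token === "(") {
      const inner = parseBinary(1);
      if (tokens[pos++] !== ")") throw new SyntaxError("Expected ')'");
      return inner;
    }
    if (token === undefined || !/^\d+$/.test(token)) {
      throw new SyntaxError(`Unexpected token: ${token}`);
    }
    return { type: "Literal", value: Number(token) };
  }

  // Precedence climbing: keep consuming operators that bind at least as
  // tightly as minPrecedence; parse the right side one level higher so
  // operators of equal precedence associate to the left.
  function parseBinary(minPrecedence: number): Expr {
    let left = parsePrimary();
    while (pos < tokens.length) {
      const operator = tokens[pos] as string;
      const precedence = PRECEDENCE[operator];
      if (precedence === undefined || precedence < minPrecedence) break;
      pos++;
      const right = parseBinary(precedence + 1);
      left = { type: "BinaryExpression", operator, left, right };
    }
    return left;
  }

  const ast = parseBinary(1);
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token: ${tokens[pos]}`);
  return ast;
}
```

Parsing `1 + 2 * 3` with this sketch puts the `*` node on the right side of the `+` node, and `8 - 3 - 2` nests to the left. A single loop with a precedence table replaces one function per precedence level, which is a large part of why the design is so common.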

Some of the identifiers are a bit unwieldy, however; e.g.

    parseForOrForInOrForOfStatement

I think something closer to this would be clearer:

    parse_for_stmts()

since that function AFAICS parses all statements beginning with "for".

rattray · on Sept 26, 2017

Having delved into Babylon's code quite a bit, I'm honestly appreciative of the lengthier name.

rattray · on Sept 26, 2017

Self-replying, I did not mean to imply that Babylon's codebase is generally unreadable or overly terse. I actually found it surprisingly easy to work in as someone who's never worked on a parser before.

I just meant that the code for for-loops in particular is complicated, and having very specific names helps in that area.

chubot · on Sept 26, 2017

Yeah I noticed when looking at the TypeScript compiler itself that the names are ridiculously long. Probably because they edit them in an IDE like VS Code.

Does TypeScript minify your code or is that a separate step?

dgoldstein · on Sept 26, 2017

Separate. The common thing is to use uglifyjs for that.

shados · on Sept 26, 2017

Pretty much all editors worth talking about support TypeScript's language server. So even if you're using vim or whatever, you get the full featured editing experience (or at least, most of it).

That includes proper auto complete.

z3t4 · on Sept 26, 2017

It's less mental work to copy/paste the name than to remember what it does.

VeejayRampay · on Sept 26, 2017

No one really uses snake case in that context though, so maybe parseForStatements()

bd82 · on Sept 26, 2017

Beating Acorn & Esprima in raw performance is very, very impressive.

But please consider that performance is not the only factor when building parsers.

Extensibility is a major concern. AFAIK Babylon (used in Babel) is a fork of Acorn, partly driven by the need for greater extensibility. https://github.com/babel/babylon

Another example is espree, used in ESLint: https://github.com/eslint/espree. According to their readme.md, espree was based on Esprima and is now using Acorn, specifically for extensibility reasons. https://github.com/eslint/espree/issues/200

z3t4 · on Sept 26, 2017

How do they compare in extensibility?

bd82 · on Sept 26, 2017

Acorn has a plugin system: https://github.com/ternjs/acorn#plugins

But as far as I know Esprima does not: https://github.com/jquery/esprima/issues/1168

I have no idea about Cherow...