Fast ECMAScript Parser – Alternative to Acorn / Esprima (github.com/cherow)
96 points by cherow on Sept 25, 2017 | 20 comments



Twice as fast as acorn - impressive.

Looked at the parser code - it's well crafted. Works with integers instead of strings wherever possible.
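To illustrate the "integers instead of strings" point, here is a minimal sketch of how a parser can pack a token's kind, category flags, and operator precedence into one number. The names and bit layout (`KEYWORD`, `BINARY_OP`, `PREC_SHIFT`, `Token`) are assumptions made up for this example, not taken from cherow's source.

```typescript
// Hypothetical token encoding; the names and bit layout are illustrative only.
// Bits 0..7:   unique id of the token kind
// Bit  8:      set for keywords
// Bit  9:      set for binary operators
// Bits 12..15: binary operator precedence (0 when not an operator)
const KEYWORD = 1 << 8;
const BINARY_OP = 1 << 9;
const PREC_SHIFT = 12;

const Token = {
    For: 1 | KEYWORD,
    In: 2 | KEYWORD | BINARY_OP | (7 << PREC_SHIFT),
    Plus: 3 | BINARY_OP | (9 << PREC_SHIFT),
    Multiply: 4 | BINARY_OP | (10 << PREC_SHIFT),
    Identifier: 5,
};

// Each check is a single bitwise operation rather than a string comparison.
function isKeyword(token: number): boolean {
    return (token & KEYWORD) !== 0;
}

function isBinaryOp(token: number): boolean {
    return (token & BINARY_OP) !== 0;
}

function precedence(token: number): number {
    return (token >> PREC_SHIFT) & 0xf;
}
```

With an encoding like this, a question such as "is the current token a binary operator, and how tightly does it bind?" needs no string compares or table lookups.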

The API is dead simple. Produces standard ESTree AST. Should take 10 minutes to port projects from `acorn` and `esprima` to `cherow`.

I wonder why he had TypeScript target ES2015 only to use Buble to convert to ES5. Perhaps Buble produces more performant ES5 code than TypeScript?


One reason to use this workflow (not sure if it applies here) is to do tree shaking on the ES2015 output.


Using values is faster though, as the engine will not optimize on the first run. So while you get good results on micro-benchmarks that run the same code over and over again, you will not get good results in the real world, as you usually don't parse the same code thousands of times :P


I'd be very curious to read a blog post summarizing the key differences which make `cherow` more performant.


I didn't know do expressions were a thing until I read this project's README.md

https://babeljs.io/docs/plugins/transform-do-expressions/


NB this is still a Stage 1 proposal: https://github.com/tc39/proposals/blob/master/README.md

Still a ways away from being "a thing" in the language spec.


If someone is interested, it would be awesome to try plugging this in as a parser for Prettier.


I am using Acorn in https://www.Photopea.com for Scripts (File - Script). User writes a script (with JS syntax) to process a PSD document (layers etc.). Then, the script is executed in my own tiny sandboxed JS interpreter.

Cherow seems to be faster than Acorn, but it is also 30% bigger :(


Really cool to see this type of thing. I wonder if it could possibly be compiled using AssemblyScript ( https://github.com/AssemblyScript/assemblyscript ) and executed as wasm for even more performance gains?

A similar idea that I've thought of is a much lighter-weight version of Babel for super-fast development builds when you're targeting latest Chrome or Node.js. Disabling nearly all Babel transforms doesn't speed things up much since you still need to go through the parse-transform-format process, but it seems like the small number of remaining transforms (JSX, maybe typescript/flow, maybe imports) could potentially be done without a full AST parse and format, especially if it's just for development and doesn't need to be stable enough for production builds.


Looks like recursive-descent + precedence climbing. Pretty run-of-the-mill design, but works very well.

Some of the identifiers are a bit unwieldy, however; e.g.

    parseForOrForInOrForOfStatement

I think something closer to this would be clearer:

    parse_for_stmts()

since that function AFAICS parses all statements beginning with "for".
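For readers unfamiliar with the design mentioned above, here is a minimal sketch of recursive descent combined with precedence climbing, limited to arithmetic on number literals. It is not cherow's code: `parseExpression`, `PRECEDENCE`, and the toy tokenizer are assumptions for illustration, and only the `Literal` / `BinaryExpression` node names follow ESTree.

```typescript
// Minimal expression AST, loosely modelled on ESTree node names.
type Expr =
    | { type: "Literal"; value: number }
    | { type: "BinaryExpression"; operator: string; left: Expr; right: Expr };

// Hypothetical precedence table: a higher number binds tighter.
const PRECEDENCE: Record<string, number> = { "+": 1, "-": 1, "*": 2, "/": 2 };

// Toy tokenizer: numbers, operators, parentheses. Anything else is skipped.
function tokenize(src: string): string[] {
    return src.match(/\d+(?:\.\d+)?|[-+*/()]/g) ?? [];
}

function parseExpression(src: string): Expr {
    const tokens = tokenize(src);
    let pos = 0;

    // Recursive descent part: one function per grammar rule.
    function parsePrimary(): Expr {
        const tok = tokens[pos++];
        if (tok === undefined) throw new SyntaxError("Unexpected end of input");
        if (tok === "(") {
            const inner = parseBinary(0);
            if (tokens[pos++] !== ")") throw new SyntaxError("Expected ')'");
            return inner;
        }
        const value = Number(tok);
        if (Number.isNaN(value)) throw new SyntaxError(`Unexpected token ${tok}`);
        return { type: "Literal", value };
    }

    // Precedence climbing part: consume operators that bind at least as
    // tightly as minPrec, and recurse with a higher minimum for the right side.
    function parseBinary(minPrec: number): Expr {
        let left = parsePrimary();
        for (;;) {
            const op = tokens[pos];
            if (op === undefined) return left;
            const prec = PRECEDENCE[op];
            if (prec === undefined || prec < minPrec) return left;
            pos++;
            // prec + 1 makes every operator left-associative.
            const right = parseBinary(prec + 1);
            left = { type: "BinaryExpression", operator: op, left, right };
        }
    }

    const ast = parseBinary(0);
    if (pos !== tokens.length) throw new SyntaxError(`Unexpected token ${tokens[pos]}`);
    return ast;
}
```

A single loop handles every binary operator level, which is why this design avoids one grammar function per precedence tier.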


Having delved into Babylon's code quite a bit, I'm honestly appreciative of the lengthier name.


Self-replying, I did not mean to imply that Babylon's codebase is generally unreadable or overly terse. I actually found it surprisingly easy to work in as someone who's never worked on a parser before.

I just meant that the code for for-loops in particular is complicated, and having very specific names helps in that area.


Yeah I noticed when looking at the TypeScript compiler itself that the names are ridiculously long. Probably because they edit them in an IDE like VS Code.

Does TypeScript minify your code or is that a separate step?


Separate. The common thing is to use uglifyjs for that.


Pretty much all editors worth talking about support TypeScript's language server. So even if you're using vim or whatever, you get the full featured editing experience (or at least, most of it).

That includes proper auto complete.


It's less mental work to copy/paste the name than to remember what it does.


No one really uses snake case in that context though, so maybe parseForStatements()


Beating Acorn & Esprima in raw performance is very, very impressive.

But please consider that performance is not the only factor when building parsers.

Extensibility is a major concern. AFAIK Babylon (used in Babel) is a fork of Acorn, partly driven by the need for greater extensibility. https://github.com/babel/babylon

Another example is espree, used in ESLint: https://github.com/eslint/espree According to their readme.md, espree was based on Esprima and is now using Acorn, specifically for extensibility reasons. https://github.com/eslint/espree/issues/200


How do they compare in extensibility?


Acorn has a plugin system: https://github.com/ternjs/acorn#plugins

But as far as I know Esprima does not: https://github.com/jquery/esprima/issues/1168

I have no idea about Cherow...



