Hacker News new | past | comments | ask | show | jobs | submit login

probably worth noting that the default peg backend for the c combinator parser 'hammer' uses an arena allocator for speed and because it's a good match for packrat memoization. there are hammer bindings for several languages including python and java, not sure if there's a rust one yet. hammer also has some other backends including lalr, and in some cases you can write a grammar such that you can parse it with either packrat or lalr

hammer's arena allocator is not very fast, though. it's faster than any malloc i've seen, but a good arena allocator is faster than calling an empty function, so you have to inline the fast path. the open-source prototype arena allocator i wrote in http://canonical.org/~kragen/sw/dev3/kmregion.h (and .c) doesn't quite reach that level, but it takes about 1.9 nanoseconds per allocation (520 million allocations per second on one core), while glibc's malloc takes about 19 nanoseconds per allocation. more recent versions of glibc (on a faster cpu) have reduced that to 14 nanoseconds in that particular microbenchmark. details are at the link. your mileage may vary

chris wellons wrote an excellent article last year about his approach to arena allocation in c in https://nullprogram.com/blog/2023/09/27/

for those who aren't aware, garbage collectors for languages like js, c#, ocaml, or java are usually generational copying collectors for efficiency, which means they use what is effectively an arena allocator for most new allocations. often they reserve a cpu register or two for this, which my c implementation above can't

for c, the apache portable runtime 'apr' has a generic arena allocator in it called pools, which also supports destructors (the drop trait, in rust) and nested pools

the arena allocator in gcc is called 'obstacks' and got moved into libiberty so you can use it in other c programs. or hypothetically in rust




Hammer's arena allocator is, in fact, hot garbage. The entire reason it's there in the first place is that I didn't want to get into the weeds integrating a GC when building a proof of concept that PEGs could work at runtime in C with replaceable backends, so I didn't bother tuning it much beyond "4k sounds like a good number".

To me, the valuable part of hammer was the idea that you could use a friendly PEG-like API and end up with an LALR or GLR parser, and that, I believe, was a resounding sucess.

On the other hand, the generated AST turned out to be even less pleasant to work with than I imagined, and the only part of the implementation that's worth salvaging is the bit reader and writer routines.


haha! well, you're being harsher on your own work than i think it deserves

hammer works pretty well, i think, even the ast generation part. judicious application of semantic actions can give you whatever ast you want, if you're willing to stick to the peg backend anyway

i'm curious what you think about the prototype arena allocator i linked above; i wrote it originally as a candidate replacement for hammer's


obstacks are in glibc so they are very widely available https://sourceware.org/glibc/manual/latest/html_node/Obstack...


oh thanks, i didn't realize




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: