
Interesting. I'm not familiar with the library. I'm familiar with the general programming model of parsing a document into a number of tokens and then encoding document structure into offsets between tokens.
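For readers who haven't met that model, here is a minimal sketch of what "tokens as offsets" can look like. The field widths and the names `pack_token`, `unpack_token` and `token_text` are assumptions made up for illustration, not the layout of any particular library. Each token is a single integer under 64 bits that points back into the original buffer instead of copying text out of it.

```python
# Hypothetical field widths (they sum to 64); not taken from any real library.
OFFSET_BITS, LENGTH_BITS, DEPTH_BITS, TYPE_BITS = 32, 20, 8, 4


def pack_token(offset, length, depth, kind):
    """Pack four small fields into one integer that fits in 64 bits."""
    fields = ((offset, OFFSET_BITS), (length, LENGTH_BITS),
              (depth, DEPTH_BITS), (kind, TYPE_BITS))
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
    token = kind
    token = (token << DEPTH_BITS) | depth
    token = (token << LENGTH_BITS) | length
    token = (token << OFFSET_BITS) | offset
    return token


def unpack_token(token):
    """Inverse of pack_token: returns (offset, length, depth, kind)."""
    offset = token & ((1 << OFFSET_BITS) - 1)
    token >>= OFFSET_BITS
    length = token & ((1 << LENGTH_BITS) - 1)
    token >>= LENGTH_BITS
    depth = token & ((1 << DEPTH_BITS) - 1)
    token >>= DEPTH_BITS
    kind = token & ((1 << TYPE_BITS) - 1)
    return offset, length, depth, kind


def token_text(source, token):
    """Recover a token's text by slicing the original buffer; nothing is copied at parse time."""
    offset, length, _, _ = unpack_token(token)
    return source[offset:offset + length]
```

The point of the design is that the parsed document is just a flat array of these integers plus the untouched source buffer, which is also why anything that isn't a verbatim slice of the source has nowhere to live.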

It wouldn't have worked for Gumbo's purposes because:

1. Gumbo captures a lot more information than can fit in a 64-bit token. For example, Gumbo decodes entity references; this requires that the text be available in a fresh buffer, because any individual character of the decoded text might differ from the source text.

2. One of Gumbo's goals was to make it easy to write bindings in other languages. Most languages can bind to C structs easily, but binding to C function calls often requires a much more verbose preamble to set up args, return types, conversions, etc. (I was actually thinking of LLVM when I designed Gumbo's API, since the project it was initially for was, at the time, looking at LLVM as an embedded JIT. Binding to a C-formatted struct just requires defining a new type, but binding to a function call requires codegenning a lot of argument setup.)
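To make point 1 concrete, here is a small sketch of why decoded text can't be described as an (offset, length) pair into the source. It uses Python's standard `html.unescape` as a stand-in decoder, which is an assumption for illustration and not Gumbo's own code; `decode_span` is a hypothetical helper name.

```python
import html


def decode_span(source, offset, length):
    """Return (raw, decoded) for a span of the source text.

    `raw` is a plain slice, so an offset/length token could describe it.
    `decoded` has entity references replaced, so it needs its own buffer.
    """
    raw = source[offset:offset + length]
    decoded = html.unescape(raw)
    return raw, decoded


source = "<p>Tom &amp; Jerry &lt;3</p>"
raw, decoded = decode_span(source, 3, len("Tom &amp; Jerry &lt;3"))
# The decoded string is shorter than the raw slice and appears
# nowhere in the source, so no slice of the source can stand in for it.
```

Once the decoded string exists only in a separate buffer, the token has to carry a pointer to that buffer as well, and the fixed-width record no longer suffices.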




It's a shame; I wish vtd-xml were a more popular library, so I could read more about it rather than having to work it out myself. libxml2 seems to rule the roost. vtd-xml doesn't have a Debian package, and the C files gave a lot of warnings when I compiled them. I don't know enough about its performance to say whether the bold claims are true. The author says the Java version is a little faster than the C version, which strikes me as odd. I wonder whether he is basing that on long-running benchmarks.

I wasn't suggesting that you should have used the approach, I was wondering if you had used the approach. I've learned a little bit about the limits of this tokenising parser method, thanks for your reply.

EDIT: Badgar, thanks for your comment. I'll seek out your lecturer's work if I ever have to parse something. Shame your account seems to be banned or something. I looked through your history, and it was over saying you had trouble quitting weed or something stupid.

FWIW, my housemate kicked his weed addiction by cutting out triggers: the people, places and things that would encourage him to light one up. He had all the problems with it that you list. He had to stop drinking for a while to have sufficient willpower. He resumed drinking after successfully kicking weed.



