As long as you don't get into the algorithms it's pretty simple.

A hand-written HTTP parser is kind of like writing a "black-list" of what the server rejects. Since there's no algorithm backing it, the only thing you can do is list out all the things you can think of, or have run into, that are "wrong".
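To make the "list of rejected things" approach concrete, here is a minimal Python sketch. The patterns in `KNOWN_BAD` and the function name `handwritten_accepts` are made up for illustration and do not come from any real server.

```python
# Hypothetical hand-written check: reject only the inputs someone thought of.
# KNOWN_BAD is an illustrative list, not taken from any real server.
KNOWN_BAD = ("\x00", "..", "<script")


def handwritten_accepts(request_line: str) -> bool:
    """Accept any request line that contains no known-bad pattern."""
    return not any(bad in request_line for bad in KNOWN_BAD)


# A normal request passes, and a listed attack is caught.
print(handwritten_accepts("GET /index.html HTTP/1.1"))
print(handwritten_accepts("GET /../etc/passwd HTTP/1.1"))
# Garbage nobody listed still gets through, which is the weakness described above.
print(handwritten_accepts("G\x01T /\x7f HTTP/9.9"))
```

Anything not on the list is accepted by default, so the check is only as good as the author's imagination.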

Using a parser (well, a lexer really) generated with Ragel, I can make something that's relaxed, but it's more of a white-list of what it accepts. The algorithm explicitly says this particular set of characters in this grammar is all that I'll answer to.

If you then write the grammar so that it handles 99% of the requests you run into in the wild, you get the same relaxed quality as a hand written one, but it explicitly drops the 1% that are invalid or usually hacks.
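A rough sketch of the grammar-as-white-list idea, using Python's `re` module as a stand-in for Ragel (both describe regular languages). The grammar below is a heavily simplified assumption covering only the request line; it is not the Mongrel grammar, and the name `whitelist_parse` is hypothetical.

```python
import re

# Simplified, assumed grammar for a request line; real HTTP has more rules.
METHOD = r"[A-Z]{1,20}"
PATH = r"/[A-Za-z0-9\-._~%!$&'()*+,;=:@/]*"
QUERY = r"(?:\?[A-Za-z0-9\-._~%!$&'()*+,;=:@/?]*)?"
VERSION = r"HTTP/[0-9]\.[0-9]"
REQUEST_LINE = re.compile(rf"({METHOD}) ({PATH}{QUERY}) ({VERSION})")


def whitelist_parse(request_line: str):
    """Return (method, target, version), or None if the grammar does not match."""
    # fullmatch means every character must be accounted for by the grammar.
    match = REQUEST_LINE.fullmatch(request_line)
    return match.groups() if match else None


print(whitelist_parse("GET /index.html?x=1 HTTP/1.1"))
# Control characters are not in the grammar, so this is dropped without
# anyone having to list it as "bad".
print(whitelist_parse("G\x01T /\x7f HTTP/9.9"))
```

The grammar can be loosened to cover whatever shows up in the wild, but anything outside it is rejected by construction rather than by enumeration.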

This is also the same parser that powers a large number of web servers in multiple languages, so it's proven to work.




My mind is still boggling over how you make a simple HTTP request parser sound so complex.


Take a look at the Mongrel/Mongrel2 Ragel grammar and compare it to a hand-written request parser. You might be surprised which one is more complex.


Yeah I just did. Mongrel2 looks overly complex. But then it is C...



