isn't the mongrel/thin/unicorn http parser built using ragel (i.e. an existing parsing tool)?
Could you expand on why you think it's not possible to handle http with existing tools?
yes, but the generated state machine is able to recognize a regular language, so if the automaton can recognize http it would mean that it is a regular language, and existing parsing tools should be able to deal with it, or am I missing something?
regular expressions (à la cs) are equivalent to finite state machines. regular expressions can't count, so they can't match nested ()'s. ragel allows you to mix in code within the state machine, so it is actually far, far more powerful than a finite state machine.
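to make the "mix in code" point concrete, here is a minimal python sketch (not ragel, and not how mongrel does it): a single-state scanner with one counter updated by an action on each character. the function name `balanced` is made up for illustration. the counter is exactly the piece a pure finite state machine does not have.

```python
def balanced(text: str) -> bool:
    """Check that parentheses in `text` nest correctly.

    The loop is a one-state scanner; `depth` is extra state kept by
    embedded actions, which a plain finite automaton cannot express
    for unbounded nesting.
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1          # action on '(': count one level deeper
        elif ch == ")":
            depth -= 1          # action on ')': close one level
            if depth < 0:       # a ')' with no matching '('
                return False
    return depth == 0           # every '(' must have been closed
```

for example, `balanced("(()())")` is true while `balanced("(()")` and `balanced(")(")` are false.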
in http, handling things like 'Content-Length: %d' and then reading a body of that length is a little harder, as is handling chunked transfer encoding. http is quite fiendish in places :-)
these are 'data dependent': the parse stream contains information (i.e. the length) about the subsequent tokens. although some regexps have back references, these aren't commonplace in parsing tools/formalisms like LR, LL, LALR, CFGs or PEGs.
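a rough python sketch of what 'data dependent' means here, assuming a complete http/1.1 message is already in memory and ignoring trailers, multiple header values and error handling. `read_body` and `read_chunked` are hypothetical helper names, not any library's api. in both paths a number read from the stream decides how many bytes the next token spans.

```python
def read_chunked(data: bytes) -> bytes:
    """Decode a chunked body: each chunk is '<hex size>CRLF<data>CRLF'."""
    body = b""
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # The size line may carry ';extension' parameters; drop them.
        size = int(data[pos:eol].split(b";")[0], 16)
        pos = eol + 2
        if size == 0:               # a zero-size chunk ends the body
            return body
        body += data[pos:pos + size]
        pos += size + 2             # skip the chunk data and its CRLF


def read_body(raw: bytes) -> bytes:
    """Split headers from body, then let the headers decide the framing."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:    # [1:] skips the start line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    if headers.get(b"transfer-encoding", b"").lower() == b"chunked":
        return read_chunked(rest)
    length = int(headers.get(b"content-length", b"0"))
    return rest[:length]            # the header value sets the token length
```

the header lines themselves are easy to tokenize with a regular grammar; it is the step from "parsed an integer" to "consume exactly that many bytes" that falls outside it.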
my point is simply that a lot of the parsing drama of late has revolved around the simple task of parsing a language, rather than parsing network protocols.
there is a larger class of parsing problems that are still to be tackled.