Urgh, logwatching actively pains me these days. So much waste, string-parsing what was originally binary data anyway.
I'm starting to think we need some convention where, instead of text logs, apps just emit a stream of protocol buffers plus a format string for the messages and data.
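Something roughly like this, as a sketch (proto3, field names made up for illustration):

    syntax = "proto3";

    // One record per log event: ship the format string (or an id that
    // maps to one) plus the raw typed arguments, and only render text
    // when a human actually wants to read it.
    message LogRecord {
      uint64 timestamp_us = 1;
      string format = 2;           // e.g. "accepted %s from %s:%d"
      repeated Value args = 3;
    }

    message Value {
      oneof kind {
        int64  i = 1;
        double d = 2;
        string s = 3;
        bytes  b = 4;
      }
    }

Consumers that only want to count or filter would never have to touch the strings at all.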
Which does make me wonder if you couldn't LD_PRELOAD something which replaced fprintf and the like...
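Untested sketch of what I mean: a shim that grabs the format string before handing off to the real vfprintf. It would miss __fprintf_chk and friends, and a real version would serialize the arguments rather than just dumping the format to stderr.

    /* fprintf_shim.c
     * gcc -shared -fPIC -o fprintf_shim.so fprintf_shim.c -ldl
     * LD_PRELOAD=./fprintf_shim.so some_program
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdarg.h>
    #include <stdio.h>

    int fprintf(FILE *stream, const char *format, ...)
    {
        static int (*real_vfprintf)(FILE *, const char *, va_list);
        if (!real_vfprintf)
            real_vfprintf = (int (*)(FILE *, const char *, va_list))
                dlsym(RTLD_NEXT, "vfprintf");

        /* Capture the raw format string; use fputs, not fprintf,
         * to avoid recursing into ourselves. */
        fputs("[shim] fmt: ", stderr);
        fputs(format, stderr);
        fputs("\n", stderr);

        /* Hand the call off to the real implementation. */
        va_list ap;
        va_start(ap, format);
        int ret = real_vfprintf(stream, format, ap);
        va_end(ap);
        return ret;
    }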
You can go the systemd route if you want, or just accept that parsing strings works, is mostly reliable, and really isn't as much overhead as people make it out to be. How many system profiles have identified it as a problem?
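(If you do go the systemd route, journald already takes structured fields, so there's no string-parsing round trip. A minimal example, with made-up custom field names, linked with -lsystemd:

    #include <syslog.h>
    #include <systemd/sd-journal.h>

    int main(void)
    {
        /* Each argument is a FIELD=value pair (printf-style); journald
         * indexes them, so later you can run e.g.
         *   journalctl MY_STATUS=503
         * without grepping any text. */
        sd_journal_send("MESSAGE=request failed",
                        "MY_STATUS=%d", 503,
                        "MY_CLIENT=%s", "192.0.2.10",
                        "PRIORITY=%i", LOG_ERR,
                        NULL);
        return 0;
    }

)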
Of course, if you can write the filter, it must be simple enough to script changes to iptables directly, based on recent packet statistics.
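Something in this direction, as a very rough sketch: the rule number, threshold, interval, and the reaction rule are all placeholders, it just scrapes iptables' own counters, and it needs root.

    /* rate_guard.c - poll the packet counter on one INPUT rule and
     * insert a DROP rule if it grows too fast between polls. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define RULE_NUM  1        /* which INPUT rule's counter to watch */
    #define THRESHOLD 10000UL  /* packets per interval before reacting */
    #define INTERVAL  10       /* seconds between polls */

    /* Scrape the pkts column for one rule from "iptables -nvxL INPUT";
     * line 1 is the chain header, line 2 the column header, so rule N
     * sits on line N+2. */
    static unsigned long rule_packets(int rulenum)
    {
        char cmd[128];
        snprintf(cmd, sizeof cmd,
                 "iptables -nvxL INPUT | awk 'NR==%d {print $1}'",
                 rulenum + 2);
        FILE *p = popen(cmd, "r");
        if (!p)
            return 0;
        unsigned long pkts = 0;
        if (fscanf(p, "%lu", &pkts) != 1)
            pkts = 0;
        pclose(p);
        return pkts;
    }

    int main(void)
    {
        unsigned long prev = rule_packets(RULE_NUM);
        for (;;) {
            sleep(INTERVAL);
            unsigned long now = rule_packets(RULE_NUM);
            if (now - prev > THRESHOLD) {
                /* Example reaction: stop accepting new connections on
                 * port 80 until someone looks at it. */
                system("iptables -I INPUT -p tcp --dport 80"
                       " -m conntrack --ctstate NEW -j DROP");
                fprintf(stderr, "threshold exceeded, DROP rule added\n");
                return 0;
            }
            prev = now;
        }
    }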
But oftentimes you don't even know what domain is being connected to at the network layer. You need output from the process holding the connection key. And you want very clear separation from that task...
You can also use ulogd and iptables accounting to count hits per IP or per subnet if you want something more lightweight. If you have a non-trivial system, you can probably produce a NetFlow stream from your router, which gives you the accounting without the performance penalty.
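The accounting part is just rules with no target, so they only bump counters. A stripped-down iptables-restore table to show the shape (the ACCEPT policies and the address/prefix are placeholders):

    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    :ACCT - [0:0]
    -A INPUT -j ACCT
    -A ACCT -s 192.0.2.0/24
    -A ACCT -s 198.51.100.7
    COMMIT

Then iptables -nvxL ACCT dumps the per-source packet and byte counters whenever you want them.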
I wonder how big of a performance penalty we're talking about here, in any case. Systems are so fast, and text processing is so cheap, that I doubt anyone is going to find that tailing the log and grepping out some strings impacts their system in any meaningful way. It would require tremendous request rates to be notable, and the bottleneck in such a scenario would be far, far up the stack (probably the database, followed by the web app, followed by the web server, followed by a hundred other things on the system, with tailing the log somewhere way down at the bottom).
I'm not opposed to other ways of solving it, but I think people in this thread are off by at least a few orders of magnitude about how many resources it takes to process a text log file.
That's neat, thanks for the fail2ban pointer. I wonder if there is something that would tail nginx/apache logs and compute aggregate request counts by response code or error code for monitoring/alerting. Not looking for log file collection itself, just the aggregates.
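If nothing turnkey exists, the aggregation half is small enough to hand-roll. A rough sketch in C, assuming common/combined log format and a feed like tail -F /var/log/nginx/access.log | ./statuscount (names are just examples):

    /* statuscount.c - read access-log lines on stdin and aggregate by
     * HTTP status. Assumes the status code is the first field after
     * the quoted request line. Prints running totals every 1000 lines
     * and on EOF. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX_STATUS 600

    static unsigned long counts[MAX_STATUS];

    static void dump(void)
    {
        for (int s = 0; s < MAX_STATUS; s++)
            if (counts[s])
                printf("%d %lu\n", s, counts[s]);
        printf("--\n");
        fflush(stdout);
    }

    int main(void)
    {
        char line[8192];
        unsigned long seen = 0;

        while (fgets(line, sizeof line, stdin)) {
            /* Find the end of the quoted request ("GET / HTTP/1.1"),
             * then parse the next token as the status code. */
            char *q = strchr(line, '"');
            if (!q)
                continue;
            q = strchr(q + 1, '"');
            if (!q)
                continue;
            int status = atoi(q + 1);
            if (status > 0 && status < MAX_STATUS)
                counts[status]++;
            if (++seen % 1000 == 0)
                dump();
        }
        dump();
        return 0;
    }

From there it's a short hop to shipping the per-status counts into whatever alerting you already have.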