Urgh, logwatching actively pains me these days. So much waste, string-parsing what was originally binary data anyway.
I'm starting to think we need some convention where, instead of text logs, apps just emit a stream of protocol buffers plus a format string for the messages and data.
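Something roughly like this, as a sketch (proto3, field names made up for illustration):

    syntax = "proto3";

    // One record per log event: ship the format string (or an id that
    // maps to one) plus the raw typed arguments, and only render text
    // when a human actually wants to read it.
    message LogRecord {
      uint64 timestamp_us = 1;
      string format = 2;           // e.g. "accepted %s from %s:%d"
      repeated Value args = 3;
    }

    message Value {
      oneof kind {
        int64  i = 1;
        double d = 2;
        string s = 3;
        bytes  b = 4;
      }
    }

Consumers that only want to count or filter would never have to touch the strings at all.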
Which does make me wonder if you couldn't LD_PRELOAD something which replaced fprintf and the like...
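Untested sketch of what I mean: a shim that grabs the format string before handing off to the real vfprintf. It would miss __fprintf_chk and friends, and a real version would serialize the arguments rather than just dumping the format to stderr.

    /* fprintf_shim.c
     * gcc -shared -fPIC -o fprintf_shim.so fprintf_shim.c -ldl
     * LD_PRELOAD=./fprintf_shim.so some_program
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdarg.h>
    #include <stdio.h>

    int fprintf(FILE *stream, const char *format, ...)
    {
        static int (*real_vfprintf)(FILE *, const char *, va_list);
        if (!real_vfprintf)
            real_vfprintf = (int (*)(FILE *, const char *, va_list))
                dlsym(RTLD_NEXT, "vfprintf");

        /* Capture the raw format string; use fputs, not fprintf,
         * to avoid recursing into ourselves. */
        fputs("[shim] fmt: ", stderr);
        fputs(format, stderr);
        fputs("\n", stderr);

        /* Hand the call off to the real implementation. */
        va_list ap;
        va_start(ap, format);
        int ret = real_vfprintf(stream, format, ap);
        va_end(ap);
        return ret;
    }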
You can go the systemd route if you want, or just accept that parsing strings works, is mostly reliable, and really isn't as much overhead as people make it out to be. How many system profiles have identified it as a problem?
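(If you do go the systemd route, journald already takes structured fields, so there's no string-parsing round trip. A minimal example, with made-up custom field names, linked with -lsystemd:

    #include <syslog.h>
    #include <systemd/sd-journal.h>

    int main(void)
    {
        /* Each argument is a FIELD=value pair (printf-style); journald
         * indexes them, so later you can run e.g.
         *   journalctl MY_STATUS=503
         * without grepping any text. */
        sd_journal_send("MESSAGE=request failed",
                        "MY_STATUS=%d", 503,
                        "MY_CLIENT=%s", "192.0.2.10",
                        "PRIORITY=%i", LOG_ERR,
                        NULL);
        return 0;
    }

)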
Of course, if you can write the filter, it must be simple enough to script changes to iptables directly, based on recent packet statistics.
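Something in this direction, as a very rough sketch: the rule number, threshold, interval, and the reaction rule are all placeholders, it just scrapes iptables' own counters, and it needs root.

    /* rate_guard.c - poll the packet counter on one INPUT rule and
     * insert a DROP rule if it grows too fast between polls. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define RULE_NUM  1        /* which INPUT rule's counter to watch */
    #define THRESHOLD 10000UL  /* packets per interval before reacting */
    #define INTERVAL  10       /* seconds between polls */

    /* Scrape the pkts column for one rule from "iptables -nvxL INPUT";
     * line 1 is the chain header, line 2 the column header, so rule N
     * sits on line N+2. */
    static unsigned long rule_packets(int rulenum)
    {
        char cmd[128];
        snprintf(cmd, sizeof cmd,
                 "iptables -nvxL INPUT | awk 'NR==%d {print $1}'",
                 rulenum + 2);
        FILE *p = popen(cmd, "r");
        if (!p)
            return 0;
        unsigned long pkts = 0;
        if (fscanf(p, "%lu", &pkts) != 1)
            pkts = 0;
        pclose(p);
        return pkts;
    }

    int main(void)
    {
        unsigned long prev = rule_packets(RULE_NUM);
        for (;;) {
            sleep(INTERVAL);
            unsigned long now = rule_packets(RULE_NUM);
            if (now - prev > THRESHOLD) {
                /* Example reaction: stop accepting new connections on
                 * port 80 until someone looks at it. */
                system("iptables -I INPUT -p tcp --dport 80"
                       " -m conntrack --ctstate NEW -j DROP");
                fprintf(stderr, "threshold exceeded, DROP rule added\n");
                return 0;
            }
            prev = now;
        }
    }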
But oftentimes you don't even know what domain is being connected to at the network layer. You need output from the process holding the connection key. And you want very clear separation from that task...
You can also use ulogd and iptables accounting to count hits per IP or per subnet if you want something more lightweight. If you have a non-trivial system, you can probably produce a NetFlow stream from your router, which gives you the accounting without the performance penalty.
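The accounting part is just rules with no target, so they only bump counters. A stripped-down iptables-restore table to show the shape (the ACCEPT policies and the address/prefix are placeholders):

    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    :ACCT - [0:0]
    -A INPUT -j ACCT
    -A ACCT -s 192.0.2.0/24
    -A ACCT -s 198.51.100.7
    COMMIT

Then iptables -nvxL ACCT dumps the per-source packet and byte counters whenever you want them.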
I wonder how big of a performance penalty we're talking about here, in any case. Systems are so fast, and text processing is so cheap, that I doubt anyone is going to find that tailing the log and grepping out some strings impacts their system in any meaningful way. It would require tremendous request rates to be notable, and the bottleneck in such a scenario would be far, far up the stack (probably the database, followed by the web app, followed by the web server, followed by a hundred other things on the system, with tailing the log somewhere way down at the bottom).
I'm not opposed to other ways of solving it, but I think people in this thread are off by at least a few orders of magnitude about how many resources it takes to process a text log file.
That's neat, thanks for the fail2ban pointer. I wonder if there is something that would tail nginx/apache logs and compute aggregate request counts by response code or error code for monitoring/alerting. Not looking for log file collection itself, just the aggregates.
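If nothing turnkey exists, the aggregation half is small enough to hand-roll. A rough sketch in C, assuming common/combined log format and a feed like tail -F /var/log/nginx/access.log | ./statuscount (names are just examples):

    /* statuscount.c - read access-log lines on stdin and aggregate by
     * HTTP status. Assumes the status code is the first field after
     * the quoted request line. Prints running totals every 1000 lines
     * and on EOF. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX_STATUS 600

    static unsigned long counts[MAX_STATUS];

    static void dump(void)
    {
        for (int s = 0; s < MAX_STATUS; s++)
            if (counts[s])
                printf("%d %lu\n", s, counts[s]);
        printf("--\n");
        fflush(stdout);
    }

    int main(void)
    {
        char line[8192];
        unsigned long seen = 0;

        while (fgets(line, sizeof line, stdin)) {
            /* Find the end of the quoted request ("GET / HTTP/1.1"),
             * then parse the next token as the status code. */
            char *q = strchr(line, '"');
            if (!q)
                continue;
            q = strchr(q + 1, '"');
            if (!q)
                continue;
            int status = atoi(q + 1);
            if (status > 0 && status < MAX_STATUS)
                counts[status]++;
            if (++seen % 1000 == 0)
                dump();
        }
        dump();
        return 0;
    }

From there it's a short hop to shipping the per-status counts into whatever alerting you already have.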