vozlt's comments

vozlt · on May 15, 2017

First of all, you can use the "average of request processing times", which is not possible with other methods. Second, this is a port of the Tengine module used in Alibaba's modified nginx, and I think it is easier to use than other methods. Also, when the load is too high, you can control what happens and which error or page is shown. From a system engineering standpoint it is easier, more flexible, and more appropriate to use.
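To make the "average of request processing times" idea concrete, here is a minimal Python sketch of a guard that rejects requests once the moving average of recent processing times crosses a threshold. It is an illustration only, not the module's implementation: the class name `ResponseTimeGuard`, its parameters, and the 503 status are assumptions chosen for the example.

```python
from collections import deque


class ResponseTimeGuard:
    """Hypothetical guard: shed load when recent requests are slow on average."""

    def __init__(self, threshold_seconds, window):
        self.threshold_seconds = threshold_seconds
        # Keep only the most recent `window` processing times.
        self.samples = deque(maxlen=window)

    def record(self, duration_seconds):
        """Store the processing time of a finished request."""
        self.samples.append(duration_seconds)

    def average(self):
        """Simple arithmetic mean over the window (0.0 when empty)."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)

    def overloaded(self):
        return self.average() > self.threshold_seconds

    def status_for_next_request(self):
        """Return 503 while overloaded, otherwise 200 (assumed status codes)."""
        return 503 if self.overloaded() else 200


guard = ResponseTimeGuard(threshold_seconds=0.5, window=3)
for duration in (0.1, 0.2, 0.3):
    guard.record(duration)
status_when_fast = guard.status_for_next_request()

# Two slow requests push the window to [0.3, 1.0, 1.0].
for duration in (1.0, 1.0):
    guard.record(duration)
status_when_slow = guard.status_for_next_request()
```

With these numbers the guard lets traffic through while the window averages about 0.2 s and starts answering 503 once the average rises to roughly 0.77 s.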

viraptor · on May 15, 2017

I'm just trying to figure out tradeoffs here. The way I understand sysguard now, it will find out that the load is too high only after the load is already too high. You can set the threshold lower to start limiting the traffic early on, but usage spikes will still get you.
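The lag described here can be estimated with a small simulation. The sketch below assumes the guard reads a 1-minute load average that is an exponentially damped average sampled every 5 seconds, which is how Linux computes it; whether sysguard reads exactly this value is an assumption, and the function name and numbers are made up for illustration.

```python
import math

SAMPLE_INTERVAL = 5.0   # seconds between load-average updates (Linux behaviour)
WINDOW = 60.0           # time constant of the 1-minute load average
DECAY = math.exp(-SAMPLE_INTERVAL / WINDOW)


def seconds_until_detected(baseline, spike, threshold):
    """Seconds until the damped average exceeds `threshold` after the
    instantaneous load jumps from `baseline` to `spike`.

    Returns None when the spike can never push the average over the threshold.
    """
    if spike <= threshold:
        return None
    load_avg = baseline
    elapsed = 0.0
    while load_avg <= threshold:
        # One exponential-decay update per sample interval.
        load_avg = load_avg * DECAY + spike * (1.0 - DECAY)
        elapsed += SAMPLE_INTERVAL
    return elapsed


delay_high_threshold = seconds_until_detected(1.0, 20.0, 10.0)  # 40.0
delay_low_threshold = seconds_until_detected(1.0, 20.0, 5.0)    # 15.0
```

Under these assumptions, a jump from load 1 to load 20 takes about 40 seconds to trip a threshold of 10, and still about 15 seconds with the threshold lowered to 5, so a lower threshold shortens the blind window but does not remove it.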

I'm trying to compare it to an alternative solution that prevents the high load in the first place (by putting a limit on the CPU/mem shares). By limiting the app rather than nginx itself, you can still set the error page for the case where the backend is too busy (handle 502 Bad Gateway).

The nice part of doing that at system level is that you can prioritise other processes over nginx, but otherwise ensure the app can take 100% CPU if nothing else is interested in using it.
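The property described here (shares only bite under contention) can be shown with a toy model of weight-based CPU sharing. This is a simplified sketch, not the kernel scheduler: the function `allocate_cpu`, the process names, and the weights are hypothetical, and CPU is treated as a single divisible resource of size 1.0.

```python
def allocate_cpu(demands, weights, capacity=1.0):
    """Split `capacity` among processes in proportion to their weights,
    never giving a process more than it asks for. Unused share is
    redistributed to processes that still want more (work-conserving)."""
    allocation = {name: 0.0 for name in demands}
    remaining = capacity
    active = {name for name, demand in demands.items() if demand > 0}
    while active and remaining > 1e-12:
        total_weight = sum(weights[name] for name in active)
        satisfied = set()
        granted = 0.0
        for name in active:
            offer = remaining * weights[name] / total_weight
            need = demands[name] - allocation[name]
            take = min(offer, need)
            allocation[name] += take
            granted += take
            if need <= offer:
                satisfied.add(name)
        remaining -= granted
        if not satisfied:
            break  # everyone is limited by their share; nothing left to hand out
        active -= satisfied
    return allocation


weights = {"app": 100, "other": 900}

# Nothing else wants CPU: the low-priority app gets all of it.
idle_case = allocate_cpu({"app": 1.0, "other": 0.0}, weights)

# Both want a full CPU: the split follows the weights (10% / 90%).
busy_case = allocate_cpu({"app": 1.0, "other": 1.0}, weights)

# The other process needs only 30%: the app picks up the remaining 70%.
partial_case = allocate_cpu({"app": 1.0, "other": 0.3}, weights)
```

In this model the app is squeezed to 10% only while the higher-weight process actually competes for the CPU, and it recovers the spare capacity as soon as that demand drops.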

highclass · on May 15, 2017

Have you personally tried other methods when having an overload of server traffic? Do you have a link showing a real setup of a better method?

Of note, the module is a port from the nginx tuned by Alibaba, which is quite a credible source.

I have served over a billion pageviews/month, and cutting off requests at a certain load threshold with a custom error page for users is a good idea. Otherwise the users will just get a huge slowdown, or a backlog of requests will build up and prolong the server's recovery.

Also, as the author vozlt mentioned, you can use the "average of request processing times" and custom HTTP errors. It is also much easier to use than the methods you suggest.

zzzcpan · on May 15, 2017

Think about how you can limit the load while affecting as few users as possible. And I mean users who make requests and are quick to leave if latency feels high to them, not system users. The system doesn't make limiting decisions on a per-request basis, which is what it would need to do to be useful for this.