
This essentially became the laser of death for us the other day and led to a cascading failure which eventually brought down our system. That's why I'm posting this.

Very few people know about this and it's really scary. I'm happy people are voting this up to raise awareness of it.

Potential workarounds: You might think disabling `proxy_next_upstream timeout` will do the trick, but that also disables retries on connection timeouts, which is not what you want!

Increasing `proxy_connect_timeout` is not an option either, because then you risk tying up too many connections in the nginx instance if the upstream server swallows SYN packets or whatnot.
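
For reference, the default that bites you here looks something like this (upstream names made up):

    upstream backend {
        server app1.example.com:8080;
        server app2.example.com:8080;
    }

    server {
        location / {
            proxy_pass http://backend;
            # "error timeout" is the default; a POST that merely
            # times out waiting for the response gets silently
            # resubmitted to the next server
            proxy_next_upstream error timeout;
            # and removing "timeout" here also kills retries when
            # the TCP connect itself times out
        }
    }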

The real workaround: Use haproxy. Seriously.




We've hit 3 problems with nginx:

1. Exactly this: we had mystery double trades from our clients, and it took us a long time to realise it was nginx assuming we had timed out and routing the request to the next server

2. It doesn't do health checks. When a server goes down, it will send 1 out of every 8 real requests to the down server to see if it responds. Having disabled resubmitting of requests to avoid the double-trade issue above, this means that when one of our servers is down, 1 out of every 8 requests gets an nginx proxy error, which is significant when you have multiple API calls on a single page (more on tuning this in the sketch after this list)

3. This isn't something I've personally hit so can't explain the nitty gritty, but it's something one of my coworkers dealt with: Outlook webmail does something weird where it opens a connection with a 1GB Content-Length, then sends data continually through that connection, sort of like a push notification hack. Nginx, instead of passing traffic straight through, will buffer the response until it reaches the content size given in the header (or until the connection is closed). I don't know if nginx is to blame for this one or not, but I do feel that when I send data through the proxy, it should go right through to the client, not be held at the proxy until more data is sent.
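
For what it's worth, the probe rate in point 2 comes from nginx's passive check parameters, which you can at least tune (the numbers here are just examples):

    upstream backend {
        # after max_fails failed attempts, the server is marked
        # down for fail_timeout; the "probes" that bring it back
        # are real client requests, hence the proxy errors
        server app1.example.com:8080 max_fails=3 fail_timeout=30s;
        server app2.example.com:8080 max_fails=3 fail_timeout=30s;
    }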

HAProxy also solved our issues and is now my go-to proxy. Data goes straight through, it has separate health checks, and it better adheres to HTTP standards. It can also be used for other network protocols which is a bonus.


Whilst Nginx doesn't do health checks, they are available in Nginx Plus. I appreciate that it is a paid-for product, but it has a number of strong features over and above the OSS version, and of course support (who are very responsive indeed).
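
From memory, the active checks in Plus look roughly like this (parameters here are just examples):

    upstream backend {
        zone backend 64k;  # shared memory zone the checks need
        server app1.example.com:8080;
        server app2.example.com:8080;
    }

    server {
        location / {
            proxy_pass http://backend;
            # NGINX Plus only: probe the upstream out of band
            # instead of sacrificing real client requests
            health_check interval=5 fails=2 passes=2;
        }
    }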


3. is the reason why NGinX is the recommended proxy in front of webapps with scarce parallelism (for example Ruby with Unicorn; see http://unicorn.bogomips.org/PHILOSOPHY.html for an explanation) when "slow clients" are to be expected. NGinX protects the webapp from having its workers blocked by slow clients, and Outlook Webmail seems to behave just like one. I don't know off the top of my head how to tune this behavior if one wants to avoid it, but this property is the main reason we use NGinX.


That's a… unique - and wrong - way of spelling the name. (Pet peeve of mine; people spell my app's name in all sorts of bizarre ways too.)


This sounds like something else. In the Outlook case, their servers seem to use the connection as a stream (which is actually valid, although not really supported by browsers outside of the event-stream class), where the server only writes little chunks of data at a time. But the server there is not hindered from writing by a slow client - it simply has no more data to write at that point in time.


Regarding 3, buffering behavior is highly configurable in nginx (e.g. proxy_request_buffering and proxy_buffering can be switched on/off).
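
A sketch of the relevant knobs (the location name is made up):

    location /stream/ {
        proxy_pass http://backend;
        # pass response bytes to the client as they arrive
        proxy_buffering off;
        # stream the request body to the upstream instead of
        # spooling it first (needs a recent nginx; see the
        # 1.8 note below)
        proxy_request_buffering off;
    }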


Wrote a post where I ran into (and fixed) this problem with streaming uploads through nginx: http://killtheradio.net/technology/nginx-returns-error-on-fi...


It's only as of 1.8 that you can disable buffering of incoming requests, though. That's just a few months old, iirc.


Nginx can also be used for other protocols; see the stream block.
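
E.g., plain TCP load balancing looks something like this (addresses made up):

    stream {
        upstream db {
            server 10.0.0.1:5432;
            server 10.0.0.2:5432;
        }
        server {
            listen 5432;
            proxy_pass db;  # raw TCP, no HTTP semantics
        }
    }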


I didn't know that. Thanks for the correction.


"but that will also disable connection timeout retry which is not what you want!"

Why is this not what you want? Are you using the reverse proxy as a load balancer to multiple servers? Otherwise, if it's a 1:1 proxy (for something like SSL termination), wouldn't it be acceptable for nginx to fail/time out when the server does?


> Are you using the reverse proxy as a load balancer to multiple servers?

That's extremely likely.

Somebody with an Nginx reverse proxy is probably using it for high availability, load balancing and static file caching, possibly all at the same time. That is what it is good for.


Using NGINX as a reverse proxy is an extremely common scenario. In fact that's what I currently run (with a support subscription), but I will be evaluating a move to HAProxy if their tech department does not provide a way to resolve this issue (which is actually a very big deal for me, and one I was not aware of).


This is Owen from NGINX. We have a workaround for this behavior (https://gist.github.com/thresheek/2fa6479ffb7aca710493), and are tracking a separate new feature request. Please submit a support ticket or send me an email, owen@nginx.com.


Thank you, I will be opening the ticket tomorrow. Regarding the gist you just posted, it seems this simply disables proxy_next_upstream for any and all non-idempotent requests.

However, what would really need to happen is to only disable proxy_next_upstream if data has been written to or read from the backend (preferably configurable per backend or location for either of those two options). Right now you basically lose the redundancy for non-idempotent requests and immediately return the error. Or maybe I read the configuration incorrectly.
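
If I'm reading it right, the shape of the workaround is roughly this (my paraphrase, not the gist verbatim):

    # decide per request method whether retrying is safe
    map $request_method $retryable {
        default 0;
        GET     1;
        HEAD    1;
        OPTIONS 1;
    }

    server {
        location / {
            # dispatch non-idempotent requests to a location
            # that never retries another upstream
            error_page 418 = @no_retry;
            if ($retryable = 0) { return 418; }
            proxy_pass http://backend;
            proxy_next_upstream error timeout;
        }
        location @no_retry {
            proxy_pass http://backend;
            proxy_next_upstream off;
        }
    }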


Yes, I have multiple servers behind nginx. It's very common.



No. Requests will still be retried.


What about `proxy_next_upstream off;`?


If I temporarily bring an upstream application down for an upgrade, I want nginx to retry the next upstream. This is a very common scenario when doing reverse proxying. Disabling next upstream breaks this.
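
For planned upgrades specifically, you can also sidestep retries entirely by marking the box as down and reloading, e.g.:

    upstream backend {
        server app1.example.com:8080 down;  # being upgraded
        server app2.example.com:8080;
    }

Then `nginx -s reload` picks up the change gracefully, without dropping in-flight connections.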



