The one about proxy_pass blindly forwarding syntactically malformed requests and silently failing to process the response is astonishing.
It doesn't appear to be documented. Looking through NginX documentation at http://nginx.org/en/docs/http/ngx_http_proxy_module.html I don't see anything (e.g. under proxy_hide_header) to say it's sometimes not applied, and there doesn't appear to be any option to prevent this blind forwarding.
I would never have expected the backend to receive invalid HTTP from NginX, but more importantly it's not uncommon for backends to send an extra header or two to tell NginX how to serve the response, with NginX removing those headers before serving.
How do you even handle this properly? Checking for valid HTTP might not be enough, as you need to exactly match whatever NginX's idea of valid is, rather than matching the HTTP spec.
$ telnet localhost 3000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /? XTTP/1.1
Host: 127.0.0.1
Connection: close

HTTP/1.1 400 Bad Request
Connection: close
Connection closed by foreign host.
Seems like the backend responded with a valid HTTP response even though my request was invalid. Of course that doesn't speak for all backend frameworks people would use, but it would occur to me that a well-designed backend would always speak proper HTTP even if the input isn't proper HTTP, and if the backend receives a bad HTTP request it should immediately send back a 400 (not 500) and never pass it on to the business logic. My brief test above seems to suggest that Express does indeed behave this way.
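For what it's worth, the "reject before business logic" idea can be sketched in a few lines of Python. This is only an illustration of a strict request-line check loosely based on the HTTP grammar; the names `is_valid_request_line` and `status_for` are made up here, and nothing about it is claimed to match what Express or NginX actually do internally.

```python
import re

# Method: HTTP "token" characters. Target: visible ASCII with no spaces
# (deliberately loose). Version: HTTP/<digit>.<digit>.
_METHOD = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_TARGET = r"[\x21-\x7e]+"
_VERSION = r"HTTP/[0-9]\.[0-9]"
_REQUEST_LINE = re.compile(rf"({_METHOD}) ({_TARGET}) ({_VERSION})")


def is_valid_request_line(line: str) -> bool:
    """Return True only for exactly 'METHOD SP target SP HTTP/x.y'."""
    return _REQUEST_LINE.fullmatch(line) is not None


def status_for(line: str) -> int:
    """Hypothetical helper: 400 for a malformed request line, 200 otherwise."""
    return 200 if is_valid_request_line(line) else 400
```

With this check, the `GET /? XTTP/1.1` line from the telnet session is rejected while `GET / HTTP/1.1` passes.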
Separately, configure your backends to not spit out secret info in production mode and you should not have to actually worry about this.
> Separately, configure your backends to not spit out secret info in production mode and you should not have to actually worry about this
It's not really about secrets. That's just the example from the article. (Although, that can happen if the backend is communicating authorisation-to-serve to a cache.)
It's fairly common to use X-Accel-Redirect to have the backend tell NginX to serve a file. With a malformed request, NginX will serve the header instead of the file, revealing your internal filesystem structure to the client. Often that structure has hashes and versions in the paths. Even without those, it can be quite revealing. That's not a badly designed backend; this feature is useful and intentional.
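To make that concrete, here is a minimal WSGI sketch of a backend that only emits X-Accel-Redirect when the request looks like real HTTP. It rests on assumptions: that the application server passes the raw protocol token through in `SERVER_PROTOCOL` (not verified for uWSGI or any other server), and the path `/protected/report.pdf` is hypothetical. It also does not guarantee a match with whatever NginX itself considers malformed.

```python
def app(environ, start_response):
    """WSGI sketch: hand the proxy an internal path only for well-formed requests."""
    protocol = environ.get("SERVER_PROTOCOL", "")
    if not protocol.startswith("HTTP/"):
        # Assumption: the server exposes the raw protocol token here.
        # Answer with a plain 400 and no internal headers at all.
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"bad request\n"]
    # Hypothetical internal location; the proxy would normally consume this
    # header and serve the file instead of passing the header to the client.
    start_response("200 OK", [("X-Accel-Redirect", "/protected/report.pdf")])
    return [b""]
```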
Having that pass through without being processed is a security fail of NginX, and they should at least document it. Better, provide an option to never blindly forward in these cases, or some "if" variable to let the admin configure what they want done.
After learning about this, I expect there are plenty of sites out there which will reveal their X-Accel-Redirect paths if you ask them like this, though I can't be bothered to go looking.
The article's tiny Python example using uWSGI does. Some backends do this. Sure, Node doesn't. There are hundreds of backend frameworks widely used behind NginX though. It's not necessarily even a bug: There are legitimate use cases for responding to a not-quite-HTTP request, which is why NginX itself does it.
> a well-designed backend would always speak proper HTTP
By the same argument, a well-designed HTTP proxy would always speak proper HTTP and reject improper HTTP.
Besides, strict filtering on the backend is not enough. Unless someone has audited these, your backend's carefully implemented proper HTTP request filtering might not be exactly the same as what NginX's filtering logic pattern-matches when deciding whether to forward blindly.
Sure, the "XTTP" example is obvious, but are you now confident there are no subtler variants of this surprising behaviour? Any mismatch of logic is a bit of wiggle room to sneak a valid HTTP request to your backend whose response is blindly copied back to the client.
That failure to process could not only fail to filter or handle backend headers: I wonder if it also fails to add response headers, as well as failing to add request headers the backend depends on. For example X-Forwarded-For, and internal routing headers used to pass NginX variables to the backend.
The real problem is that NginX going into "blind forwarding mode" isn't documented, isn't expected behaviour, and there doesn't appear to be any way to turn it off.
When you know about it, you can operate defensively on the backend by, as you say, being extra careful. But now I'm going to have to read the NginX source code to find out what kind of careful is required. And then check every backend, and block people wanting to add new ones until audited for this issue.
If you're using it to serve static files sure, but the moment you use something like proxy_pass you're basically instructing Nginx to hand control to your backend, which therefore should also be battle-tested. Nginx isn't going to protect you from SQL injection, routes that expose access to sensitive files, and so on.
My typical configuration is to rely on Nginx for HTTPS and use an unencrypted HTTP backend listening only on 127.0.0.1 for non-static pages. I rely on the battle-tested encryption logic of Nginx but as far as invalid HTTP requests go that's all on the backend.
> the moment you use something like proxy_pass you're basically instructing Nginx to hand control to your backend
Not really. That only applies for very simple proxy_pass configurations. NginX is used as more than just a simple proxy.
The point of NginX directives like proxy_hide_header and proxy_intercept_errors is that you've configured NginX to do some extra processing.
For example, I use proxy_intercept_errors so that backends can use certain error codes to instruct NginX to relay a request on to a different backend, after the first one has made a routing or versioning decision.
I also use it in conjunction with X-Retry and X-Accel-Redirect to instruct NginX what to do next: should it replay the original request, or perform a new one given to it by the previous backend in the sequence, and relay state.
This is a mechanism for backends to cooperate, and it also helps provide zero-downtime backend upgrades and routing to different hosts, invisible to the client.
Those aren't robustness failures by the backends. They aren't lack of battle-testing. The backends are fine, but you still get the wrong result sent to the client in cases where NginX doesn't bother to process the responses. The problem is that this behaviour of NginX was not documented and there doesn't appear to be a way to tell it to do something else.
It can be avoided with code in all the backends, but that code won't come from any specification. It can only be written after reading the NginX C source code to find under what conditions it does this, and have the backends go into a special mode when they detect it, responding with an error differently than they would to other bad request errors.
Now I'm wondering if the headers X-Real-IP, X-Forwarded-For and X-Forwarded-Proto are reliably filtered out of the client request before being set or not set by the proxy itself. This is not something a backend can do for itself, it's intrinsic to proxying.
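What I would expect from a proxy, written out as a Python sketch: throw away whatever the client sent for these headers, then set them from facts about the connection. This models the expectation only; it makes no claim about what NginX does, and `sanitize_forwarding_headers` is a made-up name.

```python
FORWARDING_HEADERS = {"x-real-ip", "x-forwarded-for", "x-forwarded-proto"}


def sanitize_forwarding_headers(client_headers, peer_ip, scheme):
    """Drop client-supplied forwarding headers, then set them from the connection."""
    cleaned = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in FORWARDING_HEADERS  # header names are case-insensitive
    }
    cleaned["X-Real-IP"] = peer_ip
    cleaned["X-Forwarded-For"] = peer_ip
    cleaned["X-Forwarded-Proto"] = scheme
    return cleaned
```

A client that sends its own `x-forwarded-for: 1.2.3.4` ends up with that value discarded and replaced by the address the proxy actually saw.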
Most application servers don't want to be the front for browser users because they don't handle a large number of 'slow requests' well. nginx will deal with lots of slow clients and only pass to the backend once it has the full request.
Security of an application is much more complex than just throwing nginx in front.
I realize that, at best, this is only tangentially related to security, but nginx's logging is quite frustrating. It'll log something that's completely out of your control (like invalid SSL requests) as a [crit].
You end up having patterns in your log ingestion to drop errors. Or, and this is the security concern, you start to ignore nginx errors.
This reminds me of a similar issue in Apache: it will log 500 “internal server error” when (if I recall correctly) clients close connection before SSL handshake is complete.
Quite frustrating to try to figure out where your application is crashing only to find out there’s no bug and it’s only someone running a port scan or something.
A common pit I fell into is the "inheritance" of add_header. If you use a single add_header at a lower level all add_headers at the higher levels are ignored for that server or location.
As some headers have security implications this is an easy way to shoot yourself in the foot.
Another security related point is the suppression of the server version. While nginx can omit the version number out-of-the-box, you unfortunately need an extension to remove the header completely.
You just have to use ngx_headers_more instead of the built-in headers module.
That add_header would get fixed (as a sibling comment states it should) is unlikely as it is intended to work that way:
> There could be several add_header directives. These directives are inherited from the previous configuration level if and only if there are no add_header directives defined on the current level.
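The quoted rule is easy to misread, so here is a small Python model of it. This is a sketch of the documented inheritance rule only, not nginx code; `effective_add_headers` and the header values are made up for illustration.

```python
def effective_add_headers(levels):
    """Model the quoted rule for nested configuration levels.

    levels: list of lists of (name, value), outermost level (http) first,
    innermost (location) last. A level inherits from the previous level
    if and only if it defines no add_header of its own.
    """
    effective = []
    for own in levels:
        if own:
            # Any add_header at this level discards everything inherited.
            effective = list(own)
    return effective


http_level = [
    ("X-Frame-Options", "DENY"),
    ("Strict-Transport-Security", "max-age=31536000"),
]
server_level = []  # defines nothing, so it inherits both headers
location_level = [("Cache-Control", "no-store")]

in_location = effective_add_headers([http_level, server_level, location_level])
```

Here `in_location` contains only the Cache-Control header: the two security headers from the outer level are gone, which is exactly the foot-gun described above.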
It really is too bad that the functionality provided by ngx_headers_more isn't available out of the box, since it makes it a pain to use nginx on distributions that don't package it.
It's intentional, and probably common to all "list" manipulating directives. It's the same behavior you get with fastcgi_param and similar, for example.
This is a good example of the trade off between pretty/terse/clever and safe/correct/maintainable. Nginx is well-respected mature software but it's hard to see these issues as anything other than a design blunder. The trailing-slash path traversal thing looks to me like a file system analogue of SQL injection.
I think d3.js is another example of this. It's obviously written with incredible skill but I could never get on with the ultra declarative and implicit style, it always felt like a fight.
These days there seems to be a trend towards a verbose, explicit style, e.g. Zig (no hidden control flow - compare to C++'s operator overload-fest) and Go.
I think it's more intentional than a blunder. I've seen products willfully resist security features if they don't "fit cleanly" into the "original design". Same for OSS tools where there are thousands of people asking for some useful feature, but it is rejected because it goes against some "design principles". (Or because there's no ROI for the company developing the feature.)
Well there is a benefit in being able to read code and know exactly what's going on. For me, reading words requires less mental effort than mentally parsing glyphs.
Early mathematicians used words instead of symbols:
> To determine two quantities from their difference and product, multiply the product by four, then add the square of the difference and take the square root. Write this result down in two slots. Increase the first slot by the difference and decrease the second by the difference. Cut each slot in half to obtain the values of the two quantities.
That was a lot of words for something that (I feel) is easily expressed with symbols. It took me a minute or two to figure out what you meant by "slot".
When it comes to math at least, I think the key is finding the balance between symbolic and lexical representation. Some ideas are far too "wordy" to not use symbols. Some ideas have far too much depth to just explain with symbols. But using words to explain symbols? Very good in my experience during my math degree.
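For comparison, the quoted recipe in symbolic form, as a short Python sketch. It assumes the two quantities are x and y with difference d = x - y (non-negative) and product p = x*y; the function name is made up.

```python
import math


def two_quantities(difference, product):
    """Recover x and y from d = x - y and p = x * y, following the quoted recipe."""
    # (x + y)^2 = (x - y)^2 + 4xy, so the square root below equals x + y.
    s = math.sqrt(4 * product + difference ** 2)
    # "Two slots": one increased by d, one decreased by d, each halved.
    return (s + difference) / 2, (s - difference) / 2


x, y = two_quantities(2, 15)
```

With a difference of 2 and a product of 15, the square root is 8, and the two halves come out as 5 and 3.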
A lot of these look like not-so-great design choices in the way nginx is configured and how it handles paths.
Sometimes the behavior that leads to security problems here may be desirable, but it probably shouldn't be the default.
For instance "location /api {" probably shouldn't match "/api../" by default. Instead it should be treated like a file system would. The "prefix" matching should be a different configuration option like "prefix /api {".
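The difference between the two behaviours is small enough to show in a few lines of Python. This is a sketch of the two matching rules as described in this comment, not of nginx's implementation; both function names are made up.

```python
def naive_prefix_match(prefix, uri):
    """Plain string-prefix matching: '/api' also matches '/api../secret'."""
    return uri.startswith(prefix)


def segment_prefix_match(prefix, uri):
    """File-system-like matching: the prefix must end on a path segment boundary."""
    prefix = prefix.rstrip("/")
    return uri == prefix or uri.startswith(prefix + "/")
```

`naive_prefix_match("/api", "/api../secret")` is true, while `segment_prefix_match` rejects that URI and still accepts `/api` and `/api/users`.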
It's one of the shortcomings that I've come to loathe in nginx's declarative configuration language (and also other software products - Apache httpd is just as guilty). Everything just looks so innocent - but the devil is often in the details. So much implied nuance that you have to keep in mind when reading and especially when writing it.
Sure it's expressive and also convenient (the latter at least as long as your configuration stays relatively simple), but something like varnish's imperative VCL that offers very little built-in magic sure is easier to reason about. I have come to consider that a feature.
That just seems like an even greater nightmare to me. Soon you would have to learn to read and understand a custom program in a Turing-complete language for each and every installation.
The proper solution is a DSL, just a better DSL. Or perhaps a DSL embedded in something like dhall <https://dhall-lang.org/>, but definitely not a general-purpose programming language.
Tcl is meant to be embedded for scripting and it can be made non-Turing complete by stripping away all but the bare minimum of commands. There is a built-in mechanism for sandboxing untrusted input. It can easily be a safe, bulletproof configuration language.
I was gonna consider including Dhall in my list, but then again nginx configuration has support for if statements.
And from my past experience with HCL, sometimes a proper embedded programming language is better than whatever crazy DSL some developers can envision (see for loops in HCL)
That link is interesting for two reasons. First because of the feature you mentioned, and second because there is apparently a community lua module (nginx module implementation must be interesting if you can just attach a deeply embedded language via a module)
Somehow this didn't pop up on our search results (a couple of years back) when we were dealing with some tricky redirection patterns we had to implement. Would have made things much easier.
And there are other questionable design choices in this project too. My pet peeve is the omission of an `.htaccess`-like mechanism. There is even a page they have dedicated to this [0], where instead of looking at their users' problems and finding a suitable solution (like only loading `.htaccess` every minute or when it changes), they argue that users don't actually have a use case where they want to allow some limited configuration to 3rd parties.
Someone even wrote a plugin that fixes that [1], but it is annoying (to say the least) that this is an option nginx developers say is "not needed" and "shouldn't be used".
Agree. Nginx is probably a thousand times more powerful than lighttpd, but while setting up php with fastcgi on the latter was straightforward and easy to understand, with nginx you need a convoluted mess of includes, location directives, setting variables, then handling 404 is broken, there are ten different tutorials on how to do this and nine of them are wrong or open security holes, and then you feel like a complete idiot.
If someone does /api/../whatever, does nginx normalize that away automatically? Otherwise it seems like you could just do the attack directly (yes, most clients won't let you make such a request, but that is easy to work around).
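As far as I can tell from the nginx documentation for `location`, matching is done against a normalized URI, with "." and ".." components resolved first. A rough Python analogue using `posixpath.normpath` (an approximation for illustration, not nginx's algorithm) shows why "/api/../whatever" and "/api../whatever" are different cases:

```python
import posixpath


def normalize(uri_path):
    """Resolve '.' and '..' path segments, roughly as a normalizing proxy would."""
    return posixpath.normpath(uri_path)


direct = normalize("/api/../whatever")   # '..' is its own segment and gets resolved
sneaky = normalize("/api../whatever")    # 'api..' is an ordinary segment name
```

`direct` becomes "/whatever", so it no longer starts with "/api" at all, whereas `sneaky` stays "/api../whatever" and still begins with the string "/api".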
Most servers/reverse proxies need tens of options to work more or less well. With Caddy, "correct" is the default, including the best SSL management system I've seen (so you don't even need certbot) and HTTPS by default. It's true that it has some things missing (rate limiting and weighted load balancing, to name a few) that you can do in Nginx/Traefik/etc, but it's 100% worth it. Caddy also has a great extension system, so those things could easily be created as extensions.
I downvoted you because changing technology doesn't inherently solve the concern of security. The other things you mentioned as strengths seem relatively equivalent to other web servers.
Using Caddy _does_ solve the problem of "Common Nginx misconfigurations that leave your web server open to attack". Currently Caddy's defaults are secure and you don't need to worry about fiddling with settings to keep it that way.
This confabulated configuration syntax is why I discontinued using NGINX.
This selection of NGINX came after a frustrated debugging session of Apache .htaccess as well.
Furthermore, unlike Apache's specific IP port assignment capability, I once had to jerry-rig a dynamic configuration to tie NGINX to just one dynamic IP port out of many.
Sorry, I've gone lighttpd and haven't looked back since.
I've been running Nginx for more than 10 years now... Is there anything "new"? I know about serverless, but anything else that makes the webserver part easier and safe?