Ask HN: How to secure website for public launch
94 points by smarri 7 months ago | 58 comments
I'd like to launch a website project online. It has some functionality, e.g. users can input search parameters to find specific locations using the Google Maps API, and search based on device location. I'm concerned about my program not being robust enough from a security point of view, e.g. where user input occurs, I'm not confident I can manage malicious actors. How do you launch websites that use a front and back end securely? I'm confident in launching static websites and reasonably experienced in Python and PHP (but only for localhost personal project work); this is really me taking the big leap from localhost projects to live. Are there third-party services that give extra security, or is it a case of learning this side of software from first principles? Thanks in advance.



The biggest security hole is user input.

Just escape every input. For SQL, use query parameters to avoid SQL injection: https://datacadamia.com/data/type/relation/sql/parameter For HTML, escape entities in case somebody tries to inject HTML: https://datacadamia.com/web/html/entity

You got 99% of security holes patched.

All the best
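A minimal sketch of both ideas in Python, assuming an SQLite database; the `locations` table and the function names are made up for illustration. The query uses a `?` placeholder instead of string concatenation, and output goes through `html.escape` before it reaches the page.

```python
import html
import sqlite3


def find_locations(conn: sqlite3.Connection, name: str) -> list:
    # The ? placeholder passes `name` as data, so it is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM locations WHERE name = ?", (name,)
    ).fetchall()


def render_result(name: str) -> str:
    # html.escape converts <, >, & and quotes into entities before output.
    return f"<li>{html.escape(name)}</li>"


# Hypothetical in-memory table, only here to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO locations (name) VALUES (?)", ("Dublin",))

# A classic injection attempt is treated as a plain (non-matching) string.
rows_hostile = find_locations(conn, "Dublin' OR '1'='1")
rows_ok = find_locations(conn, "Dublin")
```

With string concatenation the hostile value would have matched every row; with a parameter it simply matches nothing.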


This is on point.

One other thing is to limit input frequency, only allow a certain amount of posts over some period of time. Enforce this on both the front and back-end.

A little more complex, you can set a lifetime limit per user by IP address, which won't stop a truly dedicated attacker but will definitely block most of the random web crawler scripts that find your site.
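As a rough illustration of the back-end half, here is a sliding-window limiter in plain Python. It is an in-memory sketch only: the class name and the choice of key (IP, session ID, etc.) are assumptions, and a real deployment with several processes would need shared storage.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client key -> recent timestamps

    def allow(self, client_key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_key]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
```

The `now` argument exists only so the behaviour can be tested without sleeping; normal callers would just call `limiter.allow(client_ip)`.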


IP limiting is not so simple anymore if you want to anticipate much traffic, since services like iCloud Private Relay or Cloudflare WARP forward requests through single regional IPs. You can still do some limiting, you just might bounce some of your legitimate visitors. But for that reason alone lifetime limiting seems like a bad idea to me.


Specifically, if using SQL, then use prepared statements or equivalent, and ensure that the SQL user account used for queries is restricted to doing just that.
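To illustrate the "restricted account" idea, here is a sketch using SQLite, which has no user accounts; opening the file with `mode=ro` stands in for what would be a SELECT-only account on MySQL or PostgreSQL. The file and table names are invented for the example.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file, created by a privileged "admin" connection.
db_path = os.path.join(tempfile.mkdtemp(), "app.db")
admin = sqlite3.connect(db_path)
admin.execute("CREATE TABLE locations (id INTEGER PRIMARY KEY, name TEXT)")
admin.execute("INSERT INTO locations (name) VALUES (?)", ("Dublin",))
admin.commit()
admin.close()

# The web-facing code gets a read-only handle: it can query but not modify.
reader = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
rows = reader.execute(
    "SELECT name FROM locations WHERE id = ?", (1,)
).fetchall()

try:
    reader.execute("DELETE FROM locations")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite refuses the write on a read-only connection.
    write_blocked = True
```

Even if an injection slipped through somewhere, a connection like `reader` could not drop or alter data.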


delete 99% of users, patch 99% of security holes


I feel I'm not supposed to upvote this as much as I have...


If you're not sure, put everything behind Cloudflare and don't expose your origin at all. Proxy the API requests through workers or at least shield them behind page rules.

Implement some basic rate limiting by IP so you don't get your Google Maps API DoSed. Block China and Russia altogether unless you expect customers from there (sadly, many bots & drive-by scans originate there). Sanitize your inputs, especially if you have any that will reach one of your own endpoints like for a database lookup (and look into SQL injection prevention in general). Use prepared statements in PHP if you use that for DB access. Not sure about Python.

You can read OWASP guidelines for other best practices (https://owasp.org/www-project-top-ten/) or ask ChatGPT to summarize. But realistically, Cloudflare takes care of so much that it seems a bit foolhardy to try to DIY it these days...

If it were me doing this, I wouldn't self-host anything at all, and just use managed services all the way down, including the DBs. A lot less maintenance that way, especially for solo devs. Lets you focus on the business logic instead of trying to reinvent your own secure little nano cloud. It takes serious manpower to stay on top of the latest vulnerabilities and zero-days, and IMO it's not worth spending your limited time on that when the big clouds can do it much more cheaply and much more thoroughly... it's a full-time job in and of itself, and you still probably wouldn't keep up with all the latest attacks =/

Of course you end up learning less this way because other professionals do all the hard work for you. But unless you want to become a backend/security professional yourself and REALLY dive deep into this stuff, I don't think just having basic security skills is going to do you much good anyway, since it takes all of 30 seconds to spin up a pre-hardened cloud host these days, usually for free, and they will have much more exhaustive coverage. Just my 2c.


> If you're not sure, put everything behind Cloudflare and don't expose your origin at all.

While I very much understand where this sentiment comes from, please do not blindly recommend CF.

Cloudflare seems invisible for gullible users, but is unusable and hostile to humans.

I use a VPN to a static IP at Hetzner, not to hide my true identity but because I have to: my current living situation has my (only available) internet running through a corporate network, packet filtering/logging and all. (Yes, this is all legal and I am grateful.)

But still, to retain any kind of privacy I have to use a VPN. My public IP is registered directly to my full name and has not changed in 3 years.

I also try and limit the amount of unnecessary data my browser transmits.

The combination of those has CF absolutely convinced that I am an existential threat to any site they so honorably "protect".

I simply cannot use ANY site with the default CF configuration. And no, I'm not the only one. This is a very common problem among humans that don't want to share everything about them to pass a human verification.

Cloudflare is the cancer of the Internet. They protect and enable criminals, only to sell the solution later. All the while, ridiculing humans into giving up more and more data in the name of safety. They trick users with promises of "Securing the connection" when they are just matching the browser to their database to sell another page visit. The internet used to be a free and open connection to the world; Cloudflare has built a panopticon of surveillance and false security, and they are being praised for it.


I've seen this criticism a lot here in HN, and it's something that's always concerned me.

There's a CloudFlare "essentially off" option that I've always hoped would make a difference when it comes to that. I always set it to that when setting websites up with CloudFlare, in hopes that it makes a difference.

That way I can still make use of the CDN and all the other features of CloudFlare without actually bugging visitors.

Would you be willing to load one of my websites[0] and let me know if "essentially off" actually works for you? If it does, great, but if it doesn't, I'll at least be aware that CF is a problem no matter what setting you put it at.

[0]: https://pocketarc.com


Unfortunately, it's not so much a "blind" suggestion, but a cost-benefit thing. For many sites/businesses, Cloudflare is a conscious decision because it's worth the tradeoff to the site owner, even if it incurs a few false positives (i.e. blocks a few legitimate, privacy-conscious users).

Yes, it sucks that a few (very few, in my experience) real users might get affected, but that's outweighed by the thousands if not millions of other useless bot visits that would otherwise get through. None of the small orgs I've worked for had the time or personnel to manually filter through those otherwise... it's just too much.

That said, whenever I could, I would happily tweak the rules or make an IP whitelist exception for real users who emailed us complaining they couldn't access something because of Cloudflare, but that only ever happened once or twice as far as I can remember.

--------------

> The combination of those has CF absolutely convinced that I am an existential threat to any site they so honorably "protect".

I'm sure you know this, but CF isn't a targeted attack towards you. Your usage patterns are just different from most people's, and unfortunately get treated as a bot's because they look like one. You can email the site operators to ask for an exception, or... frankly... probably they'd just rather lose you as a customer than deal with making the website work for you :(

If the alternative is to either spend 10x more time on securing the website manually, or loosen security such that it impacts all their other customers... it's usually a no-brainer to choose to just live with the false positives instead and deal with them on a case-by-case basis as they come in.

> Cloudflare is the cancer of the Internet. They protect and enable criminals, only to sell the solution later. All the while, ridiculing humans into giving up more and more data in the name of safety.

I think our experiences have been different in this regard. IMO they are one of the most useful service providers on the Web, not just for WAF stuff but also their excellent CDN and serverless products, etc. You don't have to agree, but they didn't become this big by offering a bad product... probably most site operators would value overall server stability more than an atypical user's needs.


Considering that you suggest managed services, what’s a good version of the cloudflare tunnels and access, with the same features except that it does not terminate the TLS?


That doesn't exist; for it to work, it has to terminate TLS. You can't do something like Access without decrypting the connection.


Presumably there are other ways to tunnel encrypted traffic (SSH, VPN protocols, etc.?) that don't necessarily rely on TLS?


Those typically require custom client-side code; for a website, you have the requirement that a web browser must be able to connect to it using TLS. Or maybe I'm not getting what your suggestion is - Access is supposed to intercept the connection and display a custom authentication page, with requests not reaching your server at all until they are actually authenticated.


Reverse proxies sometimes support TLS passthrough (see Traefik). If the reverse proxy puts an authentication page in front, sure, TLS passthrough may not work. But it could work if all you need from Cloudflare is its firewalls, restricting the IP range, hiding your IP, rate limiting, DDoS mitigation, not having to open ports on internal servers, etc.


CloudFlare has some TCP proxying features, but most of what you actually get from adopting CF (or any CDN) requires decrypting traffic because most of the features depend on understanding the HTTP requests.


Sorry, I don't know. That's not a use case I'm personally familiar with. Maybe others have ideas?


A few infrastructure things:

- Serve traffic behind a load balancer that has a WAF

- Network segregation for database (separate subnets)

- Make sure you serve https and have a cert that’s valid. Redirect to https if http

- Restrict ports on LB

At some point later:

- Endpoint monitoring and threat detection

- VPC flow logging

- Execute backend as non root

- Dependency / artifact scanning

- Cloud SIEM to monitor common actions taken

- Make sure there are no hard-coded creds, i.e. use role-based auth with cloud providers

- Reproducible infrastructure builds with infra as code

- Email domain protection

- Grab misspellings of domain names to prevent squatting


> Serve traffic behind a load balancer that has a WAF

whats the cheapest non aws way to do this? cloudflare on everything? is there another option? just trying to learn whats out there. WAF mainly protects against ddos right?


> is there another option? just trying to learn whats out there.

The cheapest option would be self-hosting something ModSecurity compatible: https://en.wikipedia.org/wiki/ModSecurity

You'd also need a ruleset, for which the OWASP one might be a starting point: https://owasp.org/www-project-modsecurity-core-rule-set/

There are also some projects like Coraza in the works: https://coraza.io/

Probably not what you're looking for if you want a cloud service to take care of everything for you, though, because of the question below (just thought that it might be useful to point out that anyone can run their own WAF if need be).

> WAF mainly protects against ddos right?

Typically WAF might be offered as a part of a larger cloud service that would include DDoS protection.

However, on its own, it is meant to filter traffic that might be harmful and attempt to exploit various vulnerabilities. A bit like an anti-virus in a sense, but for web requests. Some people argue that WAF solutions can be problematic because they encourage an attitude of "so what if there's a log4j vulnerability in the codebase, the WAF will take care of it" instead of making sure that the actual code is secure, but opinions are split there (defense in depth and the Swiss cheese model).


lovely answer, thanks so much! hope others learn too.


Is there some plug’n’play vendor that would offer most of these out of the box (like Netlify etc)?


GP has some good suggestions. For implementation of these, Cloudflare is a decent first stop - though they are a little hostile to non-vanilla internet users. Their free plan offers sensible security (SSL termination, WAF, DDoS protection) out of the box, with a straightforward UI.

Network segregation for database (separate subnets) would be a config option wherever you're hosting (AWS/Google Cloud/etc.) said database/application.


> Serve traffic behind a load balancer that has a WAF

What is a WAF?


Web Application Firewall.

It’s a feature of an LB that consolidates several actions: blocking ports except for the ones you are using, failing fast on paths that scrapers tend to check (e.g. /wp-admin, /phpMyAdmin) so they don’t end up in normal request logging, setting rate limits, fail-to-ban conditions, etc.
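The "fail fast on scanner paths" part can be sketched in a few lines of Python. This is not how any particular WAF is configured; the prefix list and function name are illustrative assumptions, and a real WAF would apply such a rule before the request ever reaches the application.

```python
# Paths that drive-by scanners commonly probe (illustrative, not exhaustive).
SCANNER_PREFIXES = (
    "/wp-admin",
    "/wp-login.php",
    "/phpmyadmin",
    "/.env",
    "/.git",
)


def should_reject(path: str) -> bool:
    """Return True if the request path looks like a known scanner probe."""
    # Compare case-insensitively, since scanners try many spellings.
    return path.lower().startswith(SCANNER_PREFIXES)
```

A request handler would call `should_reject(path)` first and return a bare 404 or 403 without logging or touching the database.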


Has anyone had any luck with Coraza on HAProxy?



Web application firewall


The fact that you're concerned in the first place is a great indicator that you have already avoided the gravest and most common mistake!


* keep your software & dependencies patched

* Disable SSH access for 'root' username.

* If you're using JWTs anywhere, don't mistake them for encryption - they are not.

* Check you're only serving over https.

* Don't trust your frontend. Any security check built into the frontend is near-useless, as the user can reprogram it however they like.

* Strings is how you let the baddies in, especially if you manipulate and concatenate them. Read about SQL injection to find out more.


> * If you're using JWTs anywhere, don't mistake them for encryption - they are not.

I would love to understand the assumptions that lead to this belief. It makes negative sense?


Creating a JWT takes a key or other secret as a parameter, and the resulting token is not superficially human-readable, so it's plausible that a developer might mistake it for encryption based on the high-level "shape" of the API.
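To make that concrete, here is a standard-library-only sketch that builds a token in the common signed form (HS256) and then reads the payload back without the secret. The payload fields and secret are made up; a real application would use a maintained JWT library rather than this hand-rolled signer.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped when the token was built.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret, signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


def read_payload_without_key(token: str) -> dict:
    # No secret needed: the middle segment is just base64url-encoded JSON.
    return json.loads(b64url_decode(token.split(".")[1]))


token = sign_hs256({"user": "alice", "password": "hunter2"}, b"server-secret")
leaked = read_payload_without_key(token)
```

The secret only protects the signature, i.e. it stops tampering; anyone who sees the token can read everything in `leaked`.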


Yep. A few years ago I used my credentials in some in-house back-office app that a coworker wrote. Later I was able to see my http calls in the company-wide logging system, with my username and password 'hidden' in a jwt.


Do you mean to say that you believe jwt payloads are encrypted? They are most certainly not.


What do you mean they’re base64 encrypted


Personally I wouldn't use base64 these days. Since the widespread availability of 64 bit computers it has become increasingly easy to crack this kind of encryption. I recommend using at least base256.


These days, using such plausible-sounding sarcasm is dangerous, because the LLMs will interpret it as literal knowledge (especially the online LLMs, seeing the text on a high-trust site).


Don't threaten me with a good time


encoding != encryption Totally different things


It's only secure if you ROT13 the base64


To make it quantum resistant you should rotate at least 26 times


I’m saying no person who writes JWT anything should have the belief that a JWT is by any means associated with encryption. It breaks my brain; nowhere in any spec are there these claims (pun intended).


Run an input fuzzer against it, such as a SQL injection fuzzer.

Never trust your frontend data ever!

Always assume the attacker can talk to your API.

Don't do auth or login yourself. Use known libs, workflows, etc.

Have unit tests to verify your endpoints need auth (a valid user, not just an anonymous user).
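A sketch of what such tests might look like with `unittest`. The `handle_request` function, the token store and the `/api/saved-searches` path are all hypothetical stand-ins for whatever framework and routes the real app uses.

```python
import unittest

VALID_TOKENS = {"token-alice": "alice"}  # hypothetical session store
PROTECTED_PATHS = ["/api/saved-searches"]


def handle_request(path, headers):
    """Toy request handler returning (status_code, body)."""
    if path == "/health":
        return 200, "ok"
    auth = headers.get("Authorization", "")
    token = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
    user = VALID_TOKENS.get(token)
    if user is None:
        return 401, "authentication required"
    if path in PROTECTED_PATHS:
        return 200, f"searches for {user}"
    return 404, "not found"


class AuthRequiredTests(unittest.TestCase):
    def test_anonymous_is_rejected(self):
        for path in PROTECTED_PATHS:
            self.assertEqual(handle_request(path, {})[0], 401)

    def test_bad_token_is_rejected(self):
        headers = {"Authorization": "Bearer not-a-real-token"}
        for path in PROTECTED_PATHS:
            self.assertEqual(handle_request(path, headers)[0], 401)

    def test_valid_user_is_accepted(self):
        headers = {"Authorization": "Bearer token-alice"}
        for path in PROTECTED_PATHS:
            self.assertEqual(handle_request(path, headers)[0], 200)


# Run the suite programmatically so importing this file does not exit.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AuthRequiredTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the protected paths in one list means a newly added endpoint gets the same checks as soon as it is added to that list.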


I have a similar background, and I just use a $5 a month Hostinger plan that manages the PHP server and I am quite happy with it. So it is just keeping my server side secrets in PHP in a way that makes sense.

Now, this does not allow me to, say, do Python web apps (that are not WASM). Hostinger has VPSes for quite cheap that I would consider if I needed that (if AWS Lambda does not make sense; I did a Python Google Cloud App Engine app for a month, https://crimede-coder.com/graphs/Dallas_Dashboard, and that was pricey, like $80 a month, whereas the WASM app is no additional cost). And I am sure there are other vendors that are similar (I am just happy with Hostinger).

So in terms of DDOS protection this is not so great, but that would not be a big deal to me. So site goes down, but I do not rack up a bill or anything.

For a Google Maps application, I not uncommonly see people put API keys in client-side JavaScript (not good!). It depends on what exactly you are doing, but if it is a public service that users do not sign into, just rate limiting the number of API queries with some PHP + database logic server side should not be too much work, and is a reasonable way to avoid racking up a surprise bill (I forget if Google allows you to limit the API keys directly or if they will just rack up bills).
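The comment suggests PHP; as a sketch of the same idea in Python, here is a global daily budget stored in SQLite. The table name, function names and limit are assumptions for illustration. The server would call `try_spend` before every paid upstream request and refuse once the day's budget is gone.

```python
import datetime
import sqlite3


def init_quota(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS api_usage ("
        "day TEXT PRIMARY KEY, calls INTEGER NOT NULL)"
    )


def try_spend(conn: sqlite3.Connection, daily_limit: int, today=None) -> bool:
    """Count one upstream call; return False once today's budget is used up."""
    day = (today or datetime.date.today()).isoformat()
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO api_usage (day, calls) VALUES (?, 0)",
            (day,),
        )
        cur = conn.execute(
            "UPDATE api_usage SET calls = calls + 1 "
            "WHERE day = ? AND calls < ?",
            (day, daily_limit),
        )
        # rowcount is 1 only if the counter was still below the limit.
        return cur.rowcount == 1


quota_db = sqlite3.connect(":memory:")
init_quota(quota_db)
```

Doing the check and the increment in a single `UPDATE` avoids a read-then-write race between concurrent requests.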


Read OWASP ASVS. That's a really good start; if you did everything yourself, you will find many issues even without further analysis of the code.


Get someone else to manage it for you while you learn. Security is an emergent property of every part of the stack, not a separate thing you can do after the fact. Get a handle on the fundamentals, too: fundamentals of TCP/IP, HTTP/S, etc.


If you are using the Google Maps JavaScript API: https://developers.google.com/maps/api-security-best-practic...



You've got some other good advice in other replies on specific steps to take around infrastructure and software/ dependencies.

To turn the question around a bit - you've identified the possible routes of compromise/exploitation (i.e. untrusted user input). The first step to me is a threat model. Work out the "so what" of why someone would try to attack you. What would be their end-goal?

To give you a few first steps, you've mentioned using a Google Maps API, and searching based on device location. Presumably your use of the Maps API is paid, and therefore a potential motivation for an attacker is financial, coming from your use of that API. Therefore treat that (i.e. the ability to make requests using your Google Maps API key) as a "target" in your architecture.

From there, you can do things to be a less attractive target (rate limiting, limiting results shown, if you are charged per-result). You could also review your code logic to ensure that only the right kind of request can be made (i.e. that someone modifying the client-side can't trick your server into accidentally making entirely arbitrary paid maps API requests on their behalf).

At this point, you'd also want to figure out your threat model between client-side and server-side, and what is exposed where. Assuming your server-side makes the API requests to Google Maps (and if not, then you're presumably exposing your API creds to clients, which is a "stop right here, don't proceed" moment!), what is allowed to flow from client to server? Can a rogue client get your server to make an arbitrary query? Would that let them use you as a free Google Maps API broker?
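One way to keep a rogue client from steering the server is to rebuild the upstream request from a small set of validated fields and ignore everything else. A sketch, where the parameter names and the allowed radii are assumptions and not the real Google Maps API:

```python
ALLOWED_RADII_M = {500, 1000, 5000}  # hypothetical fixed search radii


def build_upstream_params(client_params: dict) -> dict:
    """Build the upstream query from validated fields only."""
    try:
        lat = float(client_params["lat"])
        lng = float(client_params["lng"])
        radius = int(client_params.get("radius", 1000))
    except (KeyError, TypeError, ValueError):
        raise ValueError("missing or malformed search parameters") from None
    # NaN fails both range checks, so it is rejected here as well.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        raise ValueError("coordinates out of range")
    if radius not in ALLOWED_RADII_M:
        raise ValueError("unsupported radius")
    # Only these fields go upstream; the API key is added server side.
    return {"location": f"{lat:.5f},{lng:.5f}", "radius": radius}
```

Anything else the client sends (extra fields, a different endpoint, a huge radius) never reaches the paid API, which is what stops the server from becoming a free API broker.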

Understanding the trust architecture between front and back-end is (for me at least) key, as that's the primary exposed attack surface to an end user. Open up developer tools (F12), and look around requests as you use the app. Is there anything here that you wouldn't want users to see? As attackers will definitely see that, and it will be the first place they go to look at what you are doing!

Other ways to mitigate these risks could be (if you have sufficiently constrained input sets) to implement caching to avoid the ability to rack up queries against the underlying maps API. Given you are using arbitrary user locations, that's a bit harder. If users have a session or other short to medium term identifier, you could do some smart rate limiting to detect rampant scanning of large areas by making API requests that spoof the device location to be loads of different locations.
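One way to get a "sufficiently constrained input set" from arbitrary locations is to snap coordinates to a coarse grid before caching. A sketch using `functools.lru_cache`; `fetch_places_upstream` is a placeholder for the real paid API call, and the two-decimal grid (roughly 1 km of latitude) is an arbitrary choice.

```python
from functools import lru_cache

upstream_calls = {"count": 0}  # counter to show how often we really pay


def fetch_places_upstream(lat_cell: float, lng_cell: float) -> tuple:
    # Placeholder for the real (paid) maps API request.
    upstream_calls["count"] += 1
    return (f"places near {lat_cell:.2f},{lng_cell:.2f}",)


@lru_cache(maxsize=4096)
def _cached_lookup(lat_cell: float, lng_cell: float) -> tuple:
    return fetch_places_upstream(lat_cell, lng_cell)


def nearby_places(lat: float, lng: float) -> tuple:
    # Snap to a grid so nearby users share a single cached result.
    return _cached_lookup(round(lat, 2), round(lng, 2))
```

Spoofed locations spread over a large area would still miss the cache, so this complements per-session rate limiting and does not replace it.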

If you follow this process, and work out what's worth attacking (your infrastructure will be one target, even just to compromise the site, post spam, etc., as will things like any database you run), then you can begin to understand those risks, work out where there are attack vectors, and mitigate them methodically. The OWASP top 10 guidelines are a good starting point - often the biggest issues are design mistakes, omissions of basic measures, or flawed attempts to implement basic measures. If you have authenticated API endpoints, for example, is the authentication logic correct, and meaningful? Does it actually do what you intend, and is what you intend sufficient for the level of security you want to have?


This sounds an awful lot like analysis paralysis to me. My recommendation: just launch. You probably won’t run into any of the problems you’re worried about and, if you do, you can just patch them up.

As you launch more and spend more time dealing with users the default things to do will become second nature, and you’ll find yourself using the built in tools from AWS, DigitalOcean, CloudFlare, etc. rather than rolling them yourself.

But seriously, just launch. There’s a really good chance you won’t have any problems.


Please don't do "just launch" if you accept any user accounts or PII =/ You're responsible for their data and security too, and should at least exercise some minimum security... doesn't have to be the most secure site in the world but soooome bare effort would be appreciated.


I'm actually with traviswingo. Just launch. Chances are, no one will care about your website for quite a while. Unless you're building a product with a lot of hype around it, there's likely going to be a huge gap between launching and seeing any traffic at all. This gives you plenty of time to implement some of the great recommendations given here. But don't delay the launch for it.


There are a million bots scanning all of IPv4 space every minute looking for automated exploits. You don't need someone dedicated looking to get into trouble.


Please don't listen to this advice, this is precisely how services get pwned.


Thank you all for the advice and suggestions.


Use Django.


You can always roll back.


And get compromised again immediately after said roll-back?

Fast roll-back/restore is a useful feature for improving availability but does nothing to improve security.



