Maybe I missed it, but this just looks like a giant list of open source tools.
Docker, Containers, virtualization, password management, etc.
Not knocking OP, but a "guide for getting started" seems like it would be a howto for self hosting. This seems like a giant list of tools.
I only note this as I work at a place that self-hosts everything as we already have an on-prem data center. For example, we use openstack for our users to spin up VMs. If I did a guide to self hosting, I would talk briefly about the design choices that led me to openstack, and then about how to prepare for and install openstack. It seems like this guide is more like "you can use openstack for cloud". Needs more "guide".
Containers and deployment and all that jazz. This is the exact kind of cargo cult nonsense that drives people away from self hosting due to the apparent complexity. NONE of that is needed to self host a website.
Just install nginx from your repos, put <html><head><title>My first website</title></head><body><H1>Wow! Hi.</H1></body></html> into an index.html, and drop it in your www directory. Done. And done in a far more secure, far longer-lasting, far less fragile way than anything involving containers or "deployment".
You're not wrong per se, but that's not really what this guide is about. It's more about self hosting open source applications and services, not simple web pages. So I don't see how "just make an index.html and put it where your web server can serve it" makes much sense here.
As someone who enjoys self hosting my own applications like Nextcloud, Ghost, Matrix/Element and others, I think using Docker containers does actually make sense here. I'm also a big fan of the KISS approach, as you are saying. However, if you are trying to orchestrate multiple apps on a small home server or SBC, the installation and maintenance can become a huge headache very quickly. Especially if some of those apps have conflicting dependencies.
I'd still disagree. Want to share an image? Put it in ~/www/. Want to share a video? Put it in ~/www. Want to share some music? Put the .mp3 in ~/www. This core idea, that you can share files simply by using a file system, makes many of these extremely complex tools that require containerization due to their rapidly changing deps... overkill. An impediment to the actual goal of just showing people things. There's almost nothing simpler than using your mouse to drag a file into a folder or $ cp etc etc.
This entire thread reminds me of the inverse relationship between actually blogging and messing around with blogging setups, https://rakhim.org/honestly-undefined/19/
As for hosting your own Matrix homeserver, sure, that's such a heavy and over-complex protocol that changes rapidly and uses libs that change even faster. You'd want to containerize that... and probably give it its own computer instead of your desktop or an SBC, since it's going to take a lot of system resources. Or even better, don't use protocols that are so unstable you can't run them natively.
I disagree: Using docker makes it incredibly easy to tinker without real consequences or hassle. Want to try out - say - Heimdall? Create a docker-compose.yml file, an nginx reverse proxy config, and spin up the container. Don't like it? Remove said files and container. Thought about using a wiki for a project? Same thing.
For me, containers are just abstractions similar to packages, or OOP objects, or even apps. And as long as a service fits into this view, I think docker provides a great way to handle it. But, sure, you can overdo it and end up with more complexity, not less (e.g. Nginx Proxy Manager).
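To make the Heimdall example concrete, here's a minimal sketch of such a docker-compose.yml. The lscr.io/linuxserver/heimdall image, host port, and paths are assumptions on my part; point your reverse proxy at whatever host port you pick.

    services:
      heimdall:
        image: lscr.io/linuxserver/heimdall:latest   # assumed image; any Heimdall image works
        environment:
          - PUID=1000
          - PGID=1000
          - TZ=Etc/UTC
        volumes:
          - ./heimdall-config:/config                # app state lives here; delete the dir to fully undo
        ports:
          - "8080:80"                                # arbitrary host port for the reverse proxy to target
        restart: unless-stopped

Don't like it? docker compose down, remove the directory and the proxy config, and it's as if it never existed.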
I completely agree with you and will take it even further.
I don't even futz with nginx anymore. You don't need to do that with tunnels. This assumes you have created a tunnel with the required domain name. Very, very easy to set up.
1. Create container
2. Open Cloudflare ZeroTrust
3. Create a hostname for the new service or app; heim.example.com -> http://localhost:port
4. Save it.
5. Done. -> access the service at http://heim.example.com from anywhere in the world
6. For extra security, register the app in Zero Trust with an email verification policy using emails you trust, or use a token, or whatever auth you like.
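For anyone who prefers config files over the dashboard, the same hostname-to-localhost mapping can also be expressed in cloudflared's own config. A sketch; the tunnel name, credentials path and port are placeholders:

    # ~/.cloudflared/config.yml
    tunnel: home
    credentials-file: /home/user/.cloudflared/<tunnel-id>.json
    ingress:
      - hostname: heim.example.com
        service: http://localhost:8080
      - service: http_status:404   # catch-all for unmatched hostnames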
Unfortunately a lot of self-hosted services seem to be heading in the absurd direction of being entirely docker based, often not even having a means of installing the thing normally. It's extremely annoying.
If I want to run a bunch of services like apprise, n8n, strapi, jupyter notebooks, nocodb, appflowy and more... why wouldn't I run these in containers? Using Docker and the tooling around it makes it super easy. I don't spend weeks configuring every tiny detail just to get the darn things installed.
What you find annoying is empowering tons of people like me to run a lot of software. I can try lots of software I wouldn't have bothered with, purely due to the ease of setup. I don't bork my host machine when I make mistakes. I just delete the container and start over, with a few adjusted settings before the build or compose step.
1337 hackers can run circles around me for sure, but I have a freaking datacenter in 3 computers at my house. All of them are internet facing with Cloudflare tunnels pointed at the services I want to share or use away from home. It has never been this good.
I want to be able to consume and run code provided by other people on hardware/networks that I own.
In which case - Docker/Containers/K8s(or my preference, MicroK8s) are a fucking godsend compared to 10 years ago.
I host about 25 services, I have all the config in a single repo, and can rebuild the whole fucking thing in about 10 minutes.
Adding a new machine to take load is as easy as installing ubuntu server, adding microK8s, and running a SINGLE freaking line on the cli. Boom - done. I do this basically each time we get a new machine... the old one goes to live in the server farm.
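For reference, that single line is roughly MicroK8s' join flow; a sketch, with the IP and token obviously placeholders and the exact output varying by version:

    # on an existing node: generate a join token
    microk8s add-node
    # it prints a command along the lines of:
    #   microk8s join 192.168.1.50:25000/<token>

    # on the new machine:
    sudo snap install microk8s --classic
    microk8s join 192.168.1.50:25000/<token>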
All the backups? Easy, since the storage is my NAS, mounted in microK8s, and I can just run scheduled backups there for literally every new service, without any additional config as I add them.
It's incredible. I have done the whole "SSH into each machine and tediously configure it to run a specific application, configure my router, make sure the config files are in the right place, configure backups, worry about HDD lifespan, etc" bullshit for long enough to know that I don't EVER want to go back to that.
The people that keep talking shit about containers are still incredibly focused on the use-case of "I host my own blog" like that's all that self-hosting is. But it's not - I self host open source applications as a complete replacement for shitty, ad-driven, recurring subscription fee based SaaS companies.
Meal planning (mealie),
Calorie tracking (calorietracker),
Video streaming (jellyfin),
Google Docs/Notion/etc (bookstack/nextcloud),
Authentication (keycloak),
File sharing/storage (seafile),
Digital Library (calibre),
Home automation (openhab),
Asset tracking (snipeit),
todo lists (Taiga),
scheduling (cal)
etc...
Containers aren't there to help you host your blog. They're there to let me host your application. Configured quickly and simply.
If there's company making money through a recurring subscription based website... There is someone who has made a decent clone that I can self-host.
And these tools make that process much easier than it used to be.
Meal planning? Text file. Calorie tracking? Text file. Video streaming? Share video file with static webserver. Google Docs? Share documents with static webserver. Authentication? I don't understand what this is or why it is needed. File sharing/storage? Share with a static webserver. Digital library? Files and folders, if you want to share, share with static webserver. Home automation? I don't know what this is. Todo lists? Text file. Scheduling? Text file.
You'd be surprised what you can do with simple files and a single static webserver.
Container images that have access to an NFS mount from my NAS.
You need to be careful with your mount options here, since soft mounts can cause corruption in some cases and performance is not fantastic (although NFS v4.1 is better than NFS v3) - but overall it works and keeps backup strategies simple since everything is on the NAS.
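Concretely, the conservative options look something like this in /etc/fstab (hostname and paths are made up; a hard mount blocks instead of silently corrupting when the network hiccups):

    # /etc/fstab
    nas.local:/volume1/appdata  /mnt/appdata  nfs4  hard,nfsvers=4.1,noatime  0  0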
Sadly - it's fairly slow (sqlite in particular is painfully slow when under lots of load and needs care to ensure only one machine is writing to it at a time) but right now I'm avoiding more complex setups since it's not normally a problem with just me+family using things.
Long term - I've been considering moving to iSCSI, or something more robust, but it just hasn't been worth the trouble yet.
Did make me like sqlite a whole lot less though - I opt for MariaDB/Postgres on any service that supports a remote db - they seem to do just fine. 3 years in on this setup and no corruption outside of sqlite on a soft mount (which was solved by reverting to my last backups).
As I mentioned in another comment, docker containers are fine until you have even a slightly non-standard setup or use case, like the not-so-uncommon setup of having the machine connected to a VPN, in which case docker will proceed to not work at all.
What makes running these things easy isn't the container but that the software is packaged at all and would have the same effect if it were done through a conventional package manager. So of course docker seems like an upgrade when the only other thing they're offering is a loose distribution of files you have to manually copy in and configure.
As a relative beginner to Docker at home, the biggest issue I've had is that there is a presumption that everything is HTTP. I wanted to run two apps in two Docker containers that both wanted UDP on the same port and couldn't do it. Docker lets you use multiple IPs but apparently you can only use a single port for any app on a single host.
I think the problem is essentially one of scaling. When folks talk about scaling they usually talk about "being able to do more"; but this often comes at the expense of simpler use cases. Most business software I've worked on would be horrible to self-host: maybe 3 or 4 services, if not more, plus PostgreSQL, ElasticSearch, Redis, maybe some queue system, etc. In short: "scaling up" often comes at the expense of "scaling down".
The current model for a lot of self-hosted software is "make money with a SaaS offering", which is a fine model IMO, but now you have to cater to two different scales: the small "downscale" of the self-hosted user, and the larger "upscale" one of your SaaS offering.
I actually work on one of those "open source with SaaS offering" things, and it's a real challenge to unify these two use cases because the involved tools are quite different, and I don't want to implement everything twice.
Docker can offer a solution to abstract the complexities that are required, although I agree it's not a good solution to 100% rely on it (which is why I don't do it).
Yes, I understand that there are tradeoffs there, especially for a lot of federated services, which even without SaaS have to be able to scale well.
My complaints mainly stem from how Docker is often kind of finicky. E.g. in a previous setup of my server, I had to put everything through a VPN, but Docker didn't play well with a VPN running on the main system, so I had to limit myself to things I could install without Docker until I migrated to a new install.
Another issue I ran into with migrating servers was restoring a Postgres db dump to docker, which similarly didn't have much documentation, so for that I had to resort to a traditional setup. Which made me pretty happy that they at least support traditional setups, even if they lean more towards Docker.
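For what it's worth, the restore itself can be done by piping the dump into the container; a sketch, where the container name, user and database are placeholders:

    # plain-SQL dump
    docker exec -i my-postgres psql -U myuser -d mydb < dump.sql
    # custom-format dump (made with pg_dump -Fc)
    docker exec -i my-postgres pg_restore -U myuser -d mydb < dump.dump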
I know this is HN, but we're talking about self-hosting. We aren't talking about bringing work home to show we can do it to potential employers so we can get hired somewhere using those tools makes sense. Scaling is simply not a problem at all in this context.
Docker and especially docker compose finally made it easy for me to start hosting many services on my local Synology. I want to write some yaml, run a command, and get a result.
The last thing I want is to fight different versions of python, js or OS differences.
If your OS fails to package nginx properly, it's probably time to find a new package manager/OS.
On a more serious note though, both technologies have their uses. Docker can be nice for ad-hoc hardware provisioning without needing to think about much, but I think the GP's comment is still relevant. For a lot of users who are completely unfamiliar with bash/Linux, Docker will be impossible to grok. For the average HN engineer sitting at their Mac, it makes a lot more sense to get nginx/apache from Brew and tinker with it natively.
You clearly don't understand what the poster is saying.
I'm not fucking interested in "tinkering" with nginx/apache to host my own software, I'm interested in herding dozens of open source applications that replace paid/shitty ad driven SaaS companies.
The contract with docker is incredibly simple - you give me the path you're saving data to, a couple of ports to point at, and some ENV vars to provide config - I do the rest.
Boom - suddenly I can easily host literally hundreds of dollars per month worth of recurring subscription SaaS products using my own hardware and networks. All without having to see ads, or risk external data breaches, or deal with user tracking.
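That contract, spelled out; the image name, paths and ports here are just placeholders:

    # -v: the path it saves data to, -p: the ports to point at, -e: config via ENV vars
    docker run -d \
      --name someapp \
      -v /srv/someapp/data:/data \
      -p 8080:80 \
      -e TZ=Etc/UTC \
      vendor/someapp:latest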
You clearly don't understand what I'm saying. I like Docker, and use it every day. However, it's not the tool you use to learn how self-hosting works (or how your containerized software functions). Running the Dockerized version of a program turns it into a pushbutton component, which is both convenient and abstracts over anything interesting or worth learning. Docker is a compositional layer that only works effectively when you understand what it's doing inside the container.
> Running the Dockerized version of a program turns it into a pushbutton component
Which is the point, when the goal is to self-host someone else's application.
If you're just interested in learning how networks work, or what http is, or how to write software - then sure, go tinker with apache or nginx.
If you're trying to self host, containers are great. They do exactly what you've said - applications become (mostly) pushbutton components where the contract is simple and clear (data/ports/env).
And yes - there certainly is a bit of a learning curve for containers, but I'd much rather learn containers once than have to learn the ins & outs of the 25 apps I self host. For the most part, I neither want nor care to learn how they were put together. There simply isn't enough time, and the value is low.
Again - the goal here is NOT to host a blog/website I've made. It's to self-host applications other folks have put together, and that seems to be where this reference can be valuable.
Why bother with nginx at all tho? Why take the chance of borking your ENTIRE system with software you aren't intimately familiar with? Dockerize that shit and move on. If you want to get into the weeds with some bare metal installs, have at it. There just isn't enough time for me to learn every sub-system in the entire software stack to be productive. I say, pick your poison and get after it!
For one, because Docker is a horrible experience for anyone that isn't running Linux. Bare metal installs are much better for tinkering and "getting started" than Docker is.
Yup. I remember messing around with Redhat when I was 13. I installed Apache, edited the default index.html in htdocs then went to another computer, put in my IP and it worked! I was amazed, here I was running my own website.
25+ years later and the same steps are still possible to get the same result. I very much doubt k8s and Docker and whatever fancy new technology of the month is being pushed will have the same longevity. Simple technology rules.
He is entitled to his opinion, and you may be breaking the rules of HN by starting a flame war. Our technology is built upon the foundations laid by our predecessors. I am younger too, but I do not promote the ageism that is so rampant in our industry.
His opinion is misinformed and ignorant of the reality of hosting open source applications.
If you want to host your own blog... sure, run a local copy of nginx or apache.
That's not what this conversation is about. We're not hosting our own websites - we're replacing crappy ad driven/subscription SaaS companies with open source applications, running on our own hardware/networks.
The use case there for containers is both obvious and compelling, and his comment shows that he has zero understanding of the space. So he's making an uninformed rant while talking about the "good ol' days" when everything was SO simple.
Will Docker, specifically, be around in 20 years? Who gives a shit.
Will the concept of an isolated & preconfigured runtime environment (containers)? Fuck yes it will.
> The use case there for containers is both obvious and compelling, and his comment shows that he has zero understanding of the space.
I think this is more reflective of you, than him.
The technologies he's discussing drive (often in a poor, bloated way) most containers, and are the underlying components. Learning to self host with containers is fine until you want to customise beyond the ENV variables, or fix a bug and not have to wait on the maintainer, etc, etc.
Containerisation _is_ a recent tech, and it sticking around in the future _is_ debatable. I'm old enough to recall Red Hat (not EL)/Fedora kickstart being the thing that will never be replaced.
The thing that likely won't go away is the underlying infrastructure (nginx, apache, etc) that are worth knowing if you self host.
You're making a lot of assumptions here. Of course containers will be a thing, I'm not arguing against that and I run plenty myself. My point was that for simple tasks, you don't need to immediately jump into a technology you might not necessarily understand or won't be around in the future. Start with the basics and understand the core concepts.
The selfhosted subreddit [1] and its wiki [2] (which happens to be truer to a "getting started" concept than the HN link) are all you'll ever need for the "self" approach.
I don't think I'd recommend that subreddit to anyone who wants to get into self hosting. They're quite gatekeepery unless you're doing things the way they want. Just look up any email self hosting thread there to see what I mean.
Been there for a long while, and worked through self-hosting with a lot of cooperative effort. Whatever others recommend there - whether I or anyone else likes it - is, in the end, mine to decide whether to implement. All in all, starting from their wiki is a far better choice than the original HN article.
Hi, just asking around: I'm trying to self host on a basic Ubuntu server. I'll run some websites (behind nginx), databases (docker), cron jobs and other things on there. Normally I don't really care about DDoS, however since this is going to run on my home network (on its own VLAN, separate from my main VLAN) and I have a static IP, I'm kinda scared. I usually hosted my stuff on cloud with DDoS protection included. What are some security precautions I can take to prevent this from happening?
A downside to self hosting is I don’t think you can realistically defend against a ddos without network help from a third party. It’s been a while since I thought about it but iirc you need upstream routers to cut off the traffic for you because by the time the traffic gets to your router it’s too late.
That’s what made ddos so effective when it first came on the scene. Stopping or mitigating one took a lot of time, phone calls, and coordination between service providers.
Edit: another thing, a large scale ddos on a residential IP address will for sure get the attention of your ISP. If you’re running something that may get that kind of attention check your TOS :)
Safest move would be to not actually expose your server from your public IP, but instead tunnel traffic from your home server through a cheap, public-facing VPS instance acting as a gateway host.
To fully automate it, you can use a premade tool like autossh, or just replicate it using standard ssh with the keepalive options + a custom systemd unit file.
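A sketch of the plain-ssh variant as a systemd unit; the VPS address, ports and key path are placeholders, and the VPS needs GatewayPorts enabled if you want the forwarded port public:

    # /etc/systemd/system/reverse-tunnel.service
    [Unit]
    Description=Persistent reverse SSH tunnel to the public VPS
    After=network-online.target

    [Service]
    ExecStart=/usr/bin/ssh -N \
      -R 8080:localhost:80 \
      -o ServerAliveInterval=30 -o ServerAliveCountMax=3 \
      -o ExitOnForwardFailure=yes \
      -i /home/user/.ssh/tunnel_key user@vps.example.com
    Restart=always
    RestartSec=10

    [Install]
    WantedBy=multi-user.target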
So let's say you do get DDoS'd, what happens? Your internet connection is virtually unusable because it's saturated. You call your ISP and they will mitigate the attack, most likely by just giving you a new static IP.
It's not great, but it's not the end of the world either, and doesn't actually incur any risk to your data or home network. Hell, if you want to learn mitigating this yourself, open Wireshark on your WAN interface, get the attacking IPs and send out abuse reports to their respective hosts. You can clear most script-kiddie "booter" style attacks like this in a day, dismantling their botnet in the process.
The other comments you received rely on 3rd parties and/or tunneling traffic via others, which may or may not impact performance/latency negatively (depending on what you're trying to achieve).
If you want to self-host a solution, a few things to keep in mind:
- Rate-limit "heavy" (RAM, CPU, I/O) processes that can be externally triggered via HTTP calls or similar (if you have an HTTP endpoint for regenerating a cache or something like that, make sure it can only be called at most once per hour, and so on - see the nginx sketch after this list)
- Set up fail2ban, and things like mod_evasive if you're using Apache (I'm sure Caddy/nginx have similar solutions)
- Only expose what you really want to with iptables
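For the rate-limiting point, here's a rough sketch of what that looks like in nginx; the zone name, path and upstream are placeholders, and since nginx only supports r/s and r/m rates, "once per hour" has to be approximated or enforced in the app itself:

    # in the http{} context
    limit_req_zone $binary_remote_addr zone=heavy:10m rate=1r/m;

    server {
        server_name app.example.com;

        location /regenerate-cache {
            limit_req zone=heavy burst=1 nodelay;
            proxy_pass http://127.0.0.1:8080;
        }
    }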
Ultimately, unless you're hosting something really popular or controversial, you don't really need to be scared about being ddosed. People don't spend resources just ddosing random stuff without a reason for it, and unless it's popular/controversial, you just won't get hit by anything but random port scans/vulnerability scans.
I hosted web sites at my house on the cheapest Comcast service from about 2002 to 2016 and never once had a problem. It's possible things have changed, but a ddos attack does cost money and I think it's mostly the large or controversial sites that get targeted.
I would still be hosting at home but where I live it's nearly impossible to get a static, routable IP.
Not sure if this is the best option today, but in a previous self hosting situation I relied on Cloudflare's DDoS protection. We survived several attacks thanks to them. One attack was 3 days long, and Cloudflare worked fine the entire time; our regular customers were lightly impacted but could still work.
I guess it depends on where you draw the limit, but if you tunnel all your traffic via an external proxy to your "self-hosted" platform, how much can you really call that solution self-hosted?
Did you try anything else before slapping cloudflare in front of your setup?
At the time, I was running a game service, so I had a fair number of script kiddies attempting whatever. Expecting them, before using Cloudflare I got a hardware firewall. With a bit of research I located the same hardware firewall used by Federal Reserve Banks. At that time, an outright purchase of that thing was $24K, but it was well worth it. It had every bell and whistle one could ask for: deep packet inspection, load balancing, and quite a bit more. When I closed, that hardware firewall sold for the same price I paid years earlier.
Discoverability? Uh... I think you may be dragging commercial concepts into this that don't have a place here. For a personal website people discover it when you link them to it. Or discover it 4 years later after google has indexed it and it happens organically. But you don't ever have to care about "discoverability". You're not trying to get a bunch of hits, or profit, or whatever.
A personal website is something you do for fun. If this personal website is serving as your resume to get hired or sell stuff, it's not a personal website. It's more commercial/profit crap. At that point, yeah, use all the cargo cult you want. Business want to see the tools and technologies they use demonstrated.
I've been hosting superkuh.com from my home PC for 20 years. When my IP changes (maybe once per year) I just log into my registrar when I notice it (it's okay if I don't notice for a day or two, no big deal) and update it.
Self-hosting means hosting things yourself. If you don't know what you host that's a problem. Otherwise, discovery is largely irrelevant and residential IPs have no bearing on it.
For actual reachability, a lot of people use VPNs, network meshes (Tailscale, ZeroTier, Nebula, etc) or secure proxies (Cloudflare Argo Tunnels).
You can also self-host software on VPSs or cloud servers.
I have a small script run by cron (well, systemd timers) every 15 minutes which checks the current IP via https://api.ipify.org/ and updates the DNS entries on DigitalOcean (its DNS service is free) if it differs from what's there.
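Roughly, such a script looks like this; the domain, record ID and token are placeholders, and the PUT call is DigitalOcean's standard domain-records API:

    #!/bin/sh
    # compare the public IP against what DigitalOcean's nameservers currently answer
    current=$(curl -s https://api.ipify.org)
    recorded=$(dig +short home.example.com @ns1.digitalocean.com)
    if [ "$current" != "$recorded" ]; then
        curl -s -X PUT \
          -H "Authorization: Bearer $DO_API_TOKEN" \
          -H "Content-Type: application/json" \
          -d "{\"data\": \"$current\"}" \
          "https://api.digitalocean.com/v2/domains/example.com/records/$RECORD_ID"
    fi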
If someone has to deal with dynamic IP addresses and wants to use dynamic DNS as well, there's also the pretty nice ddclient package: https://ddclient.net/
I've been using it with NameCheap pretty successfully for a while now, here's an example of some of the integrations it supports: https://ddclient.net/protocols.html (sadly no DigitalOcean integration as of yet, but this might be of use to others)
It’s not particularly pretty, but here is my home-grown script for doing exactly that. It coordinates between Cloudflare DNS, my Unifi Security Gateway, and haproxy running on my home servers to keep my IPv4 and IPv6 addresses up to date everywhere.
I used to use Cloudflare's API to do this, but then we changed ISPs and now are behind CGNAT, so we don't even have a public IP. To publicly expose my web server, I created a Cloudflared tunnel, and have our DNS (which is also hosted at Cloudflare) point to the tunnel. It works quite well, and adds some security because we're no longer visible to attackers using IP address scans.
If you have a server with a public address, a machine with a private address can simply use e.g. curl to access some (even non-existent) URL on the server periodically, which can then be extracted from the logs by a script on the server.
I actually use this to announce my internal address, even without a web server, just by accessing a locked port on the server, where the firewall logs this access.
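A sketch of the web-server variant of that trick, assuming nginx's default access log format (hostnames and paths are made up):

    # on the machine behind NAT, from cron or a systemd timer:
    curl -s https://myserver.example.com/announce-homebox >/dev/null

    # on the server, the source IP is the first field of the matching log line:
    grep announce-homebox /var/log/nginx/access.log | tail -n 1 | awk '{print $1}'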
Or you can rent a VPS for $5 per month and WireGuard your internal services to it. The VPS goes in the DNS records, and you throw Traefik or Caddy on it to handle things.
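On the VPS side that can be as small as a two-line Caddyfile; the domain and the WireGuard-internal address of the home box are placeholders, and Caddy handles the TLS certificate for you:

    # /etc/caddy/Caddyfile
    app.example.com {
        reverse_proxy 10.0.0.2:8080
    }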
Some HN user months ago suggested the entry level VPS offer from ionos.com which is $2/month at the moment. I bookmarked the link because I will probably need it when I relocate, but never used them so far. Any comments from customers are welcome.
I dislike how mean and critical HN can get sometimes. The article is clearly titled "A guide", not "The guide"! The OP link builds on top of using containerization / docker. Some find it useful, some would benefit from it. Doesn't mean that embrace is "silly", "nonsense", "cult", and whatnot. If you don't prefer something, make that known, that's fine. If there's something to add or alternatives to suggest, do so. One doesn't have to shitpile on a fairly well written, referenced and organized article.
Recently I've been looking through them, including the self-hosted one, and I suspect about 10-20% are in fact ads, basically paid product placement. Usually they can't even be self hosted; it's just a paid service for an instance. That's not self hosting.
A decent chunk are not practically self-hostable. I mean not for a mere mortal. Like those guys who seem to run Synapse on an RPi 3 - everyone else needs a dual core with 6 GB of RAM. Again, I wonder if those people actually even used the self-hosted version?
Am I still ranting? Guess I had more to say than I realised.
Silly. I've been self hosting since I degoogled. Never have I regretted not using docker. To imply that you should do this is the kind of hilariously out-of-touch advice I expect from developers :).
This guide lacks the most important tools: fail2ban and tripwire. I block on the order of 5,000 malicious requests a day (mostly recon bots) from Russia and China.
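For anyone starting there, a minimal jail.local sketch; the ban/find times are arbitrary, and the nginx-http-auth jail ships with fail2ban but is disabled by default:

    # /etc/fail2ban/jail.local
    [DEFAULT]
    bantime  = 1h
    findtime = 10m
    maxretry = 5

    [sshd]
    enabled = true

    [nginx-http-auth]
    enabled = true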
To be fair - who gives a shit about 5k requests to mostly non-existent services? Even my dinky RPi can handle thousands of times more load, and the nginx proxy for the cluster is just going to hand back an empty 404.
As long as you aren't doing silly shit (like exposing password based ssh, or running your containers as root, or failing to make backups) then they can knock all they'd like.
They'd have to first find the app you're running, then attack the application to get an exploit, then manage to find a privilege escalation from the container, all for what, exactly?
Send more requests from my network? Break a couple of my services that I don't derive revenue from? I'm just literally not worth the trouble in almost all cases.
It's like 10 minutes to recreate my whole cluster from clean images (20 if we include wiping the OS on the machines).
If you're really concerned - shove the cluster behind tailscale/cloudflare/vps.
I really wasn't bragging about load. Of course 5k requests is nothing. It was mostly to emphasize that those 5k requests would bring at least 5k pieces of malware if I didn't spend the time to stop them.
The guide was missing this, which I think cheapens it a little for me, because security can be a large portion of the cost of a SaaS service you might pay for.
> It was mostly to emphasize that those 5k requests would bring at least 5k pieces of malware if I didn't spend the time to stop them.
I feel like this is confusion on your part - do you have a service/port that they are actually making real requests against where there is risk? Ex: Password based ssh access, or something like phpmyadmin running and exposed?
Basically - if they're just hitting ssh on port 22... as long as your auth is cert based (or better yet, just not exposed publicly at all) who cares?
If they're requesting random paths for wordpress admin sites or something like phpmyadmin... again - who cares? You really don't have to do anything unless you're running those services.
I agree you should keep an eye on the logs - but mostly this isn't as big an issue as people tend to make it out to be. Proper auth on your services (ex: keycloak behind mfa) means the risk is just really, really low - and you really aren't worth the serious effort it takes.
Basically - A malicious request is NOT equivalent to malware. They can make lots of malicious requests - in practice, all of them just fail.