
What is your strategy for dealing with server failure?



My app allows for "share nothing" architecture, basically using multiple DNS A records as load balancing. Currently it has 6 servers.

Even if 5 of them went down at the same time, the site would still work as intended (though it probably couldn't handle peak load with fewer than 3 or 4). If one or two are down, nothing happens.

Also completely reinstalling a server takes around an hour.
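
To illustrate what a client has to do for multiple A records to act as failover, here is a minimal sketch. It assumes the client walks the returned addresses in order and uses the first one that accepts a TCP connection. The function names and the 2-second timeout are made up for the example, and real browsers and resolvers each have their own retry behaviour.

```python
import socket
from typing import Callable, Iterable, Optional


def resolve_a_records(host: str, port: int) -> list[str]:
    """Return the distinct IPv4 addresses published for host, in resolver order."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    addresses: list[str] = []
    for info in infos:
        ip = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def tcp_probe(ip: str, port: int, timeout: float) -> bool:
    """True if a TCP connection to ip:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(
    addresses: Iterable[str],
    port: int,
    probe: Callable[[str, int, float], bool] = tcp_probe,
    timeout: float = 2.0,  # hypothetical value, tune for your clients
) -> Optional[str]:
    """Try each address in turn and return the first one that answers."""
    for ip in addresses:
        if probe(ip, port, timeout):
            return ip
    return None  # every server behind the name is down
```

Usage would be something like `first_reachable(resolve_a_records("example.com", 443), 443)`. The site stays reachable as long as at least one address answers, but each dead server ahead of it in the list costs the client up to one timeout.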


Yes, but the comment I was replying to mentioned a 48 euro budget, which is the price of a single server.


I do not have an HA setup, but all of my Proxmox VMs are backed up as snapshots by Proxmox Backup Server every night to my home NAS and to my office NAS. You can also use one of many storage providers; SSHFS works too. This is the cheapest and lowest-administration solution I have used to date. For production usage I would recommend 3-4 similarly specced 28€ machines running a replicated Proxmox cluster or a Ceph-backed Proxmox cluster.


As long as you treat servers as cattle, you can use Hetzner's own load balancer service and then you don't have a SPOF that you manage yourself. Their LBs are advertised as redundant / fault tolerant.


I don't know if it is possible to treat "Serverbörse" servers as cattle. They are all different. I know that k8s and Docker Swarm could in theory balance load between different machines, but I have never tried it in practice. What I have seen in practice are some weird glitches caused by differing CPUs/motherboards/memory.

Also, the comment I was replying to mentioned a 48 euro budget, which is the price of a single server.


i think the whole point of vm/container is to enable "servers as cattle"


Yes, but "Serverbörse" machines are very non-uniform, which may be bad for load balancing. See for yourself: https://www.hetzner.com/sb


Still, you can get a nice server with an i7 CPU, 32 GB of memory and 20 TB of traffic for ~28 euro, VAT included. This is super competitive even with DO or others.


yes, it may be bad for indiscriminate/dumb load balancing. then again, this only works in the first place if the work units represent a somewhat equivalent and small-ish workload.

once you place value on determinism (with regard to time spent on a task) you want a tightly specced distribution mechanism and/or a feedback loop to communicate busy state back to the LB.
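
a rough sketch of what such a feedback loop could look like, assuming each backend reports how many requests it has in flight and the LB knows a relative capacity per machine. the class names and capacity numbers are invented for illustration and are not any particular LB's API.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    capacity: float      # relative capacity (hypothetical units, e.g. benchmark score)
    in_flight: int = 0   # busy state fed back to the balancer

    @property
    def load(self) -> float:
        """Fraction of this machine's capacity currently in use."""
        return self.in_flight / self.capacity


class LeastLoadedBalancer:
    """Send each request to the backend with the lowest relative load."""

    def __init__(self, backends: list[Backend]) -> None:
        self.backends = list(backends)

    def acquire(self) -> Backend:
        backend = min(self.backends, key=lambda b: b.load)
        backend.in_flight += 1
        return backend

    def release(self, backend: Backend) -> None:
        # Called when the request finishes; this is the feedback part.
        backend.in_flight -= 1


if __name__ == "__main__":
    lb = LeastLoadedBalancer([Backend("big", 4.0), Backend("small", 1.0)])
    picked = [lb.acquire().name for _ in range(5)]
    print(picked)
```

with non-uniform machines the capacity weights do most of the work here. plain round robin would hand the small box as many requests as the big one.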


I have 3 servers, 2/3 don't have hardware timestamping on the interface, 1 does. Makes a huge difference when it comes to NTP.


Which Hetzner servers have hardware timestamping on the interface?

I'd be really interested to know, since to the best of my knowledge, they don't have PTP solutions in their datacenter.


I have two of the AX41-NVMe, and only one was lucky enough to get an "Intel Corporation I210 Gigabit Network Connection (rev 03)", which has hardware timestamping.

In the Finland datacenter, it appears there is a PTP source running, though its offset is ~2.33 seconds off from NTP. Chrony says it's a falseticker, but I haven't really figured out how to configure it correctly, nor have I asked Hetzner for help. I've mostly just played around with it.

I also accidentally discovered that a local ISP's stratum 1 server is extremely close to me, as in a few microseconds away (I accidentally configured Hetzner's servers as a pool instead of a server and my NTP 'discovered' the stratum 1) ... I'm not using it, but I've thought about reaching out to them to ask if I can use it if I'm very nice.


Would you care to share the IPs? (both for the Hetzner PTP one and the local ISP stratum 1)

Is the 2.33s stable? If so, after deducting the stable offset, it could still be valuable.
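
To make "stable" concrete, here is a minimal sketch of the check I have in mind. It assumes you log the PTP-minus-NTP difference a few times and treat it as a constant correction only if the spread stays within a tolerance. The function names, sample values and tolerance are hypothetical. As far as I know, chrony's source directives also accept an `offset` option for a constant correction like this, but check the sign convention in the docs.

```python
from statistics import median
from typing import Optional, Sequence


def stable_offset(samples: Sequence[float], tolerance: float) -> Optional[float]:
    """Return the median offset in seconds if all samples agree within tolerance.

    Returns None when there are no samples or the spread is too large to
    treat the offset as a constant.
    """
    if not samples:
        return None
    if max(samples) - min(samples) > tolerance:
        return None
    return median(samples)


def corrected(reading: float, offset: float) -> float:
    """Subtract a known constant offset from a clock reading (seconds)."""
    return reading - offset


if __name__ == "__main__":
    # Hypothetical measurements of (PTP time - NTP time), in seconds.
    measured = [2.331, 2.330, 2.329]
    offset = stable_offset(measured, tolerance=0.005)
    if offset is not None:
        print(corrected(1000.0 + offset, offset))
```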


`/dev/ptp0` and `/dev/pps0` magically showed up and yeah, it appears to be stable. Chrony actually has it marked as a non-false ticker atm and is marked as a backup. Here's the stats:

  Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
  ==============================================================================
  PPS                        64  40   67m     -6.827     33.059    -72ms    99ms
  PTP                         6   3    20     -9.585      0.005   -119ms    10ns
  192.168.100.1               6   3   88h     -0.159      0.000    -12ms  3815ns
  192.168.100.2               6   3   29h     -0.176      0.299    -11ms  2817us
  ntp1.hetzner.de            14  10  154m     -0.003      0.006    -47us    12us
  ntp2.hetzner.de            11   6  103m     -0.001      0.005    -21us  5711ns
  ntp3.hetzner.de             9   6  137m     -0.006      0.007    -72us  8949ns

I don't feel great about sharing the discovered stratum 1 servers on the open internet, but my email address is in my profile.



