
Given how taxing the "thundering herd" effect can be on mirrors and websites (RSS readers!), you'd think this sort of thing should've been in cron since at least the mid-90s.

Once again, OpenBSD with the simple, obvious solution, that everyone else kinda overlooked. I hope every other cron out there copies and ships this as soon as possible.




>OpenBSD with the simple, obvious solution, that everyone else kinda overlooked

Systemd Timers have had this for a while.


They said simple and obvious, which much of systemd is not.

Personally I find the OpenBSD solution far more elegant and UNIXy than the systemd one, but to each their own.


Why does the complexity of the rest of systemd matter when comparing the implementation of this feature?

What would you say is wrong with RandomizedDelaySec?


The UI is too complicated: you type some concatenated-word keyword instead of something visually intuitive.


On the other hand I wouldn't be able to understand what the hell is this cron string. I actually have no idea about cron format despite the fact that I used it multiple times. I have to read man every time I use it. Also different software implements it differently.

  [Timer]
  OnCalendar=daily
  RandomizedDelaySec=12h
might take a few more seconds to type, but it's definitely readable without any additional documentation.


The option says Sec but the value says h? What does that mean?


> The arguments to the directives are time spans configured in seconds. Example: "OnBootSec=50" means 50s after boot-up. The argument may also include time units. Example: "OnBootSec=5h 30min" means 5 hours and 30 minutes after boot-up.

https://www.freedesktop.org/software/systemd/man/systemd.tim...

I.e. seconds if no units are specified.
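
To make the quoted rule concrete, here is a minimal Python sketch of the "seconds unless a unit is given" convention. It is not systemd's parser: the function name `parse_span` and the small unit table are assumptions for illustration, and the real thing accepts more units than are listed here.

```python
import re

# Hypothetical unit table for illustration; not the full set systemd accepts.
UNIT_SECONDS = {"s": 1, "sec": 1, "min": 60, "h": 3600, "d": 86400}

def parse_span(text: str) -> int:
    """Convert a span such as '5h 30min' or '50' into whole seconds."""
    total = 0
    for number, unit in re.findall(r"(\d+)\s*([a-z]*)", text.strip()):
        if unit and unit not in UNIT_SECONDS:
            raise ValueError(f"unknown unit: {unit!r}")
        # A bare number falls back to seconds, matching the "Sec" suffix.
        total += int(number) * UNIT_SECONDS.get(unit, 1)
    return total
```

Under these assumptions `parse_span("12h")` and `parse_span("43200")` give the same value, which is the whole point of the suffix: it names the default unit, not the only one.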


Sec is a standard suffix for time values, anything ending with Sec accepts a value in seconds. 12h is shorthand for 'the number of seconds in 12 hours'


It's standard to say 12h and expect this to read as 43200?

Sounds like a lousy standard if the correct way to use it is to say "I want to delay by 12 hour seconds". What is even "12 hour seconds"?


Makes sense once you know it.

"Sec" suffix indicates time and implements a default of seconds, where a suffix to the value indicates a change in unit.

It would be like asking "Memory Allocation(MB)" but accepting "12G" for 12 GB.


In our code at work we have constants like HOUR=3600 and RESTART_TIME_SECS = 6 * HOUR. It makes sense to me. If it doesn't for you, feel free to use something else I guess.


Standard where? I have never seen this before and I’ve been reading *nix configuration files for 20+ years


Standard within systemd (i.e. consistent).


What's more simple than an ini file?


Reinventing the same features many times over in many different places, of course.


    RandomizedDelaySec=


But, see, a number of people hate systemd, so that Doesn't Count.

Even if their cron does...


Even when discussing OpenBSD somehow we end up debating systemd...


Yeah, but with systemd timers it's just another ad hoc hack.


I'm not hot for systemd but this https://www.freedesktop.org/software/systemd/man/systemd.tim... looks robust, far from ad-hoc hack.


How does it look robust? Because someone wrote a man page and there's lots of boilerplate in it?

Proper design reduces complexity. The above adds a lot of complexity, and thus it's a hack. Not the appearance.


That's not a hack though, a hack is gluing things together to fix a certain specific bug that is not easily solved because the bug is related to the design instead of a mere mistake.

Otherwise with your standard of hack anything beyond hello world and baby's first input are hacks because everything else requires boilerplate.


It's not a hack because of the boilerplate - it's a hack because it's functionality implemented in the wrong place. I think the boilerplate made you believe it's not a hack by making it look professional(ish), i.e. someone spent time putting a lot of lipstick on that pig, man page and all.


How is it functionality implemented in the wrong place?


FreeBSD cron(8) has -j for the daemon to add a random sleep of up to 60 seconds on each task. This was added in FreeBSD 5.3, committed 19 years ago. (FreeBSD 5.3 released November 6, 2004)

https://github.com/freebsd/freebsd-src/commit/f5896baf9c429c...


A random sleep of up to 60 seconds doesn't really solve the problem the OpenBSD changes do, especially when your jobs take longer than 60s.

  For example, instead of "0-59/10" in the minutes field, "0~59/10" can be used to run a command every 10 minutes where the first command starts at a random offset in the range [0,9]. The high and low numbers are optional, "~/10" can be used instead.
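
A rough Python sketch of the behaviour the quoted text describes, going only by that description and not by OpenBSD's cron source. The function name `expand_random_step` and its parameters are made up for illustration.

```python
import random

def expand_random_step(low: int, high: int, step: int, rng: random.Random) -> list[int]:
    """Model of a field like low~high/step, as described in the quote above."""
    # Pick the first firing time at a random offset within one step of `low`.
    offset = rng.randint(low, min(low + step - 1, high))
    # After that, fire every `step` units until the top of the range.
    return list(range(offset, high + 1, step))

# Model of "0~59/10": six minutes per hour, 10 apart, starting somewhere in [0, 9].
minutes = expand_random_step(0, 59, 10, random.Random())
```

Each host draws its own offset, so a fleet using the same crontab line ends up spread across the 10-minute window instead of all firing at :00, :10, :20 and so on.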


A random sleep of up to 10 seconds turned a 150gbit/sec spike into a 5gbit/sec spike on the Akamai bill from a newspaper app I once worked on...


That's surprising. You'd think spreading the workload start over 10 seconds would lower the size of spikes (integrated over a second) by at most a factor of 10.

But the above point is still true: many jobs take a few minutes to run. 60s of dispersion in start time is better than nothing, but you really want more.

(In this case, things are still quantized to a minute boundary, so you'd really want both).
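
The "at most a factor of 10" intuition can be checked with a toy model, assuming every client picks a uniformly random whole-second delay and only job starts are counted (not bandwidth or caching effects). The name `peak_per_second` is hypothetical.

```python
import random
from collections import Counter

def peak_per_second(clients: int, jitter_seconds: int, rng: random.Random) -> int:
    """Return the busiest one-second bucket when each client delays its start."""
    if jitter_seconds <= 0:
        return clients  # no jitter: everyone fires in the same second
    # Each client independently picks a delay in [0, jitter_seconds).
    starts = Counter(rng.randrange(jitter_seconds) for _ in range(clients))
    return max(starts.values())

rng = random.Random(0)
no_jitter = peak_per_second(10_000, 0, rng)
ten_seconds = peak_per_second(10_000, 10, rng)
```

With 10 buckets the busiest one can never hold fewer than a tenth of the clients, so in this model the peak drops by no more than 10x. A bigger drop than that has to come from something the model leaves out.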


> That's surprising. You'd think spreading the workload start over 10 seconds would lower the size of spikes (integrated over a second) by at most a factor of 10.

If the delay is at the reading side, away from Akamai, through a cache, perhaps 10 concurrent requests for X would result in ten lots of data transfer as it isn't in cache yet, but 10 with a short delay is enough to prime the local cache on the first request before the rest start.

There are a number of reasons a sudden glut of activity could balloon bandwidth or CPU/memory costs more than you might expect.

Without a chunk more detail about the system in question, this is just random speculation of course.
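
To put numbers on that speculation, here is a toy Python model. It assumes a cache with no request coalescing, where every request arriving before the first fetch finishes also goes upstream. The function `origin_fetches` and the half-second fill time are invented for illustration and say nothing about how Akamai actually behaves.

```python
def origin_fetches(arrival_times: list[float], fill_seconds: float) -> int:
    """Count requests that go upstream because the cache is not warm yet."""
    if not arrival_times:
        return 0
    first = min(arrival_times)
    # Assumption: anything arriving before the first fetch completes misses too.
    misses = sum(1 for t in arrival_times if t - first < fill_seconds)
    return max(misses, 1)  # the very first request always has to go upstream

simultaneous = origin_fetches([0.0] * 10, fill_seconds=0.5)
staggered = origin_fetches([float(i) for i in range(10)], fill_seconds=0.5)
```

In this model ten simultaneous requests cost ten upstream transfers, while the same ten spaced a second apart cost one. That is a 10x difference in transferred data from jitter alone, on top of whatever smoothing the jitter itself provides.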


Good points.

Thinking about this-- this is Akamai, who has historically charged for midgress. Liveness of cache could be very important.


I'm not disputing that it prevents a subset of the same class of problem. It's just a wholly incomplete solution compared to the OpenBSD implementation, to the degree that it's disingenuous to say netbsd already implemented it.


FreeBSD, not NetBSD.

They're on different timescales. The OpenBSD start times are still quantized to the minute, I believe.

Both solutions would complement each other.


ah you're right, my bsd


That threw me for a loop when I realized the last time I used FreeBSD was in the 4.x days - on a desktop, no less. That was actually something of a glory period, at least for the hardware I had at the time... Soundblaster OSS drivers that actually did hardware mixing, the proprietary Nvidia driver that actually gave working 3D acceleration on the card I had at the time (Geforce 2 GTS maybe?) - this was at least a year or two before that driver was released for Linux. I think it even had working Java.

It was such a breath of fresh air compared to Linux at the time because it was a coherent, engineered, documented system. When you didn't always have reliable internet (and at least for me, even when I did have it, it was something like 128K DSL), it was a huge deal to have well written man pages, whereas on Linux half the time the man page would just tell you to scream into the void, err, run gnu info.

This was still in the period when the GPL scared off corps.


I first ran into randomization delays in cronie (the ~ is how it's implemented also and there's a RANDOM_DELAY variable for use too), after Red Hat had switched to it at some point years ago. Personally, never really used it, but it's nice it's there.


Intriguing! I had no idea! Again, I wish the concept was more popular back when cron and the Internet were younger, and having it built-in and readily available goes a very long way. If someone doesn't introduce you to the concept, you have little chance of knowing better until you find yourself at the receiving end of a spike, yelling at the clouds.


I believe the philosophy was to leave it up to the service to behave sensibly, including things like having a circuit breaker, using some kind of backoff/retry, and generally being robust in the face of resource contention.

It kind of feels like this is putting the policy of "don't all go at once" into the cron mechanism, which is just starting jobs at desired times.
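
For reference, the client-side "behave sensibly" approach usually looks something like the sketch below: exponential backoff with random jitter. This is a generic illustration, not taken from any particular service. The function name, defaults, and the injectable `sleep` and `rng` parameters are all assumptions made so the example is easy to test.

```python
import random
import time

def retry_with_backoff(operation, attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep, rng=random.random):
    """Call `operation` until it succeeds, sleeping a jittered, growing delay."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # The ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Sleep a random fraction of the ceiling so clients do not retry in lockstep.
            sleep(rng() * ceiling)
```

This handles retries after a failure, but it does nothing about the first request, which is the part a randomized start time in cron addresses.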


The lazy solution is to put something like:

sleep $((RANDOM % 60))

...or longer, at the top of the script your cron is running.


It's a neat idea to be sure but I fail to see how this will have any material impact.

Firstly, OpenBSD is a niche OS, meaning the absolute magnitude of OpenBSD cron jobs out "in the wild" is relatively low.

Second, my understanding is that this is a client-side feature. I.e. if I run a service, this feature only benefits me if a significant portion of my users opt into it.

Third, I have an unsubstantiated suspicion that cron usage relative to systemd usage is also on the decline.



