Because the framing is wrong and clickbait... As anyone with many cores can tell you: those cores do get used.
The issue is more subtle: "[the minimum granularity] is supposed to allow tasks to run for a minimum amount of [3ms] when the system is overloaded".
That value is supposed to scale with the number of cores, but the scaling is capped at 8 cores.
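If it helps, here's roughly what that scaling looks like (a minimal sketch: the 0.75ms base value, the 1 + log2 formula, and the cap at 8 are my paraphrase of the logic in kernel/sched/fair.c, not verbatim kernel code):

    #include <stdio.h>

    /* Returns floor(log2(x)) for x >= 1, like the kernel's ilog2(). */
    static unsigned int ilog2_u(unsigned int x)
    {
        unsigned int r = 0;
        while (x >>= 1)
            r++;
        return r;
    }

    int main(void)
    {
        const double base_slice_ms = 0.75;             /* assumed default base value */
        for (unsigned int cpus = 1; cpus <= 512; cpus *= 2) {
            unsigned int capped = cpus < 8 ? cpus : 8; /* scaling stops at 8 cores */
            unsigned int factor = 1 + ilog2_u(capped); /* logarithmic scaling */
            printf("%3u cpus -> factor %u -> min slice %.2f ms\n",
                   cpus, factor, factor * base_slice_ms);
        }
        return 0;
    }

At 8 cores the factor is 1 + log2(8) = 4, i.e. the 3ms quoted above, and it stays there no matter how many more cores you add.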
However, imho that's not even necessarily a bad thing. It's a trade-off between responsiveness and throughput in overload situations. You don't want slices to become too small or too large...
If I skimmed this correctly, it's a penalty on performance, not a complete cliff. I guess people just thought, "ho-hum, there's got to be some overhead in scheduling".
Yes, when you run in parallel and e.g. see all >8 cores nicely pegged at 100%, why assume something is wrong?
Even after rereading the article, I still don't get what the penalty actually is; it must be small, then? Because people have definitely seen near-linear scaling with parallelizable problems on >8 cores, otherwise someone would have noticed?
What happens is that min_slice stops scaling up above 8 cores. The article misrepresents that as a "limit to 8 cores", but my admittedly shallow understanding is that min_slice is a preemption protection: if the kernel has tasks to schedule and no free core, it will try to preempt an existing task, and a process still within its min_slice is protected from that preemption.
So this is only relevant on an overloaded system, and it just means that processes may be preempted after 3ms (instead of that protection window continuing to grow with the core count), ignoring other tunables like priority.
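In toy form, the protection amounts to something like this (my simplified model of the tick-time check, not actual kernel code; the struct and all the names are made up):

    #include <stdio.h>

    struct task {
        const char *name;
        unsigned long long runtime_ns; /* time on CPU since it was last picked */
    };

    /* A running task only becomes eligible for preemption once it has
     * consumed its minimum slice; until then, even an overloaded system
     * lets it keep running. */
    static int may_preempt(const struct task *curr, unsigned long long min_slice_ns)
    {
        return curr->runtime_ns >= min_slice_ns;
    }

    int main(void)
    {
        const unsigned long long min_slice_ns = 3000000ULL; /* the 3ms cap */
        struct task young = { "young", 1000000ULL }; /* ran 1ms: still protected */
        struct task old   = { "old",   5000000ULL }; /* ran 5ms: fair game */

        printf("%s preemptible: %d\n", young.name, may_preempt(&young, min_slice_ns));
        printf("%s preemptible: %d\n", old.name,   may_preempt(&old, min_slice_ns));
        return 0;
    }

Everything else (priority, vruntime, and so on) only comes into play after that window has passed.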
Not only that, but it's a log2, so if this were relaxed, on a 128-core system you'd have a preemption delay of 7ms instead. I don't think that would save you if you're overloading a 128-core system, honestly, although it does raise the question of why the kernel devs felt the need to cap the scaling. Even assuming it scales per thread, and you have a dual-socket Epyc, log2(512) = 9, so still only 9ms.
OK, thanks a lot. Agreed; one might even argue about whether it's better to scale this number with the number of cores at all, in what relationship, or whether the initial values chosen are the optimal ones; certainly they aren't for every situation. With the right comment one might even have claimed it was intentional, lol.
Clickbaity headline from a technical person makes me sad :(