We've only been using SCHED_FIFO on Linux, which is basically busy-waiting if you only have one thread scheduled, though I am interested in trying SCHED_DEADLINE.
I hope someone adds a kcompactd CPU mask, or a per-process state that disables compaction.
There was also this recent patch https://lwn.net/Articles/816211/ to deal with kthread affinities. Even with isolcpus I find I still need to run pgrep -P 2 | xargs -i taskset -p -c 0 {} and deal with the workqueues.
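A rough Python equivalent of that pgrep/taskset one-liner, as a sketch only. It assumes you are in the host PID namespace (so PID 2 is kthreadd) and running as root; the function names are made up for illustration. Per-CPU kthreads will refuse the affinity change, so the refusals are collected rather than treated as fatal.

```python
import os

def parse_stat(line):
    """Return (pid, comm, ppid) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the LAST ')' instead of splitting on whitespace.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    fields = line[rparen + 1:].split()
    ppid = int(fields[1])  # fields[0] is the state, fields[1] is the ppid
    return pid, comm, ppid

def kthread_children(proc="/proc"):
    """List (pid, comm) for every task whose parent is PID 2 (assumed kthreadd)."""
    found = []
    for name in os.listdir(proc):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc, name, "stat"), errors="replace") as f:
                pid, comm, ppid = parse_stat(f.read())
        except OSError:
            continue  # task exited while we were scanning
        if ppid == 2:
            found.append((pid, comm))
    return sorted(found)

def pin_kthreads(cpus=frozenset({0})):
    """Try to move every kthread onto `cpus`; return (moved, refused) lists."""
    moved, refused = [], []
    for pid, comm in kthread_children():
        try:
            os.sched_setaffinity(pid, cpus)
            moved.append((pid, comm))
        except OSError:
            # Per-CPU kthreads reject the change; also hit without root
            # or if the thread has exited in the meantime.
            refused.append((pid, comm))
    return moved, refused
```

Calling pin_kthreads() as root and printing the refused list shows which kthreads are bound per-CPU and cannot be moved this way. Workqueues are not covered by this at all.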
Running only a single thread per core, I see no difference between SCHED_FIFO and SCHED_OTHER, except that SCHED_FIFO can cause lockups if the thread runs at 100% CPU, since the cores are not completely isolated (e.g. the vmstat timer and some other kernel work still need to run there).
I like the Tosatti/WindRiver/Lameter patch. (Except the naming: _possible has the same meaning as _available, but they mean different things here depending on whether it's kthreads or user threads?) Just needs a proc interface.
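For anyone who wants to try the comparison, a minimal sketch of the setup in Python: pin the calling thread to one core, then attempt SCHED_FIFO. The function name, the default priority of 1 and the choice of the highest-numbered allowed CPU are my own assumptions, not anything from the patches discussed here.

```python
import os

def pin_and_try_fifo(cpu=None, priority=1):
    """Pin the calling thread to one CPU, then try to switch it to SCHED_FIFO.

    Returns (cpu, policy), where policy is whatever the kernel reports
    afterwards, so the caller can see whether the switch took effect.
    """
    allowed = sorted(os.sched_getaffinity(0))
    if cpu is None:
        cpu = allowed[-1]  # arbitrary choice: highest-numbered allowed CPU
    os.sched_setaffinity(0, {cpu})
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        # Typically EPERM without CAP_SYS_NICE or a suitable RLIMIT_RTPRIO;
        # in that case the thread stays on its previous policy.
        pass
    return cpu, os.sched_getscheduler(0)
```

If the second element of the result equals os.SCHED_FIFO, the thread is now real-time and a 100% spin loop on that core can starve whatever kernel work is still bound there, which is the lockup case mentioned above.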
With a patch like this you can force bottom half (interrupts), top half (kthreads) and user threads to all be on different cores.
The 'full task-isolation mode' seems weird, because why should you drop out of isolation because of something outside your control like paging or TLB? Anyway, mlockall that. It's fine to be told, I guess (except signals take time), but why drop out of isolation and risk glitches in re-isolating? It doesn't seem very polished.
Something else occurred to me: you still have to be careful about data flow and having enough allocatable memory. E.g., a lot of memory local to core 0 will be consumed by buffer cache, and it can be beneficial to drop it (free; numastat -ms; sync; echo 1 > /proc/sys/vm/drop_caches; free; numastat -ms).
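On the mlockall point, a minimal sketch of calling it from Python through ctypes. The MCL_* values below are the generic Linux ones (x86, ARM); some architectures define them differently, so treat them as an assumption and check sys/mman.h. The function name is made up for illustration.

```python
import ctypes
import os

# Generic Linux values from <sys/mman.h>; NOT valid on every architecture.
MCL_CURRENT = 1  # lock pages that are currently mapped
MCL_FUTURE = 2   # also lock pages mapped from now on

def lock_all_memory():
    """Call mlockall(MCL_CURRENT | MCL_FUTURE); return (ok, error_message)."""
    # CDLL(None) resolves symbols already loaded into the process (libc).
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        return False, os.strerror(ctypes.get_errno())
    return True, ""
```

Without root or a raised RLIMIT_MEMLOCK this normally fails with ENOMEM or EPERM, which is why the sketch returns the error text instead of raising. It removes page-ins from swap or file-backed pages as a source of faults, but does nothing about TLB shootdowns.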
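The same before/after check can be scripted per NUMA node. This is only a sketch: it assumes the per-node meminfo file under /sys/devices/system/node/ uses lines of the form "Node 0 MemFree: 1234 kB", needs root for the drop_caches write, and the function names are hypothetical.

```python
import os

def node_meminfo(text):
    """Parse the text of /sys/devices/system/node/nodeN/meminfo into {field: value}."""
    info = {}
    for line in text.splitlines():
        parts = line.split()
        # Assumed line shape: "Node 0 MemFree:  123456 kB"
        if len(parts) >= 4 and parts[0] == "Node":
            info[parts[2].rstrip(":")] = int(parts[3])
    return info

def drop_page_cache():
    """Flush dirty pages, then ask the kernel to drop clean page cache (root only)."""
    os.sync()
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("1\n")

def free_before_after(node=0):
    """Return {field: (before, after)} for MemFree and FilePages on one node."""
    path = f"/sys/devices/system/node/node{node}/meminfo"
    with open(path) as f:
        before = node_meminfo(f.read())
    drop_page_cache()
    with open(path) as f:
        after = node_meminfo(f.read())
    return {k: (before[k], after[k])
            for k in ("MemFree", "FilePages")
            if k in before and k in after}
```

Running free_before_after(0) as root gives the node-0 numbers that the free/numastat pair shows by hand.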
I find it bizarre that all this THP compaction stuff is for workloads that are commonly run under a virtual machine, i.e. another layer of indirection.
As I understand it, 'full task-isolation mode' will prevent compaction, completely disable the vmstat timer, etc. So it provides additional isolation. Since you have already switched into kernel mode, it might as well deliver a signal to let you know it happened. If the signal is masked there should be no overhead at all except a branch to check the signal mask.
You can patch the kernel thread creation to not use specific CPUs... IDK of anyone publicly maintaining a patchset, but look at https://github.com/torvalds/linux/blob/master/kernel/kthread... line 386.