Keep in mind it doesn't magically wake up descheduled tasks. So it's necessary to go through the kernel if the destination task is not currently running on a CPU. The latency in that case will be similar to what we have today.
And in cases where you can guarantee that the destination task is running, you can already use shared memory for low-latency communication today (polling or mwait).
I'm not saying userspace interrupts are useless, but they are not as convincing as they seem at first glance. I think more proofs of concept (enabling real applications) and benchmarking are needed to demonstrate the advantages.
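For anyone who hasn't seen the shared-memory polling approach mentioned above, here is a rough sketch of what it can look like. It is only an illustration under assumptions: `struct mailbox`, `receiver` and `poll_roundtrip` are made-up names, and it uses two threads in one process with C11 atomics and pthreads rather than two processes with a mapped region.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical mailbox shared between sender and receiver. */
struct mailbox {
    atomic_int ready;  /* 0 = empty, 1 = payload is valid */
    int payload;       /* written by the sender before setting ready */
    int received;      /* written by the receiver once it sees ready */
};

/* Receiver: spins on the flag in userspace, no system call on the wait path. */
static void *receiver(void *arg)
{
    struct mailbox *mb = arg;

    /* Acquire pairs with the sender's release, so payload is visible here. */
    while (!atomic_load_explicit(&mb->ready, memory_order_acquire))
        ;  /* busy-wait; this only works while the thread stays on a CPU */

    mb->received = mb->payload;
    return NULL;
}

/* Sends one value to a polling receiver thread and returns what it saw. */
int poll_roundtrip(int value)
{
    struct mailbox mb;
    pthread_t thread;
    int rc;

    atomic_init(&mb.ready, 0);
    mb.payload = 0;
    mb.received = -1;

    rc = pthread_create(&thread, NULL, receiver, &mb);
    assert(rc == 0);

    mb.payload = value;
    /* Release publishes the payload write before the flag becomes visible. */
    atomic_store_explicit(&mb.ready, 1, memory_order_release);

    pthread_join(thread, NULL);
    return mb.received;
}
```

The tradeoff is visible in the loop: the receiver burns a core while waiting, and if it gets descheduled the latency is back to whatever the scheduler gives you.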
This is a fair point, but dropping the kernel patch guarantees that essentially nobody will use this.
Given part of the patch was software emulation for unsupported CPUs, it's possible that people would've implemented support for it anyway, incentivising Intel to push it more broadly.
In general my hope is that runtimes pick up the good stuff & roll with it. io_uring hasn't exactly been a stunning success on nodejs/libuv, but the promise is so real: runtimes can take sweet IO capabilities like io_uring or userspace interrupts & boom, now everyone's IO is faster.
It seems to require the kernel to set the IA32_UINTR_TT MSR before it does anything, so at least you can't send random IPIs to other processes from userspace.
I wonder about the cost of supporting unused features like this in Intel chips. Will Intel keep including it forever even if no one adopts it, or drop it? (It is controlled by a CPUID feature flag, so dropping it is possible.)
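For reference, checking that flag from userspace could look something like the sketch below. The bit position is an assumption on my part (leaf 7, subleaf 0, EDX bit 5, as I understand Intel's documentation), so verify it against the SDM before relying on it. `uintr_bit_set` and `cpu_reports_uintr` are made-up helper names, and `__get_cpuid_count` is the GCC/Clang helper from `<cpuid.h>`.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Assumed position of the UINTR flag in CPUID.(EAX=7,ECX=0):EDX. */
#define UINTR_EDX_BIT 5u

/* Pure helper: tests the assumed UINTR bit in a raw EDX value. */
bool uintr_bit_set(uint32_t edx)
{
    return ((edx >> UINTR_EDX_BIT) & 1u) != 0;
}

/* Returns true if the CPU advertises user interrupts via CPUID. */
bool cpu_reports_uintr(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid_count returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    return uintr_bit_set(edx);
#else
    return false;  /* not an x86 CPU, so no CPUID at all */
#endif
}
```

Note that a set bit only says the hardware has the feature. The kernel still has to enable and configure it before userspace can do anything with it.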