while work:
    if (contenders > 1)
        { reduce_priority; pri_reduced = true }
    take_lock
    if (pri_reduced)
        { unreduce_priority; pri_reduced = false }
    do_work_quantum
    drop_lock
endwhile
(I mean empirically, although an educated guess would do).
Of course, if reducing priority is too fast, this likely doesn't help; alternatively it could be too slow and what you get back in system latency is taken away in lowered throughput. That's probably not OK if you don't need the system latency to be low.
I wonder whether reducing the priority (dramatically, even) of some of the original workload that exposes the problem would help, not just while racing for the lock but also while doing the actual work quanta. My thought is that your email-sending is higher-priority and would at least push some of the workload out of the way in reasonable time, giving you back some responsiveness.
I'd be surprised if Windows didn't offer a high-throughput/latency-tolerant QoS for threads.
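For concreteness, here is a rough Python sketch of one iteration of the loop proposed above. Everything named in it is an assumption for illustration: `ContendedLock`, `run_quantum`, and the `reduce_priority` / `restore_priority` callbacks are hypothetical, and the actual priority change is left to the caller because it is platform-specific. It also assumes "contenders" counts every thread that currently wants or holds the lock, including the caller.

```python
import threading


class ContendedLock:
    """Lock that tracks how many threads currently want or hold it (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count_guard = threading.Lock()  # protects the contender count
        self.contenders = 0

    def run_quantum(self, do_work_quantum, reduce_priority, restore_priority):
        """One loop iteration: back off if contended, take the lock, do one quantum."""
        with self._count_guard:
            self.contenders += 1
            contended = self.contenders > 1  # someone besides us is interested
        try:
            if contended:
                reduce_priority()      # caller-supplied, platform-specific
            with self._lock:
                if contended:
                    restore_priority() # back to normal once we own the lock
                do_work_quantum()
        finally:
            with self._count_guard:
                self.contenders -= 1
```

When nobody else is contending, the callbacks are never invoked, so the uncontended path costs only the extra counter update.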
Priorities don't help. They are only relevant if there are more runnable threads than CPUs. In my case I had lots of spare CPUs so both threads could run, regardless of priority.
A QoS does not directly help. The only thing I am aware of that can help is fair locks, or occasionally-fair locks, where the lock is handed directly to the waiting thread instead of being made available to everyone.
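To illustrate what "given directly to the waiting thread" means, here is a minimal ticket-lock sketch in Python. It is not how any Windows lock is implemented, and `TicketLock` and `demo` are names made up for this example; it only shows the handoff idea: a releasing thread cannot immediately re-take the lock ahead of a thread that was already waiting.

```python
import threading
import time


class TicketLock:
    """FIFO ("fair") lock: threads acquire in the order they asked (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0  # next ticket to hand out
        self._now_serving = 0  # ticket currently allowed to hold the lock

    def acquire(self):
        with self._cond:
            my_ticket = self._next_ticket
            self._next_ticket += 1
            while self._now_serving != my_ticket:
                self._cond.wait()

    def release(self):
        with self._cond:
            self._now_serving += 1   # hand the lock to the next ticket holder
            self._cond.notify_all()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()


def demo(n_waiters=4):
    """Queue waiters one by one behind a held lock; return the order they got it."""
    lock = TicketLock()
    order = []

    def worker(i):
        with lock:
            order.append(i)

    lock.acquire()  # main thread holds ticket 0
    threads = []
    for i in range(n_waiters):
        t = threading.Thread(target=worker, args=(i,))
        t.start()
        # Wait until waiter i has drawn its ticket before starting the next one.
        while lock._next_ticket < i + 2:
            time.sleep(0.001)
        threads.append(t)
    lock.release()
    for t in threads:
        t.join()
    return order


if __name__ == "__main__":
    print(demo())
```

Here `demo()` returns the waiters in arrival order. With an unfair lock, a thread that drops and immediately re-takes the lock in a tight loop can keep winning the race, which is the starvation pattern being discussed. The cost of strict fairness is usually throughput, which is presumably why "occasionally fair" is the compromise.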