This was many years ago and I don't remember the exact sources. It was more of a research project than a normal development project. I spent most of my time researching the techniques needed to write that kind of application.
The largest influences were definitely the LMAX articles and the Disruptor pattern.
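For readers unfamiliar with the Disruptor: its core idea is a preallocated ring buffer in which each sequence counter has exactly one writer. Below is a minimal single-producer/single-consumer sketch in C11. The names (`spsc_ring`, `ring_publish`, `ring_consume`), the 64-byte cache line and the fixed size are illustrative assumptions, not code from this project and not the LMAX API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE  1024u               /* assumed capacity; must be a power of two */
#define RING_MASK  (RING_SIZE - 1u)
#define CACHE_LINE 64                  /* assumed cache line size */

static_assert((RING_SIZE & RING_MASK) == 0, "RING_SIZE must be a power of two");

typedef struct {
    /* Each counter sits on its own cache line to avoid false sharing. */
    _Alignas(CACHE_LINE) atomic_uint_fast64_t head;  /* written only by the producer */
    _Alignas(CACHE_LINE) atomic_uint_fast64_t tail;  /* written only by the consumer */
    _Alignas(CACHE_LINE) uint64_t slots[RING_SIZE];  /* preallocated, never resized */
} spsc_ring;

static inline void ring_init(spsc_ring *r) {
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side: returns false if the ring is full. */
static inline bool ring_publish(spsc_ring *r, uint64_t value) {
    uint64_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint64_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;
    r->slots[head & RING_MASK] = value;
    /* Release store makes the slot contents visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
static inline bool ring_consume(spsc_ring *r, uint64_t *out) {
    uint64_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint64_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail == head)
        return false;
    *out = r->slots[tail & RING_MASK];
    /* Release store hands the slot back to the producer. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Neither side blocks or makes a syscall, so a producer and a consumer can each spin on their own core. The real Disruptor adds multiple consumers, batching and configurable wait strategies on top of this idea.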
SBCL was comparatively easy. Basically, if anything caused problems I just moved it to C and called it through the FFI. Think in terms of writing a C program but using pieces of assembly when C is not enough for some reason.
Even though a large number of small pieces were moved to C, the application still felt like Lisp. It just orchestrated a large library of utilities to do various things. I still had a fully functional REPL, for example.
The hard part of the project was full kernel bypass. After the initial setup, the application stopped talking to the Linux kernel except on one CPU core, which was devoted to running OS threads, some necessary applications and some non-performance-critical threads of the algotrading framework (like the REPL, persistence, etc).
Every other core was owned entirely by a single thread of the application and never context-switched after the initial setup.
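A minimal sketch of the pinning step, assuming Linux and glibc. `sched_setaffinity`, `sched_getaffinity` and the `CPU_*` macros are real glibc interfaces; the helper names are hypothetical, and the thread does not say how the original application did this. Pinning only restricts where this thread runs. Keeping everything else off the core is a separate step, typically done with something like the `isolcpus` boot parameter.

```c
#define _GNU_SOURCE   /* needed for sched_setaffinity and the CPU_* macros */
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on failure (errno is set by the kernel call). */
static inline int pin_current_thread_to_cpu(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Number of CPUs the calling thread is currently allowed to run on, or -1. */
static inline int allowed_cpu_count(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Lowest-numbered CPU the calling thread may run on, or -1. */
static inline int first_allowed_cpu(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

Each worker thread would call `pin_current_thread_to_cpu` once during setup, before entering its busy loop, and could check `allowed_cpu_count() == 1` afterwards as a sanity test.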
I hadn't appreciated that level of thread isolation was possible in SBCL, but of course it makes sense. Presumably you had some kind of instrumentation to let you know if a stray syscall had slipped in?
> Presumably you had some kind of instrumentation to let you know if a stray syscall had slipped in?
Don't know if you can call it instrumentation... I wrote an extremely hacky patch for the kernel to detect when any piece of the kernel was running on anything other than core 0 after a certain flag was set.
Remember, it is not just syscalls. Even something as simple as accessing memory can cause a switch to the kernel to resolve a TLB entry if you don't set up your memory correctly to prevent this from happening.
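One common way to keep ordinary memory accesses from trapping into the kernel is to fault every page in during setup and lock it in RAM. The sketch below assumes Linux; `mmap`, `MAP_POPULATE` and `mlock` are real interfaces, while `alloc_prefaulted` is a hypothetical helper and not necessarily what was done in this project.

```c
#define _GNU_SOURCE   /* for MAP_ANONYMOUS and MAP_POPULATE under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Allocate len bytes that are already backed by physical pages.
 * On return, *locked is 1 if the pages were also locked in RAM, 0 if not.
 * Returns NULL if the mapping itself failed. */
static inline void *alloc_prefaulted(size_t len, int *locked) {
    assert(len > 0 && locked != NULL);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    /* Write to every page now, so no first-touch page fault is left for the
     * hot path even if MAP_POPULATE did not populate everything. */
    memset(p, 0, len);
    /* Lock the pages so they cannot be swapped out later. This can fail when
     * RLIMIT_MEMLOCK is low; the buffer is still usable, just not locked. */
    *locked = (mlock(p, len) == 0);
    return p;
}
```

Calling `mlockall(MCL_CURRENT | MCL_FUTURE)` early in startup is a process-wide alternative to per-buffer `mlock`. Either way, all allocation happens before the latency-critical loop starts and nothing is allocated inside it.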
Yeah... I know. Now a bunch of people will come and explain how this could be done the right way. I just didn't care at the time to invest more time than necessary to get this right.
I assume that you used something like the isolcpus kernel parameter, then sched_setaffinity to put the process on the isolated CPU.
I'm very interested in the kernel bypass technique for talking to the hardware. I'm assuming it was the NIC's ring buffer. How was this achieved in userspace?