It's very testing- and production-focused. The biggest competitor is ROS, though some others are popping up now. Our first public release will be within a month; I'm excited.
In my experience shared memory is really hard to implement well and manage:
1. Unless you're using either fixed-size or specially allocated structures, you end up paying for serialization anyhow (zero copy is actually one copy).
2. There's no way to reference count the shared memory - if a reader crashes, it holds on to the memory it was reading. You can get around this with some form of watchdog process, or by other schemes with a side channel, but it's not "easy".
3. Similar to 2, if a writer crashes, it will leave behind junk in whatever filesystem you are using to hold the shared memory.
4. There's other separate questions around how to manage the shared memory segments you are using (one big ring buffer? a segment per message?), and how to communicate between processes that different segments are in use and that new messages are available for subscribers. Doable, but also not simple.
It's a tough pill to swallow - you're taking on a lot of complexity in exchange for that low latency. If you can, it's better to put things in the same process space - you can use smart pointers and a queue and go just as fast, with less complexity. Anything CUDA will want to be single-process anyway (ignoring CUDA IPC). The number of places where you (a) need ultra low latency, (b) need high bandwidth/message size, (c) can't put everything in the same process, (d) are using data structures suited to shared memory, and finally (e) are okay with taking on a bunch of complexity just isn't that high. (It's totally possible I'm missing a Linux feature that makes things easy, though.)
I plan on integrating iceoryx into a message passing framework I'm working on now (users will ask for SHM), but honestly either "shared pointers and a queue" or "TCP/UDS" are usually better fits.
> In my experience shared memory is really hard to implement well and manage:
I second that. It took us quite some time to get the architecture right. After all, iceoryx2 is the third incarnation of this piece of software, with elfepiff and me working on the last two.
> 1. Unless you're using either fixed sized or specially allocated structures, you end up paying for serialization anyhow (zero copy is actually one copy).
Indeed, we are using fixed-size structures with a bucket allocator. We have ideas on how to enable use with types that support custom allocators, and even with raw pointers, but that is just a crazy idea which might not pan out.
> 2. There's no way to reference count the shared memory - if a reader crashes, it holds on to the memory it was reading. You can get around this with some form of watchdog process, or by other schemes with a side channel, but it's not "easy".
>
> 3. Similar to 2, if a writer crashes, it will leave behind junk in whatever filesystem you are using to hold the shared memory.
Indeed, this is a complicated topic, and support from the OS would be appreciated. We found a few ways to make this feasible, though.
The origins of iceoryx are in automotive, where it is required to split functionality up into multiple processes. When one process goes down, the system can still operate in a degraded mode or just restart the faulty process. For this, one needs an efficient, low-latency solution; otherwise the CPU spends more time copying data than doing real work.
Of course there are issues like the producer mutating data after delivery, but there are solutions for this too. They will affect latency, but it should still be better than using e.g. Unix domain sockets.
Fun fact: for iceoryx1 we supported only 4 GB memory chunks, and some time ago someone asked if we could lift this limitation since he wanted to transfer a 92 GB large language model via shared memory.
Thanks for sharing here -- yeah, these are definitely huge issues that make shared memory hard -- the when-things-go-wrong case is quite hairy.
I wonder if it would work well as a sort of opt-in specialization? Start with TCP/UDS/STDIN/whatever, then maybe graduate to shared memory, and if anything goes wrong, report errors via the fallback?
I do agree it's rarely worth it (and same-machine UDS is probably good enough), but with an essentially 10x gain on the table, I'm quite surprised.
One thing I've also found that actually performed very well is ipc-channel[0]. I tried it because I wanted to see how something I might actually use would perform, and it was basically 1/10th the perf of shared memory.
The other thing is that a 10x improvement on basically nothing is quite small. Whatever time it takes for a message to be delivered is going to be dominated by actually consuming the message. If you have a great abstraction, cool - use it anyway, but it's probably not worth developing a shared memory library yourself.
```
(lldb) run
Process 14345 launched: '/anti-debug/swift/build/anti_debug' (arm64)
start pid = 14345
exit parent process for child pid = 14348
continue as child process pid = 14348
Process 14345 exited with status = 0 (0x00000000)
```
The main reason for using dlsym instead of calling fork directly is to make it harder for an ‘attacker’ to detect or set breakpoints on the fork function, thus obfuscating the anti-debugging mechanism. You have to do more checks before being able to understand why you cannot attach the debugger.
You may still think that the debugger would be able to catch the new child process, but apparently people have tried and the answer is no.
Using a service for this would kind of suck if you're _in_ the national park, due to the lack of cell service. The ML portion of this is probably still a bit harder than the GPS bit.
ML:
Grab yolov8
(Optional?) Fine tune it on bird pictures
Convert it to CoreML, do whatever iOS stuff is needed in Xcode to run it
GPS:
Get https://www.nps.gov/lib/npmap.js/4.0.0/examples/data/national-parks.geojson (TIL this exists! thanks federal govt!)
Stuff it into the app somehow
Get the coordinates from the OS
Use your favorite library for point+polygon intersection to decide if you're in a national park
Bonus: use distance from the polygon instead, to account for GPS inaccuracy, keeping in mind that lat and long have different scales.
...actually the ML one might be easier, nowadays. Now I kind of want to try this.
> Use your favorite library for point+polygon intersection to decide if you're in a national park
it's like 50ish lines of code in C, just iterating over the points (with the polygon represented by arrays of points). The algorithm is linear in the number of points.
Buying a retro game no longer gives money to the developers. It doesn’t support the development of further games, it mostly just puts money in the pockets of collectors and scalpers.
Except you don't have to hunt down the descendants of the grocery store, or whoever acquired the company that acquired them and sold it for parts decades later. With old games, the rights holders are scattered and even unknown until they assert a claim they believe they have.
It's not, though. I'm not buying a 30-year-old tomato the way I would a 30-year-old game. I'm going to keep buying a tomato every week to make my BLT, enabling the grocery store to continually give money to the farmer for more tomatoes. I'm not going to continually rebuy the same old game on a regular basis. Me buying tomatoes encourages the farmer to grow tomatoes next year; the same is not true for retro games.
One of these is a continual economic pipeline - tomatoes being exchanged for money and consumed. The other is speculation and collection. Please don't get hung up on the terminology; I'm not an economist.