
I pulled apart the innards of CRIU because I needed to be able to checkpoint and restore a process within a few microseconds.

The project ended up being a dead end, because it turned out that running my program in a QEMU whole-system VM and then fork()ing QEMU worked faster.
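For anyone curious what that looks like, here is a minimal sketch of the fork()-as-snapshot idea (the QEMU plumbing is omitted; run_speculatively() and guess_was_correct() are made-up stand-ins). fork() gives a copy-on-write duplicate of the whole process, so the child can run ahead and simply be killed if the speculation turns out to be wrong:

    /* Hedged sketch, not the real integration: fork() as a cheap snapshot. */
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Hypothetical stand-ins for the real guest workload and validation. */
    static void run_speculatively(void) { puts("child: executing on the guessed input"); }
    static int  guess_was_correct(void) { return 0; /* pretend the guess was wrong */ }

    int main(void) {
        pid_t child = fork();          /* copy-on-write "snapshot" of the whole process */
        if (child == 0) {
            run_speculatively();       /* the speculative copy runs ahead */
            _exit(0);
        }
        if (!guess_was_correct())      /* ground truth arrives later */
            kill(child, SIGKILL);      /* discard the speculation; nothing to roll back explicitly */
        waitpid(child, NULL, 0);       /* reap the child either way */
        return 0;
    }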




There is a QEMU fork used by the Nyx fuzzer that may be interesting to you: https://github.com/nyx-fuzz/QEMU-Nyx

Basically, for fuzzing purposes speed is paramount, so they made some changes to speed up snapshot restoring. I don't know the limitations, but since it is used to fuzz full operating systems, there shouldn't be many.

I believe it should be faster than forking; why even patch QEMU otherwise?


Could you tell me a bit more about what you're doing?


The goal was to have a web browser (Chromium) able to 'guess' what response it will get from the network (i.e. will the server return the same JavaScript blob as last time?). We start executing the JavaScript as if the guess is correct. If the guess is wrong, we revert to a snapshot.

It lets you make good use of CPU time whilst waiting for the network.

It turns out simple heuristics can get 99% accuracy on the question 'will the server return the same result as last time for this non-cacheable response?'.

However, since my machine has many CPU cores, it made sense to have many 'speculative' copies of the browser going at once.

A regular fork() call would have worked, if not for the fact that Chromium is multi-threaded and multi-process, and it's next to impossible to fork multiple processes as a group.
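If it helps to picture it, a rough sketch of the guess/validate step described above (all names are hypothetical, not Chromium APIs): speculation starts from the previously seen bytes, and when the real response lands a simple comparison decides commit vs rollback.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <string.h>

    struct blob { const unsigned char *data; size_t len; };

    /* Placeholders for the snapshot machinery discussed above. */
    static void commit_speculation(void)   { puts("keep the speculative work"); }
    static void rollback_to_snapshot(void) { puts("revert and re-execute with the real response"); }

    static bool responses_match(const struct blob *guessed, const struct blob *actual) {
        return guessed->len == actual->len &&
               memcmp(guessed->data, actual->data, actual->len) == 0;
    }

    /* Called once the real network response arrives. */
    static void on_response(const struct blob *guessed, const struct blob *actual) {
        if (responses_match(guessed, actual))
            commit_speculation();      /* the CPU time spent while waiting was useful */
        else
            rollback_to_snapshot();    /* the guess was wrong; throw the work away */
    }

    int main(void) {
        struct blob last = { (const unsigned char *)"alert(1)", 8 };
        struct blob now  = { (const unsigned char *)"alert(2)", 8 };
        on_response(&last, &now);      /* contents differ, so this rolls back */
        return 0;
    }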


Terrifying, I love it :) How was the performance in the end? Did you get a good speculation success rate?

It'd be cool to predict which resources are speculation-safe (i.e. the cache headers don't permit caching, but the content in practice doesn't change) and speculate on those, but not on ones which have repeatedly had a speculation abort (i.e. actually dynamic resources). If your predictor gets a high enough hit rate, you could probably do okay with just a single instance/no snapshot and use an expensive rollback mechanism (reload the whole page non-speculatively?).
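A toy version of that predictor might be nothing more than a per-URL abort counter (illustrative only; the threshold and names are made up): keep speculating on a resource until it has burned you a couple of times, then treat it as genuinely dynamic.

    #include <stdbool.h>
    #include <stdio.h>

    #define ABORT_LIMIT 2   /* made-up threshold: give up after two bad guesses */

    struct resource_stats {
        unsigned aborts;    /* how often speculation on this URL had to be rolled back */
    };

    static bool should_speculate(const struct resource_stats *s) {
        return s->aborts < ABORT_LIMIT;
    }

    static void record_outcome(struct resource_stats *s, bool guess_was_correct) {
        if (!guess_was_correct)
            s->aborts++;
    }

    int main(void) {
        struct resource_stats app_js = { 0 };
        record_outcome(&app_js, false);   /* one speculation abort... */
        record_outcome(&app_js, false);   /* ...and another */
        printf("speculate on app.js? %s\n",
               should_speculate(&app_js) ? "yes" : "no");   /* prints "no" */
        return 0;
    }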


Sorry if I'm being thick, but why not just cache the response?

If you are guessing at the data anyway, what's the difference?

Why set up an entire speculative execution engine / runtime snapshot rollback framework when it sounds like adding heuristic decision caching would solve this problem?


Sounds like they were caching it, since they could execute it before getting the response. The difference is that they wanted to avoid the situation where they execute stale code that the server never would have served. So they can execute the stale code while waiting for the response, then either toss the result or continue on with it once they determine whether the server's response changed.


How else will you discover new and exciting speculative execution vulnerabilities? /s


Couldn't you just change Chrome so that it forks the tabs and runs them in the background? That seems a lot easier.



