
Author here. Thanks for submitting my article to Hacker News. I'll do my best to answer any questions.



Is there a fundamental difference between copy-and-patch with C and what compilers do when they target intermediate representations? It seems to me that traditional compilation methods are also "copy and patch", just with an intermediate language other than C.


I think conceptually, there is no real difference. In the end, a compiler outputting machine code uses very small stencils, like "mov _ _", which are rather simple to patch.

Practically though, it's an enormous difference, as the copy-and-patch approach reuses the years of work that have gone into clang / gcc: platform support, per-platform optimizations, and so on. The approach enables a much larger pool of people ("people capable of writing C" vs "people capable of writing assembly / machine code") to implement very decent JIT compilers.
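To make the stencil idea concrete, here's a minimal sketch of what one can look like in C (hypothetical names, many details omitted): the stencil is compiled once, ahead of time, into an object file that is never linked, so the extern symbols show up as relocation entries, i.e. the "holes" the JIT patches after memcpying the code.

    /* Hypothetical stencil, compiled to an object file by gcc/clang
       but never linked: the unresolved extern symbols become
       relocations ("holes") the JIT fills in at run time. */
    extern char STENCIL_CONST;               /* hole: immediate operand */
    extern void STENCIL_NEXT_OP(long *sp);   /* hole: next stencil's address */

    void op_push_const(long *sp)
    {
        *sp = (long)&STENCIL_CONST;   /* relocation patched with the constant */
        STENCIL_NEXT_OP(sp + 1);      /* ideally a guaranteed tail call,
                                         patched to jump to the next op */
    }

The JIT itself then becomes little more than a loop that memcpys the right stencil for each opcode and fills in the holes.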


The real difference is in the possible optimizations. If you consider the full scope of JIT compilation in, for instance, a web browser or the JVM, you could use copy-and-patch as a tier-0 compiler and, once really hot paths are identified, trigger a complete compiler with all the optimizer passes. Some optimizations are more complicated to implement with copy-and-patch, especially if you can't use all the tricks described in the paper (for instance, they use the GHC calling convention (ghccc) to get much finer register allocation across stencils, but from the documentation I don't think that's going to make it into PostgreSQL).
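As a rough sketch of such tiering (hypothetical names; real runtimes like the JVM are far more elaborate): each function starts out with quickly generated copy-and-patch code plus a counter, and crossing a hotness threshold triggers the optimizing compiler.

    typedef long (*jit_fn)(void *ctx);

    /* Assumed helper: the slow, optimizing tier (e.g. LLVM-based). */
    extern jit_fn compile_optimized(void *ctx);

    struct jit_entry {
        jit_fn code;        /* currently installed code (tier 0 at first) */
        long   hit_count;   /* number of executions so far */
        int    optimized;   /* already tiered up? */
    };

    enum { HOT_THRESHOLD = 10000 };

    long jit_call(struct jit_entry *e, void *ctx)
    {
        if (!e->optimized && ++e->hit_count >= HOT_THRESHOLD) {
            e->code = compile_optimized(ctx);  /* swap in optimized code */
            e->optimized = 1;
        }
        return e->code(ctx);
    }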

But as you say, yes, this opens the door to people capable of writing C and reading assembly (unless you're perfect and never have to step through your compiled code in gdb), and it makes the job so much faster and easier... Writing a machine code emitter per architecture is painful, and maintaining the required optimization strategies for each ISA is quickly out of reach.


Thanks for the blog post, it's always nice to see performance improvements in Postgres! I'm curious: how much time does LLVM spend on some real queries, and how is LLVM configured (i.e., which passes, which back-end optimizations, etc.)? In our experience [1], LLVM can be reasonably fast when tuned for compile time, with no optimization passes and the -O0 back-end pipeline, but it is obviously still 10-20x slower than other approaches.

Also, in our experience, copy-and-patch-generated code tends to be quite slow and hard to optimize (we tried some things [2, Sec. 5], but there's still quite a gap; see Fig. 3 for a database evaluation). Do you have numbers on the run-time slowdown compared to LLVM? Any future plans for implementing multi-tiering, i.e., dynamically switching from the quickly compiled code to LLVM-optimized code?

[1]: https://home.in.tum.de/~engelke/pubs/2403-cgo.pdf

[2]: https://home.in.tum.de/~engelke/pubs/2403-cc.pdf


Is copy-and-patch really a new idea, or just a new name for an old idea?

When I learned programming (and interpreters in particular) around 2010, I thought it was well known that you could memcpy chunks of executable code your compiler produced, if you were careful ... The major gotcha was that the NX bit was just starting to take off at the time (even on Linux, most people still assumed 32-bit distros and might be surprised that their CPUs supported 64-bit at all; at some point I ended up with a netbook that didn't support 64-bit code at all ...).
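For context on the NX gotcha: once NX landed, you could no longer just memcpy code into heap memory and jump to it; the buffer has to be mapped writable, filled, and then flipped to executable. A minimal POSIX sketch (x86-64 bytes assumed):

    #include <string.h>
    #include <sys/mman.h>

    /* Copy machine code into a fresh mapping, then make it executable.
       Keeping W and X exclusive also satisfies strict W^X policies. */
    static void *install_code(const unsigned char *code, size_t len)
    {
        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
            return NULL;
        memcpy(buf, code, len);
        if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0)
            return NULL;
        return buf;
    }

    /* Example payload: x86-64 for "mov eax, 42; ret". */
    static const unsigned char ret42[] = {
        0xb8, 0x2a, 0x00, 0x00, 0x00,  /* mov eax, 42 */
        0xc3                           /* ret */
    };

    int main(void)
    {
        int (*fn)(void) = (int (*)(void))install_code(ret42, sizeof ret42);
        return fn ? fn() : 1;  /* returns 42 on success */
    }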

Unfortunately I ended up spending too much time on the rest of the code to actually look deeply enough into it to build something useful.


It is an old idea with a new name. For example, QEMU originally worked like this [1], and it already used relocations to patch in constants, before later moving to TCG for higher-quality code.

[1]: https://www.usenix.org/legacy/event/usenix05/tech/freenix/fu...


It would be a great topic for pgconf.eu in June (pgcon moved to Vancouver). Too bad the CfP is over, but there's the "unconference" part (though the topics are decided at the event, so no guarantees).


Did you mean pgconf.dev in May (which has the unconference), or pgconf.eu in October (which doesn't have an unconference, but the CfP will open sometime in the - hopefully near - future)?


Yeah, I meant May. Sorry :-( Too many conferences around that time, I got confused.

That being said, submitting this to the pgconf.eu CfP is a good idea too. It's just that it seems like a nice development topic, and the pgcon unconference was always a great place to discuss this sort of stuff. There are a couple more topics in the JIT area, so having a session or two to talk about those and how to move them forward would be beneficial.


Not a question, but I love this. I’m eager to see its evolution.


Nice post, thanks! Am I reading it right that using JIT results in the worst max times? What could be the reason, in your opinion?


Two reasons. First, I ran the benchmark on a laptop and didn't spend enough time pinning its runtime power management to a fixed state; I'll run a real pgbench on my desktop once I've implemented all the opcodes required for it. Second, JIT compilation has a minimum fixed cost (about 300us in my tests), and on such short runtimes that overhead can quickly outweigh the benefits: a query that executes in 1ms would need its compiled code to run roughly 30% faster just to break even.



