LtU has an old, but interesting thread. Can’t tell how much details changed from that time, but since 1Gb limit was removed from LJ recently, it is my The Only Dynamic Language to Consider.
Mike Pall indulges in some (well deserved) gloating far down in the thread:
LuaJIT also does: constant folding, constant propagation, copy propagation, algebraic simplifications, reassociation, common-subexpression elimination, alias analysis, load-forwarding, store-forwarding, dead-store elimination, store sinking, scalar replacement of aggregates, scalar-evolution analysis, narrowing, specialization, loop inversion, dead-code elimination, reverse-linear-scan register allocation with a blended cost-model, register hinting, register renaming, memory operand fusion.
Due to the nature of a trace compiler, it implicitly performs partial and interprocedural variants of all of them. And many traditional optimizations, like straightening or unreachable code elimination are unnecessary.
All of that in 120KB for the VM and 80KB for the JIT compiler. And I didn't need 15 years and a billion dollar budget for that, either.
I'm planning to add value-range propagation, array-bounds-check elimination, escape analysis, allocation sinking, if conversion, hyperblock scheduling and auto-vectorization. Anything I forgot? I'll see what I can do. :-)