Smalltalk JITs are pretty darn good; the Strongtalk project, for example, is known to be fast. Sun released the Strongtalk source code in 2006. Its publication record is relatively weak (aside from the type-system publications), but the code contains a wealth of relevant optimizations, and I am sure the interested reader/programmer will find something valuable in there. (http://www.strongtalk.org)
Eliot Miranda has been implementing Smalltalk VMs for quite a while, and I think his "Cog" VM for Squeak is probably the most recent addition to the family of JIT compilers for Smalltalk. Given his in-depth experience and expertise (particularly with inline caching), it could probably serve as a blueprint for other (Smalltalk) JITs.
V8 for JavaScript is supposedly very fast (interesting side information: Robert Griesemer is working on V8, but worked on the Strongtalk interpreter before), but I don't know which benchmarks are involved or how the engines stack up against each other, particularly since trace-based JITs such as TraceMonkey came along.
Mike Pall's LuaJIT is a very interesting project, too (the only one-man JIT project I know of).
Milepost GCC is interesting. Regehr does not believe in it, though: "machine learning in compilers will not end up being fundamental."
His argument is flawed, though. He says "I could always get a good optimization result by running all of my optimization passes until a fixpoint is reached," but unfortunately there is no such fixpoint. Many optimizations reverse each other (e.g. loop fusion vs. loop fission) or just arbitrarily choose some normalization (e.g. 2*x vs. x+x vs. x<<1).
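To make the "no fixpoint" point concrete, here is a minimal sketch in Python. The two passes, the tuple encoding of expressions, and the `run_until_fixpoint` driver are all hypothetical and invented for illustration; they are not taken from any real compiler. Each pass is a reasonable normalization on its own, but together they undo each other forever.

```python
# Expressions are leaves (names or ints) or 3-tuples (op, left, right).
# Both passes below are hypothetical, made up for this illustration.

def strength_reduce(e):
    """Pass A: rewrite 2*x into x+x."""
    if isinstance(e, tuple):
        op, a, b = e
        a, b = strength_reduce(a), strength_reduce(b)
        if op == "mul" and a == 2:
            return ("add", b, b)
        return (op, a, b)
    return e

def canonicalize(e):
    """Pass B: rewrite x+x into 2*x."""
    if isinstance(e, tuple):
        op, a, b = e
        a, b = canonicalize(a), canonicalize(b)
        if op == "add" and a == b:
            return ("mul", 2, a)
        return (op, a, b)
    return e

def run_until_fixpoint(e, passes, max_rounds=10):
    """Run all passes round after round.

    Returns (expr, True) once a whole round changes nothing,
    or (expr, False) if max_rounds is exhausted first.
    """
    for _ in range(max_rounds):
        changed = False
        for p in passes:
            new = p(e)
            if new != e:
                changed = True
                e = new
        if not changed:
            return e, True
    return e, False

if __name__ == "__main__":
    start = ("mul", 2, "x")
    print(run_until_fixpoint(start, [strength_reduce]))
    print(run_until_fixpoint(start, [strength_reduce, canonicalize]))
```

With only pass A the driver settles after one round. With both passes every round changes the expression twice and ends where it started, so the round limit is the only thing that stops it.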
You can build a superoptimizer that constructs all variations (e.g. equality saturation, http://portal.acm.org/citation.cfm?doid=1480881.1480915), but this is not a fixpoint search; it is an optimization problem of choosing the least-cost alternative. You cannot construct all variations anyway. For a simple example, consider unrolling an infinite loop.
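As a rough illustration of "construct the variations, then pick the cheapest", here is a small Python sketch. It is not equality saturation proper (which shares structure in an e-graph); it is a bounded brute-force enumeration with rewrites applied only at the root of the expression. The rewrite rules, the `COST` table, and the `limit` cutoff are assumptions made up for this example.

```python
# Assumed per-operation costs, for illustration only.
COST = {"mul": 3, "add": 1, "shl": 1}

def rewrites(e):
    """Yield expressions equal to e via one rewrite at the root."""
    if isinstance(e, tuple):
        op, a, b = e
        if op == "mul" and a == 2:
            yield ("add", b, b)   # 2*x == x+x
            yield ("shl", b, 1)   # 2*x == x<<1
        if op == "add" and a == b:
            yield ("mul", 2, a)   # x+x == 2*x
        if op == "shl" and b == 1:
            yield ("mul", 2, a)   # x<<1 == 2*x

def equivalents(e, limit=100):
    """Collect forms reachable from e, stopping once about `limit` are seen.

    The cutoff matters: in general the set of variations can be
    unbounded, so the search has to be truncated somewhere.
    """
    seen = {e}
    work = [e]
    while work and len(seen) < limit:
        cur = work.pop()
        for n in rewrites(cur):
            if n not in seen:
                seen.add(n)
                work.append(n)
    return seen

def cost(e):
    """Sum of operation costs; leaves are free."""
    if isinstance(e, tuple):
        op, a, b = e
        return COST[op] + cost(a) + cost(b)
    return 0

def best(e):
    """Pick a least-cost equivalent; ties are broken arbitrarily by repr."""
    return min(equivalents(e), key=lambda c: (cost(c), repr(c)))

if __name__ == "__main__":
    print(sorted(equivalents(("mul", 2, "x")), key=repr))
    print(best(("mul", 2, "x")))
```

Note that the rules here are exactly the kind that loop forever in a run-until-fixpoint driver; in this formulation they are harmless because the result is chosen by cost, not by waiting for the rewriting to stop.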
Hence, unlike Regehr I would not devalue machine learning. I would not bet on it either, though. ;)
In SSA form, most operations (add, sub, ...) are side-effect free, and the rest (load, store, call, ...) can be understood as using something like a "Memory Monad".
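One way to read the "Memory Monad" remark is as state passing: every memory operation takes the current memory as an explicit operand, and a store returns a new memory version. The Python sketch below is only my interpretation of that idea; the names `load`, `store`, and `mem0`..`mem2` are invented for illustration and do not come from any particular IR.

```python
from types import MappingProxyType

def add(a, b):
    """Pure operation: the result depends only on the operands."""
    return a + b

def store(mem, addr, value):
    """Return a new memory version; the old version is left untouched."""
    new = dict(mem)
    new[addr] = value
    return MappingProxyType(new)

def load(mem, addr):
    """Read from one specific memory version."""
    return mem[addr]

# An SSA-like sequence: every value, including each memory version,
# is assigned exactly once.
mem0 = MappingProxyType({})
mem1 = store(mem0, 0x10, 5)
v1 = load(mem1, 0x10)
v2 = add(v1, v1)
mem2 = store(mem1, 0x10, v2)
v3 = load(mem2, 0x10)

if __name__ == "__main__":
    print(v1, v2, v3)
```

Because the memory versions are ordinary values, the ordering constraints between loads and stores show up as plain data dependencies (`v3` needs `mem2`, which needs `v2`), while `add` stays free to be moved or merged like any other pure operation.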