Isn't Rete a forward-chaining algorithm, while Prolog is a backward-chaining language? I'm not sure you could use Rete for actual Prolog semantics. Fast Prolog implementations today compile to a virtual machine, the most popular being the WAM (Warren Abstract Machine), and that code can then be compiled to machine code (GNU Prolog does this). Some implementations also have JIT indexing, which improves performance further.
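To make the forward/backward distinction concrete, here is a rough Python sketch over a toy set of propositional rules. Everything in it (the `RULES` and `FACTS` data, the function names, the `max_depth` guard) is made up for illustration. It is neither Rete nor a Prolog engine: real Rete caches partial matches in a network instead of rescanning the rules, and real Prolog adds unification and backtracking over terms with variables.

```python
# Toy propositional Horn rules: (head, [body atoms]). Purely illustrative data.
RULES = [
    ("mortal", ["human"]),
    ("human", ["greek"]),
    ("philosopher", ["greek", "thinks"]),
]
FACTS = {"greek"}


def forward_chain(facts, rules):
    """Data-driven: start from the facts and fire rules until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in known and all(atom in known for atom in body):
                known.add(head)
                changed = True
    return known


def backward_chain(goal, facts, rules, depth=0, max_depth=50):
    """Goal-driven: prove `goal` by proving the body of some rule that concludes it."""
    if goal in facts:
        return True
    if depth >= max_depth:  # crude guard against looping on recursive rules
        return False
    for head, body in rules:
        if head == goal and all(
            backward_chain(atom, facts, rules, depth + 1, max_depth) for atom in body
        ):
            return True
    return False


if __name__ == "__main__":
    print(sorted(forward_chain(FACTS, RULES)))
    print(backward_chain("mortal", FACTS, RULES))
    print(backward_chain("philosopher", FACTS, RULES))
```

The forward version derives everything that follows from the facts whether or not anyone asked for it, which is the style Rete optimizes. The backward version only explores rules relevant to the query, which is closer to how a Prolog goal is resolved.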
The Rete algorithm was supposed to be a solution, but has anyone actually applied Rete to a Prolog implementation yet?