Thanks for the elaborate reply! Our benchmark input data contains customer data ...

Thanks for the elaborate reply! Our benchmark input data contains customer data which we cannot share, but we could share the benchmark program.

The DFA approach is something that I had not heard about before (I never looked into what "full" meant), but we should definitely try it; we reuse the automaton against possibly millions of haystacks, so extra memory or preprocessing time is likely worth it. Thanks for pointing this out!

An adaptive approach would indeed work better for fewer needles, but the threshold depends a lot on the data. We compared Alfred–Margaret against ICU regexes for case-insensitive replaces, and we found that even for a single needle, ours was faster in half of the cases. (It is a bit unfair because of FFI call overhead.)