I don't see it listed in the notes, but using the --sort-files flag no longer se...

burntsushi · on June 14, 2021

It still disables parallelism. No changes there. If you have an easily reproducible benchmark on a public corpus, I could try to provide an analysis. (Note: if the corpus is small enough, it might not be possible to witness a perf difference. Or it may even be possible that using parallelism is slower due to the overhead of spinning up threads.)

piinbinary · on June 14, 2021

Interesting!

Perhaps the difference is then explained by my choice of search term. The term I tried after upgrading must happen to appear early in the sorted corpus.

I just now tried it with a very rare term, and it does indeed take longer overall to complete the search.

e12e · on June 14, 2021

Any idea how this works compared to e.g. GNU sort --parallel (or clever tricks with partial sort and merge)?

I'm guessing rg can be faster in general - because of less memory allocation/copying by sorting before outputting?

burntsushi · on June 14, 2021

I'm not familiar with GNU sort's --parallel flag.

That --sort-files disables parallelism is not really a theoretical limitation.
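
To illustrate why it isn't a theoretical limitation, here is a rough sketch in Python (not how ripgrep works internally; `search_file` and `sorted_parallel_search` are hypothetical names made up for this example). It assumes a plain substring match and that buffering all results in memory is acceptable. The paths are sorted up front, the searches run concurrently, and the results come back in the sorted input order:

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def search_file(path, needle):
    """Return (path, hits), where hits is a list of (line_number, line)."""
    hits = []
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        for number, line in enumerate(f, start=1):
            if needle in line:
                hits.append((number, line.rstrip("\n")))
    return path, hits


def sorted_parallel_search(root, needle, workers=4):
    """Search every file under root concurrently, returning results sorted by path."""
    # Sort the paths before dispatching any work.
    paths = sorted(p for p in Path(root).rglob("*") if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, regardless of which
        # worker finishes first, so the output order stays deterministic.
        results = list(pool.map(lambda p: search_file(p, needle), paths))
    # Keep only files that had at least one match.
    return [(path, hits) for path, hits in results if hits]
```

The cost is that results for later files have to be held until the earlier ones are done, so memory use and time-to-first-output can both get worse than in an unsorted parallel search. This sketch only demonstrates the ordering idea; Python threads are not a good model for actual search throughput.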

See: https://github.com/BurntSushi/ripgrep/issues/152