
I don't see it listed in the notes, but using the --sort-files flag no longer seems to have a performance penalty.



It still disables parallelism. No changes there. If you have an easily reproducible benchmark on a public corpus, I could try to provide an analysis. (Note: if the corpus is small enough, it might not be possible to witness a perf difference. It may even be that using parallelism is slower, due to the overhead of spinning up threads.)


Interesting!

Perhaps the difference is then explained by my choice of search term. The term I tried after upgrading must happen to appear early in the sorted corpus.

I just now tried it with a very rare term, and it does indeed take longer overall to complete the search.


Any idea how this works compared to, e.g., GNU sort --parallel (or clever tricks with partial sort and merge)?

I'm guessing rg can be faster in general - because of less memory allocation/copying by sorting before outputting?


I'm not familiar with GNU sort's --parallel flag.

That --sort-files disables parallelism is not really a theoretical limitation.

See: https://github.com/BurntSushi/ripgrep/issues/152
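
To illustrate why this need not be a theoretical limitation, here is a minimal sketch of one way to search in parallel and still emit results in sorted path order. It is not how ripgrep works. The function names (`search_file`, `sorted_parallel_search`) and the in-memory `corpus` dict are hypothetical, chosen only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def search_file(path, text, needle):
    """Return (path, matching lines) for one in-memory 'file' (hypothetical helper)."""
    return path, [line for line in text.splitlines() if needle in line]


def sorted_parallel_search(corpus, needle, workers=4):
    """Search every file concurrently, then report matches ordered by path."""
    paths = sorted(corpus)  # fix the output order up front
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the output is
        # sorted by path even though the searches may finish in any order.
        results = pool.map(lambda p: search_file(p, corpus[p], needle), paths)
        return [(path, hits) for path, hits in results if hits]


# Assumed toy corpus: file name -> file contents.
corpus = {
    "b.txt": "foo\nbar needle\n",
    "a.txt": "needle here\nnothing\n",
    "c.txt": "no match\n",
}
matches = sorted_parallel_search(corpus, "needle")
for path, hits in matches:
    for line in hits:
        print(f"{path}:{line}")
```

The trade-off is that results for later paths have to be buffered until every earlier path has been reported, which costs memory and can delay output. The sketch only demonstrates the ordering, not a speedup: Python threads will not run this CPU-bound matching in parallel.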



