
You might be thinking of -F/--fixed-strings. -s is silent (long option --no-messages). For GNU grep 2.12, anyway. Or you might be thinking of BSD grep:

http://lists.freebsd.org/pipermail/freebsd-current/2010-Augu...
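To make the difference between the two flags concrete, here is a small sketch. The sample input and the /nonexistent/file path are made up for illustration; -F, -s and -c are standard grep options.

```shell
# -F treats the pattern as a literal string, so '.' matches only a real dot.
fixed=$(printf 'a.b\naxb\n' | grep -F -c 'a.b')

# Without -F the '.' is a regex wildcard, so both lines match.
regex=$(printf 'a.b\naxb\n' | grep -c 'a.b')

echo "fixed=$fixed regex=$regex"

# -s does not change matching at all: it only suppresses the error
# message about a missing or unreadable file. The exit status still
# reports the failure.
status=0
grep -s 'pattern' /nonexistent/file || status=$?
echo "grep -s exit status: $status"
```

So -s affects only what grep reports about files it cannot read, while -F changes how the pattern itself is interpreted.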

edit: Eg, with warm cache:

    :~/tmp/riak/riak-2.0.0pre5/deps $ time (find . -type f -exec cat '{}' \; |wc -l)
    2765699

    real    0m5.021s
    user    0m0.144s
    sys     0m0.792s
    :~/tmp/riak/riak-2.0.0pre5/deps $ time (find . -type f -exec cat '{}' \; |grep -E 'Some  pattern' -v -c)
    2765700

    real    0m5.133s
    user    0m0.264s
    sys     0m0.852s
    :~/tmp/riak/riak-2.0.0pre5/deps $ time (find . -type f -exec cat '{}' \; |grep -E 'Some..pattern' -v -c)
    2765700

    real    0m5.144s
    user    0m0.400s
    sys     0m0.768s

    # "%% " used for leading comment lines in some of this code:
    :~/tmp/riak/riak-2.0.0pre5/deps $ time (find . -type f -exec cat '{}' \; |grep -E '^%% ' -c)
    27535

    real    0m5.597s
    user    0m0.520s
    sys     0m0.788s

    :~/tmp/riak/riak-2.0.0pre5/deps $ du -hcs .
    405M    .
    405M    total

    :~/tmp/riak/riak-2.0.0pre5/deps $ time (find . -type f -exec cat '{}' \; |ag '^%% ' >/dev/null)

    real    0m5.735s
    user    0m1.480s
    sys     0m0.876s
   
    # actually find/cat is pretty slow -- I guess both GNU grep and ag
    # use mmap to good effect:

    $ time rgrep '^%% ' . > /dev/null

    real    0m0.539s
    user    0m0.404s
    sys     0m0.128s
    
    :~/tmp/riak/riak-2.0.0pre5/deps $ time ag '^%% ' . |wc -l
    27500

    real    0m0.252s
    user    0m0.284s
    sys     0m0.068s

    :~/tmp/riak/riak-2.0.0pre5/deps $ time rgrep -E '^%% ' . |wc -l
    27553

    real    0m0.535s
    user    0m0.396s
    sys     0m0.140s

Note that grep clearly goes looking in more files here (more matching lines). Still, I guess ag is indeed faster than grep in some cases (even if it might not be apples to apples depending on how you count -- of course the whole point of ag is to help search just the right files).

    :~/tmp/riak/riak-2.0.0pre5/deps $ time rgrep -E 'Some  pattern' . |wc -l
    0

    real    0m0.266s
    user    0m0.128s
    sys     0m0.132s
    :~/tmp/riak/riak-2.0.0pre5/deps $ time rgrep -E 'Some..pattern' . |wc -l
    0

    real    0m0.338s
    user    0m0.212s
    sys     0m0.120s
    :~/tmp/riak/riak-2.0.0pre5/deps $ time ag 'Some..pattern' . |wc -l
    0

    real    0m0.111s
    user    0m0.100s
    sys     0m0.076s
I guess ag is indeed faster, even if it might not be due to fixed string search...

[edit2: For those wondering, that's an (old) SSD, on an old machine -- but with ~4G RAM the working set should fit, as soon as some of my open tabs in ff are paged to disk...]




Thanks for benchmarking ag against grep. You're right that it's not exactly apples to apples. Ag doesn't search as many files, but it does parse and match against rules in .agignore, .gitignore, and .hgignore. Also, ag prints line numbers by default, which can be an expensive operation on larger files.

I think most of the slowdown you're seeing with "find -exec | cat" is forking at least two processes (ag and cat) for each file. Also, each process has to be run sequentially (to prevent garbled output), which makes use of only one CPU core most of the time. I've tried to keep ag's startup time fast so that classic find-style commands still run quickly. (This is why ag doesn't support a ~/.agrc or similar.)
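One way to see the per-file process cost is to compare the two terminators that find accepts for -exec. This is only a sketch on a made-up scratch directory (the file names are hypothetical), not a rerun of the benchmark above:

```shell
# Hypothetical scratch directory with a few small files, for illustration only.
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/a.txt"
printf 'three\n' > "$dir/b.txt"

# '\;' runs one cat process per file found.
per_file=$(find "$dir" -type f -exec cat '{}' \; | wc -l | tr -d ' ')

# '+' passes as many file names as fit to each cat invocation,
# so far fewer processes are started on a large tree.
batched=$(find "$dir" -type f -exec cat '{}' + | wc -l | tr -d ' ')

echo "per_file=$per_file batched=$batched"
rm -rf "$dir"
```

Both forms produce the same stream of lines; only the number of cat processes differs, which is where the difference should show up under time on a tree with many files.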

Just FYI, you can use ag --stat to see how many files/bytes were searched, how long it took, etc. I think I'll add some stats about obeying ignore rules, since some of those can be outright pathological in terms of runtime cost. In many cases, ag spends more time figuring out what to search than actually searching.


I tried to gauge the cpu usage (just looking at the percentage as listed in xmobar) -- but both grep and ag are too fast on the ~400mb set of files for that to work... As I have two cores on this machine, the difference between ag and rgrep could indeed be ag's use of threads.

Many thanks for not just writing and sharing ag as free software, but for the nice articles describing the design and optimizations!

At least this brief benchmarking run convinced me that I should probably try to integrate ag in my work flow :-)


Quickly reviewing some of the posts on the ag blog/page[1], I'm guessing the speedup is mainly from a custom dir scanning algorithm and possibly from running two threads.

In the course of checking out ag (again) I also learned about gnu id-utils[2].

[1] http://geoff.greer.fm/ag/ [2] http://www.delorie.com/gnu/docs/id-utils/id-utils_1.html


This was very confusing until I realized HN had invisible code boxes with a fixed width.


There should be a scrollbar on the bottom (I kept the commands on one line, rather than splitting with "\"). Might not be on mobile, though? In other words, the code-boxes should have overflow:scroll or something to that effect.


I think Chrome on OSX hides scroll bars by default unless you're scrolling. Regardless, the box is tall enough that it doesn't fit in my viewport so I wouldn't see the bottom scrollbar anyway.



