Hacker News new | past | comments | ask | show | jobs | submit login

I did some benchmarking for a dictionary app a while back, and the fastest I could get was strstr(), if I recall correctly. It substantially beat std::string::find(). Using a loop is bound to give terrible performance; for large data you should be using one of the string-search algorithms (which strstr() surely does). And given the very large search string, those string-search algorithms should be able to skip large chunks at a time.

Here's a small version that allocates 200 MB of uninitialized memory.

  #include <stdlib.h>
  #include <string.h>
  int main(int argc, char *argv[])
  {
	int size = 200 * 1024 * 1024;
	char *bytes = malloc(size);
	bytes[size - 1] = '\0';
	strstr(bytes, "788bbd6b12d79a4ee18e36a8a264fb496465c71b7932a965ed731397b7c97d14");
	return 0;
  }
  
  > gcc test.c
  > time ./a.out
  real	0m0.006s
  user	0m0.002s
  sys	0m0.002s
(Since the author ignores time to read off disk, I figure uninitialized memory should be comparable. The chances of strstr() actually finding anything are pretty slim, but I ran it a number of times just to be sure, with the same results.)

This is on a 2012 MacBook Pro.




Note that strstr will stop as soon as you have a 0 byte so this is not a fair comparison.

Uninitialized (practically often zeroed) also produces different results from random and explicitly zeroed, since I assume it affects how often it can fail fast.

    #define _GNU_SOURCE
    #include <stdlib.h> 
    #include <stdio.h> 
    #include <string.h> 
    #include <time.h> 

    int main(int nargs, char ** args) 
    {
      struct timespec start; 
      struct timespec end; 
      int mode = atoi(args[1]); 
      const char * needle = "788bbd6b12d79a4ee18e36a8a264fb496465c71b7932a965ed731397b7c97d14";  
      const int N = 200*1024*1024 ;
      char * haystack = malloc(N + sizeof(needle)); 
      strcpy(haystack+N,needle); 

      printf("Mode is %s\n" , mode == 0 ? "zeroed" : 
                              mode == 1 ? "random"        : 
                              "uninitialized"); 

      if (mode == 0) 
      {
        memset(haystack,0,N); 
      }
    
      if (mode == 1) 
      {
        FILE * urandom = fopen("/dev/urandom","r"); 
        int nread = fread(haystack,1,N,urandom); 
        fclose(urandom); 
        if (nread < N) 
        {
          fprintf(stderr,"read returned %d\n",nread); 
        }
      }
    
      clock_gettime(CLOCK_REALTIME, &start); 
      void * found = memmem(haystack,N+sizeof(needle),needle,sizeof(needle)); 
      clock_gettime(CLOCK_REALTIME, &end); 
    
      printf("Took %g secs to find string at offset of %llu\n", end.tv_sec - start.tv_sec + 1e-9*(end.tv_nsec -    start.tv_nsec), found-(void*)haystack); 
      return 0; 
    }



    $ gcc test_find.c -o test_find
    $ ./test_find 0
      Mode is zeroed
      Took 0.0147023 secs to find string at offset of 209715200
    $ ./test_find 1
      Mode is random
      Took 0.129951 secs to find string at offset of 209715200
    $ ./test_find 2
      Mode is uninitialized
      Took 0.0371922 secs to find string at offset of 209715200


Note that if you malloc that big of a block, it's going to come directly from a MAP_ANONYMOUS mmap, and so it will be zero initialized!

Your uninitialized mode is slower than zeroed, but nowhere near as slow as random.

The difference between initialized and uninitialized may be due to page faults: in the uninitialized case, the search is touching the memory for the first time. Maybe we're seeing TLB misses?


Yeah I was expecting the uninitialized to be the same as the zeroed case for that reason, but was surprised to see a (slightly) different result.

Edit: you are almost certainly right. I added a getrusage call before and after, and we have ~50k soft page faults in the uninitialized case and ~2 in the zeroed case during the memmem.


Reading uninitialized allocation triggers undefined behavior. The compiler is free to do whatever it pleases. Don't do that.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: