
This is good advice for a novice, but I have to say from my experience I pretty vehemently disagree.

I suppose there's a sort of spectrum of programmer personality types that may help to explain the disagreement. Please forgive me if I am prone to hyperbole in my description.

One type of programmer says "never rewrite a function if it exists in a library". They do this in part because they are not confident in their own abilities (see disclaimer about hyperbole), and generally trust code if it's written by somebody else.

Another type is more of a cowboy. The cowboy wants to write everything from scratch, for the opposite reasons - because they don't trust code not written by them.

Someone who reliably adheres to only one of these extremes at all times is probably not a good person to work with. However, both of these motivations do have some basis in fact and practicality. On the one hand, use of libraries gets the job done quickly and can save you some headaches. On the other hand, there are some pretty bad to horrible libraries out there, many of them quite popular; introducing dependency on the library can add considerable complexity, code size, and weaken your own "whole-program" understanding of what's going on. I'd say that being distrustful of dubious library dependencies is a good thing.

In practice, even your libc is not necessarily going to be as well maintained as you suggest. The people who maintain such libraries might not have the motivation to do the sort of periodic performance tweaks and updating for the current era that you ascribe to them. They might have a working implementation and decide to stick with it as long as possible. (In my experience this is more common than your rosy picture.) That's fine for them but it doesn't mean the library will work for you, or help you meet your performance goals. One problem with your side of the spectrum is that the people who write the libraries aren't always perfect or even good authorities. You have to know when to let go of your trust in them.

And especially, if I really need something to be faster and it's something that I can spend a few hours on, benchmark, and either decide to take or maybe just revert and go back to the library... Well then yes, I'm going to take a crack at it. One can always revert it later.
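To be concrete about the "benchmark and revert" part: a throwaway timing harness is usually enough to make the call. This is only a minimal sketch of that workflow -- my_compare() is a hypothetical hand-rolled replacement, and the buffer size and iteration count are made up for illustration:

  #include <stdio.h>
  #include <string.h>
  #include <time.h>

  /* Hypothetical hand-rolled replacement being evaluated against memcmp(). */
  static int my_compare(const void *a, const void *b, size_t n)
  {
      const unsigned char *p = a, *q = b;
      while (n--) {
          if (*p != *q)
              return *p - *q;
          p++, q++;
      }
      return 0;
  }

  int main(int argc, char **argv)
  {
      unsigned char buf1[64], buf2[64];
      volatile int sink = 0;                /* volatile so the loops can't be elided */
      const long iters = 10 * 1000 * 1000;

      (void)argv;
      memset(buf1, argc, sizeof buf1);      /* runtime-dependent contents, so nothing folds at compile time */
      memset(buf2, argc, sizeof buf2);

      clock_t t0 = clock();
      for (long i = 0; i < iters; i++)
          sink += memcmp(buf1, buf2, 8);
      clock_t t1 = clock();
      for (long i = 0; i < iters; i++)
          sink += my_compare(buf1, buf2, 8);
      clock_t t2 = clock();

      printf("memcmp:     %.3fs\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
      printf("my_compare: %.3fs\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
      return 0;
  }

If the hand-rolled version doesn't clearly win on the inputs that actually matter, reverting costs nothing.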




Except we're not talking about any old library function here. We're talking about standard library functions such as memcpy, etc. Those functions are optimised specifically for different hardware platforms on every major operating system.

For example, Solaris, Linux, and Windows all feature versions of memcpy that are specifically optimised for different hardware platforms. Intel in particular supplies these optimisations for a number of operating systems directly to the OS vendors.

My advice only applies to the standard library (really, just libc). It's not intended to be, nor is it applicable to, anything else for the purposes of this discussion.

For example, in Solaris alone (an older version of OpenSolaris, actually, but it gets the point across), there are seven easily found versions of memcmp:

  http://src.opensolaris.org/source/search?q=&project=onnv&defs=memcmp&refs=&path=libc&hist=
Six of those versions are written in assembler specifically for a particular hardware platform. One is a generic C version. Furthermore, as an example, the amd64 version of memcmp alone has optimisations for at least 14 different cases. Everything from 3DNow! optimisations to optimisations based on data size and alignment.
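For comparison, the generic C version is more or less the textbook byte-at-a-time loop -- roughly the shape below, paraphrasing rather than quoting the actual OpenSolaris source -- while the per-platform assembler versions that replace it are where all the size and alignment special-casing lives:

  #include <stddef.h>

  /* Roughly what a generic C fallback memcmp looks like; platform-specific
     assembler implementations are substituted for this on each target. */
  int generic_memcmp(const void *s1, const void *s2, size_t n)
  {
      const unsigned char *p1 = s1, *p2 = s2;
      while (n-- != 0) {
          if (*p1 != *p2)
              return *p1 - *p2;
          p1++, p2++;
      }
      return 0;
  }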

Also, as for your concerns that OS vendors don't update the standard library functions as often as you think -- you're wrong. I know for certain that Linux, Windows, and Solaris all receive continual updates for new hardware platforms for their standard library. There's a reason you hear these vendors constantly talking about their SPEC benchmark numbers or the like.

So again, as far as standard libraries are concerned -- don't do it; it's not worth it.


While I agree with the general sentiment of your post, I remember that in the case of memcpy, at least the GCC implementation is a "most general case" implementation which tries to have good performance in average usage.

I remember reading about this some time ago when I was doing some tinkering. I read, for example, that John Carmack decided to implement his own memcpy tailored to the games they programmed. Additionally, I read about different alternative implementations of memcpy which were better than the standard one for specific domains. I even experimented with using Microsoft Detours to replace the vanilla memcpy with a faster version.
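To illustrate the kind of thing I mean: a copy routine tailored to a domain where you can guarantee, say, 16-byte-aligned buffers whose sizes are multiples of 16 can skip all the dispatch a general-purpose memcpy has to do. The sketch below is my own illustration of that idea, not anything from id Software or the Detours experiments:

  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical domain-specific copy: the caller guarantees both pointers
     are 16-byte aligned and n is a multiple of 16, so there is no need for
     the alignment/size dispatch a general-purpose memcpy performs, and the
     compiler is free to vectorise the plain loop. */
  static void copy_aligned16(void *restrict dst, const void *restrict src, size_t n)
  {
      uint64_t *restrict d = dst;
      const uint64_t *restrict s = src;
      for (size_t i = 0; i < n / 8; i += 2) {
          d[i]     = s[i];
          d[i + 1] = s[i + 1];
      }
  }

Whether something like that actually beats the system memcpy is, again, something you can only find out by benchmarking on the hardware you care about.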


> Six of those versions are written in assembler specifically for a particular hardware platform. One is a generic C version. Furthermore, as an example, the amd64 version of memcmp alone has optimisations for at least 14 different cases. Everything from 3DNow! optimisations to optimisations based on data size and alignment.

None of those optimizations matter in this case, because the average query string parameter name is only a few characters long, and an inlined naive loop with compiler optimizations will be faster.
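As a sketch of what I mean -- this is illustrative, not anyone's production code -- a short comparison like the one below gets inlined and unrolled by the compiler, so for names of a handful of bytes the call and dispatch overhead of a general-purpose memcmp can outweigh the comparison itself:

  #include <stddef.h>

  /* Naive comparison for short keys; with optimisation enabled the compiler
     inlines and unrolls this, avoiding the function-call and case-dispatch
     overhead of a general-purpose memcmp for 3-8 byte parameter names. */
  static inline int short_key_eq(const char *a, const char *b, size_t n)
  {
      for (size_t i = 0; i < n; i++)
          if (a[i] != b[i])
              return 0;
      return 1;
  }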


> standard library functions as often as you think -- you're wrong. I know for certain that Linux, Windows, and Solaris all receive continual updates for new hardware platforms for their standard library.

Well I was actually speaking from experience here. I might be more willing to believe you for something like memcpy(), but as an example (and the one I was thinking of), the Windows CRT is generally in pretty poor shape. I've also seen some crufty things in libc trees from some of the *BSDs.



