This is interesting, but the first comment on that site is apt:
> Meanwhile, the Fortran programmer writes y = dot_product (x, x) and moves on to the interesting bits. Plus, if auto-parallelization is on, or if this is in an OpenMP workshare section…
This is titled "optimizing loops in C" but this level of optimization is actually programming in assembly language, specifically x86 with SSE extensions.
The C programmer will actually just write y = cblas_sdot(n, x, 1, x, 1) and move on to the interesting bits.
However, someone had to implement dot_product and cblas_sdot at some point (either in the compiler or in the library), and they need to be rewritten from time to time for new architectures. More to the point, most programmers, most of the time, aren't just doing a dot product. They're doing some other more sophisticated computation for which these techniques may be quite relevant. The dot product is just a convenient example.
Agreed on both counts: someone had to write those library functions, and this was just an example.
However, the author starts by talking about a "holy war between C and Fortran" and then proceeds to write... well, assembly language using the C compiler. So the summary could be "assembly language can be made more efficient than Fortran, and C lets you coax the compiler into writing the assembly language you want". I guess this could be seen as a win for C, but I'm not so sure...
As the reply in the original article explains, it is discussing the implementation details of such a function (which could equally exist in a C library). Simply saying 'call a library function' doesn't help anyone understand what is going on behind the scenes.
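For anyone wondering what "behind the scenes" might look like, here is a rough sketch of my own, not the article's code: a plain C dot product next to a hand-vectorized one using SSE intrinsics. The names `dot_scalar` and `dot_sse` are hypothetical, and I'm assuming single-precision inputs with no alignment guarantees (hence the unaligned loads). On a target without SSE it just falls back to the scalar loop.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ps, ... */
#endif

/* Plain C: what most programmers would write first. */
static float dot_scalar(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hypothetical hand-vectorized version: four floats per iteration. */
static float dot_sse(const float *x, const float *y, size_t n)
{
#if defined(__SSE__)
    __m128 acc = _mm_setzero_ps();          /* four running partial sums */
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 a = _mm_loadu_ps(x + i);     /* unaligned load of 4 floats */
        __m128 b = _mm_loadu_ps(y + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(a, b));
    }
    float lanes[4];
    _mm_storeu_ps(lanes, acc);              /* spill the partial sums */
    float sum = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    for (; i < n; i++)                      /* leftover tail elements */
        sum += x[i] * y[i];
    return sum;
#else
    return dot_scalar(x, y, n);             /* no SSE: scalar fallback */
#endif
}
```

Calling `dot_sse(x, x, n)` is the equivalent of the `dot_product (x, x)` one-liner quoted above. Note that the vector version adds the terms in a different order than the scalar loop, so for general floating-point inputs the two results can differ in the last bits.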