Any information about the performance implications of this library? This looks quite nice to use, but I imagine it has some performance cost versus more traditional ways of writing Fortran code, which could (sadly) mean this won't ever get much use in real-world code. Nonetheless, it's awesome that a language that's been around for 60 some-odd years is still evolving and making use of modern programming techniques. It's something to think about for people making languages now.
Thank you for the kind words. I can't comment on the performance hit yet, but it's something I plan to examine soon. Based on experience, I'd guess that the performance decrease from using recursion could be an order of magnitude or more for large arrays. Fortran really shines performance-wise when applied to heavy computations on very large multi-dimensional arrays - I don't think functional techniques should be applied in such cases. However, I think Fortran programmers can benefit from applying functional techniques to parts of the code that carry a lot of complex logic but are not computationally heavy. The Pareto principle is especially valid in HPC Fortran codes - most of the compute time is spent in a small percentage of the code base. Those hot spots should be left alone.
One of the issues that really leaped out at me was head and tail. In the functional paradigms where those are common, they're both O(1) because they apply to cons lists, which are a form of linked list. Head is probably still O(1) but if tail is O(n) you really need to point that out, as even with relatively small data sets O(n^2) can be noticeable.
Perhaps it's obvious to a Fortran programmer that it will be O(1) because Fortran has slicing built in, I dunno, but still, documenting the general space and time complexity would be worth doing. (Plus, even if Fortran does slices natively, it's still nice in a library like this to guarantee to your users that you're using it correctly and won't copy n-1 elements on every call, and to clearly say whether they can safely modify the results. As I assume Fortran is not big on immutable arrays.)
Looking at the source, the implementation of the tail_xxx functions does allocate a new array.
I wrote a test program that uses foldl to calculate the sum of an array, and it appears to have O(n^2) time complexity. A test summing an array of 10000 64-bit integers uses 395816 kbytes of memory, which is close to 8 * 10000^2 / 2 / 1024 = 390625.
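For reference, a sketch of the kind of test I ran (the module name and the foldl(f, start, x) interface are taken from the project's README as I understand it; the exact program may have differed):

    program foldl_sum_test
      use iso_fortran_env, only: int64
      use functional, only: foldl   ! module name assumed from the project
      implicit none
      integer(int64), allocatable :: x(:)
      integer(int64) :: i
      x = [(i, i = 1_int64, 10000_int64)]
      print *, foldl(add, 0_int64, x)   ! expect 10000*10001/2 = 50005000
    contains
      pure function add(a, b) result(res)
        integer(int64), intent(in) :: a, b
        integer(int64) :: res
        res = a + b
      end function add
    end program foldl_sum_test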
Thank you jerf and sampo for your very helpful comments. This is something that I had on my mind during the implementation but focused primarily on expressiveness.
While array slicing in Fortran is O(1), the implementation of tail does make a copy. Though a good compiler should be able to replace tail(x) with x(2:), the Fortran standard does not guarantee that it will. The first step forward that I see would be to replace tail(x) with x(2:) in the internal implementations of fold[lr].
In any case, the time complexity for these functions is definitely something I plan to document soon.
> replace tail(x) with x(2:) in the internal implementations of fold[lr].
Tried this for foldl with gfortran. Totally helps. Summing 10 000 int64's, memory consumption drops from 400M to 4M. Also time complexity becomes linear.
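For anyone curious, the change amounts to something like this (a simplified, int64-only sketch of a recursive foldl, not the library's exact code):

    module foldl_sketch
      use iso_fortran_env, only: int64
      implicit none
      abstract interface
        pure function f2(a, b) result(res)
          import :: int64
          integer(int64), intent(in) :: a, b
          integer(int64) :: res
        end function f2
      end interface
    contains
      pure recursive function foldl_i64(f, start, x) result(res)
        procedure(f2) :: f
        integer(int64), intent(in) :: start, x(:)
        integer(int64) :: res
        if (size(x) < 1) then
          res = start
        else
          ! Before: res = foldl_i64(f, f(start, x(1)), tail(x)),
          ! where tail allocates and copies size(x)-1 elements at every
          ! level of the recursion. Passing the slice directly instead
          ! binds a contiguous section to the assumed-shape dummy with
          ! no temporary copy:
          res = foldl_i64(f, f(start, x(1)), x(2:))
        end if
      end function foldl_i64
    end module foldl_sketch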
I don't see anything fundamentally wrong with functional approaches to array programming; see the APL family and SISAL.
There is, or was, a definite problem with recursion in gfortran (when I looked some time ago). You can't (or couldn't, then) get tail call elimination in functions. I'd have hoped GCC would have been able to see through the structure of the assignment/return.
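To illustrate the shape in question - a recursive call in tail position, hidden behind the result assignment/return, which could in principle be compiled down to a loop (a minimal sketch):

    module tco_sketch
      implicit none
    contains
      ! The recursive call is the last operation performed (tail
      ! position), so a compiler could turn it into a loop; the
      ! complaint above is that gfortran did not do this elimination.
      pure recursive function rsum(x, acc) result(res)
        integer, intent(in) :: x(:), acc
        integer :: res
        if (size(x) == 0) then
          res = acc
        else
          res = rsum(x(2:), acc + x(1))   ! tail call
        end if
      end function rsum
    end module tco_sketch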
I really like your project. I do question its usefulness, however. If you take any performance hit from using these, you defeat the point of using Fortran (over, say, Python) in the first place :) Don't you think?
Yes, and having worked with Fortran code for years, we used the routines to calculate... and Python to analyze the data. A colleague even scrapped old Fortran routines in favor of Python + Numpy, since he could be more productive without any noticeable slowdown in his work - the data analysis part was the bottleneck anyway. So I reiterate my point: I like the project, I like the idea, but we need more information regarding the performance of the routines. Otherwise, using e.g. Python for the full stack could be much less of a hassle.
Did you miss the point the OP just made above? Performance-critical code paths can be left alone while the usual set-up/tear-down code can make good use of this library.
While this is a common view, there is no reason to believe it is true. There was a related discussion on the D mailing list.[1] Functional code can be as performant as explicit loops, or even faster.[2] According to Walter Bright, a properly optimizing compiler should be able to produce faster code when you avoid loops.[3]
I realize that Fortran compilers have focused on optimizing traditional explicit loops and other imperative programming constructs. Nonetheless, I'd want to see benchmarks before concluding this is a lost cause.
And of course for a lot of code it is better to write correct code quickly than it is to spend a week cutting the running time of a function called twice from 1.2 seconds to 0.9 seconds.
> I realize that Fortran compilers have focused on optimizing traditional explicit loops and other imperative programming constructs. Nonetheless, I'd want to see benchmarks before concluding this is a lost cause.
Fortran since F90 has had syntax for, and encouraged, array-level operations rather than explicit loops over arrays. Surely there has been work on improving the performance of vectorised operations?
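For readers less familiar: "array-level operations" means things like the following, all standard language features since Fortran 90 (illustrative snippet):

    program array_ops
      implicit none
      real :: a(1000), b(1000), c(1000)
      call random_number(a)
      call random_number(b)
      c = a + 2.0*b                ! elementwise, no explicit DO loop
      c = sqrt(abs(c))             ! elemental intrinsics apply per element
      where (b > 0.5) c = a / b    ! masked elementwise assignment
      c(1:999) = c(2:1000)         ! whole-slice operations
      print *, sum(c), maxval(c)   ! array reductions
    end program array_ops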
Fortran has had lots of implicit loops to optimize since 1990, so, as long as the compiler can see to the bottom of the call (via cross-module inlining), it will do a great job.
Interestingly, Fred Chow and the SGI compiler/Open64/Path64 etc. optimize implicit loops by expanding them into explicit ones, and then making sure that the regular optimizer does a great job on them.
> Interestingly, Fred Chow and the SGI compiler/Open64/Path64 etc. optimize implicit loops by expanding them into explicit ones, and then making sure that the regular optimizer does a great job on them.
So does GFortran, FWIW. That said, there is a lot of work that is done before this lowering pass (as I'm sure you know, but other readers might not be as familiar with Fortran and compiler internals). In particular, Fortran's array expression semantics require that the RHS be evaluated before it is assigned to the LHS, so a simplistic approach would allocate a temporary array for every assignment. Quite a lot of logic is therefore needed to handle the various special cases that allow the temporaries to be avoided.
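A small standard-Fortran example of why that rule can force a temporary:

    program overlap_demo
      implicit none
      integer :: a(5) = [1, 2, 3, 4, 5]
      ! The standard requires the RHS to behave as if fully evaluated
      ! before any element of the LHS is stored, so this overlapping
      ! shift is well-defined:
      a(2:5) = a(1:4)
      print *, a   ! 1 1 2 3 4
      ! A naive left-to-right loop would propagate a(1) everywhere,
      ! giving 1 1 1 1 1. The compiler must prove no overlap, pick a
      ! safe loop order, or materialize the RHS in a temporary array.
    end program overlap_demo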
>Nonetheless, it's awesome that a language that's been around for 60 some-odd years is still evolving and making use of modern programming techniques.
It's interesting that I keep hearing functional techniques referred to as modern, despite the fact that many of them date back to the 50s in LISP, and in some cases (first-class functions) back to the 30s in the lambda calculus. Sometimes it really does take a long time for good ideas to spread.
To use a simple example, one might need to transform an array of statistical samples into p-values. The traditional way would be to 1) create a new array of squared deviations to calculate the standard deviation (also using the intrinsic SUM) and 2) create a new array of p-values by iterating through the deviation array or the sample array. Using this library, one would 1) map the samples with a deviation function, 2) map the deviations with a square function, 3) foldl to get the average, 4) map the samples with a deviation function, and 5) map the deviations with a p-value function.
The first approach would be significantly faster to compute, but would also require several times more lines of code than the second approach.
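Roughly, and only as a sketch: the map/foldl interfaces and module name below are assumptions based on this thread, and the p-value step (here a two-sided normal p-value via the F2008 erf intrinsic) is an illustrative stand-in, since the exact transform isn't specified above:

    program pvalue_pipeline
      use functional, only: map, foldl   ! module name and interfaces assumed
      implicit none
      integer, parameter :: n = 1000
      real :: x(n), dev(n), p(n), mean, sd
      call random_number(x)

      ! 1) Traditional style: array intrinsics and implicit loops
      mean = sum(x) / n
      dev  = x - mean
      sd   = sqrt(sum(dev**2) / (n - 1))
      p    = 1.0 - erf(abs(dev / sd) / sqrt(2.0))   ! elemental intrinsics

      ! 2) Functional style: the same pipeline with map and foldl
      mean = foldl(add, 0.0, x) / n
      dev  = map(deviation, x)
      sd   = sqrt(foldl(add, 0.0, map(square, dev)) / (n - 1))
      p    = map(pvalue, dev)

    contains

      pure real function deviation(a)
        real, intent(in) :: a
        deviation = a - mean   ! mean is visible by host association
      end function deviation

      pure real function square(a)
        real, intent(in) :: a
        square = a * a
      end function square

      pure real function add(a, b)
        real, intent(in) :: a, b
        add = a + b
      end function add

      pure real function pvalue(a)
        real, intent(in) :: a
        ! illustrative two-sided normal p-value via the erf intrinsic
        pvalue = 1.0 - erf(abs(a / sd) / sqrt(2.0))
      end function pvalue

    end program pvalue_pipeline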
The maps can be combined together with short-cut fusion, so you'd have one map to create the new array, a fold to get the average, and another map to get the p-values.
Further 'combination' is theoretically possible, but I don't know if it will be done.
Nice work. But as far as I can see, this only works for the intrinsic Fortran datatypes, i.e. the various kinds of integers and reals. Any chance of extending this to user-defined types? Looking at the code, all functionality has been repetitively implemented for each datatype -- I hope this was not done manually, but with a code generator? If so, could the generator be extended to process user-defined datatypes?
It's really unfortunate that modern Fortran has no facilities for generic programming. That's why I'm personally moving more and more to Julia...
Looking at this makes me wonder if there is IDE support for handling the way that Fortran does generic functions? So much boilerplate and repetition required...
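For context, this is the pattern in question - one generic name dispatching to a hand-written specific procedure per type and kind (a sketch with illustrative names, not the library's actual code):

    module head_sketch
      use iso_fortran_env, only: int32, int64, real32, real64
      implicit none

      ! One generic name dispatching to a specific procedure per kind
      interface head
        module procedure head_int32, head_int64, head_real32, head_real64
      end interface head

    contains

      pure integer(int32) function head_int32(x)
        integer(int32), intent(in) :: x(:)
        head_int32 = x(1)
      end function head_int32

      pure integer(int64) function head_int64(x)
        integer(int64), intent(in) :: x(:)
        head_int64 = x(1)
      end function head_int64

      pure real(real32) function head_real32(x)
        real(real32), intent(in) :: x(:)
        head_real32 = x(1)
      end function head_real32

      ! ...and so on for every supported kind, each body identical
      ! except for the declared type and kind.
      pure real(real64) function head_real64(x)
        real(real64), intent(in) :: x(:)
        head_real64 = x(1)
      end function head_real64

    end module head_sketch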
Unfortunately, the Fortran standard does not provide good support for generic programming, so some level of repetition is necessary :/. I don't see this as a major issue though - boilerplate coding happens once. In any case, pick any language that supports generic programming, and if you dig deep enough you will run into boilerplate in one form or another. :)
To clarify, I'm familiar with how Fortran handles generic programming, and my frustration with it extends well beyond having to write interface blocks (although these are annoying too). I regard having several versions of a function to handle e.g. floats of different widths as pretty tedious repetition - one that in my experience can quickly turn into a maintenance nightmare. Maybe this is outside the scope of what an IDE could help with, though. Could some kind of function templating be possible in future versions of Fortran?