Interesting. I'd also like to see a curated list of standard Fortran libraries compiled: libraries that are commonly used, hopefully academically peer reviewed, and part of prominent published work.
Scientific programming in fields that have significant Fortran legacy is often a matter of writing new code to build upon or adapt existing experimentally verified code for new studies.
An annotated library of notable code and details about it with references to publications could be very valuable.
Something unfamiliar to most developers: scientific programming has the quirk that the more code you write, the more your experiment becomes about the code instead of your actual thesis. This is one of the reasons Fortran stays relevant: decades of code and experiments have built up as a foundation for something new.
I’m not sure if that holds. Every university I’ve ever done research in has no issues paying for software like MATLAB, ANSYS, Mathematica, Intel MKL, Intel Compiler Suite, and the like. The people I know that still use Fortran (including myself) are all heavily focused on computational numerics, and the fact that ifort is closed source vs gfortran being open just does not even factor into the equation. If it’s available to me, and I can show that a program compiled by ifort and MKL is faster than one compiled by gfortran and BLAS, then I will use the faster one.
I left that world a long time ago, but what held true was that there was no problem buying site licenses for big software, not so much "I'm a research assistant and I'd like to buy X for $Y,000 for my project." The difference is between buying something with broad appeal used by lots of people within the university, and buying something that I'd like to try for my project, especially when there's already an established method which doesn't involve spending your PI's grant money.
You must have done research at well funded universities :)
My experience is quite different. Although after some hassle I was able to get a license, it’s online only.
Overall I’m trying to move away from commercial software (I don’t teach any, only use FOSS), but yeah, Mathematica and MKL + ifort are unbeatable in some respects (Matlab is slowly being replaced except for some specialised toolboxes; I don’t use ANSYS).
Anyway, proprietary and closed source software is in my opinion harmful for academia. Sometimes a lot, sometimes a little, but in general it’s no good, although I understand that it funds some development that wouldn’t be done otherwise.
Not sure how I could see proprietary/closed software as being 'harmful'. If the product is superior and allows researchers/students to do a better or more efficient job, then why not (absent severe budget constraints)? It is highly unlikely that those end users would ever modify the code or even need to view the source.
The fact that universities splurge money on these proprietary tools rather than spending it on long-term improvements isn't a good thing, even without the pain of actually supporting them.
The typical differences between stuff using ifort/MKL and gfortran/OpenBLAS/BLIS are at most comparable to, and probably smaller than, the sorts of variation you see anyhow on HPC systems, even without the important large-scale optimizations.
The company is distinct from their various products. For instance, their POP services are gratis. The NAG library is proprietary but, when I last knew, probably the majority of it was based on code from Netlib etc.
Ho, ho. I've been told by a NAG representative that we should pay them to talk to our own academics who actually do the serious linear algebra! Also I was told they were contracted to do the old AMD proprietary BLAS, which was inferior to OpenBLAS, and has been dropped in favour of BLIS. I don't wish to slight the general quality of their stuff, though. Their Fortran compiler is notable in this context.
You don't get the large scale parallel libraries from NAG anyway.
> We all know what the quality of academic code reputation is.
do 'we'?
Considering that a huge amount of common numerical software (especially anything with any kind of Fortran lineage whatsoever) is based in some way on netlib.org code, which was itself heavily developed in the BSD-UNIX-on-VAX + ARPANET era within the academic/research community, I'm not really sure, based on this comment, that 'we' do.
Writing your own crypto is a bad idea, but in research sometimes the goals of your research require you to become good enough, and by good enough, I mean a notable expert in the "writing your own crypto" equivalent. Which in this case would be something like writing your own linear algebra library functions for research you plan to publish.
Non-trivial experience in research computing support says that gfortran is surprisingly more reliable than ifort. "Just use GCC and openmpi" has worked in enough cases for me. A number of HPC benchmark lists I've seen have had failures in the proprietary lines and not GCC's.
The performance advantages are largely mythical too. The last time I ran the Polyhedron benchmarks with beta versions of ifort and gfortran on SKX, with the best obvious overall optimization (profile-directed), the bottom line was insignificantly faster with gfortran. Gfortran is also infinitely faster on ARM and POWER, where ifort simply doesn't exist.
I wish research councils would support the free software.
GCC has made large improvements in Fortran support and performance in the last 10-15 years. Memory is long, and sometimes you just learn things as a beginner which remain "true" regardless of the changing facts.
Intel Fortran was far superior in language support and binary performance in the mid-2000s, and a researcher couldn't be without it.
Whatever the reason, people shouldn't be propagating the incorrect information, especially if they're in research, and if they're interested in performance, which requires measurements. The same thing needs saying over and over here and elsewhere.
I don't remember figures and dates, but g77's performance was at least reasonably competitive with proprietary Unix compilers once GCC's backend was sorted out for the architecture (scheduling in particular). It was also mostly more correct. Observationally, researchers could do without ifort, especially on non-x86 hardware -- although ifort morphed from the DEC^WCompaq compiler which was used on our Alphas. (A good deal of work was done for g77 specifically because of problems at the time with portability and availability of compilers for a high profile computational project.)
If Fortran has been around for this long without a standard library, then what do people use now? Does each compiler have its own? Are they very different? Is there much of a need for this since it hasn't existed for 60+ years?
Much is built into the language spec, so this becomes a definitional debate about what constitutes a standard library, which isn't useful in this context.
There are many popular "standard" libraries that people use, like BLAS and LAPACK, and domain-specific code for, say, mesh generation.
Also, most Fortran code is used for numerical calculation, and most of what you need for that is already there.
As for compilers, there are still regularly released Fortran standards. The first proposal for what became Fortran was in 1953! The first compilers came a few years later. This was still the punch-card era, and details of the syntax remain from that time. Most of the standards are real standards and come with a date: Fortran 66, Fortran 90, etc. There have also been many branded releases tied to compilers. There are definitely differences between compilers at the edges, and the results you get can differ from one to another, but mostly they are the same, with some compilers adding features specific to themselves.
Is there a need for this? Well, there is some very commonly done linear algebra which you either have to write yourself or pick out of a library, so it does make some sense to have these things standardized and included, which would ease some burden.
Here's a reference to the built-in functions for a specific compiler:
> If Fortran has been around for this long without a standard library, then what do people use now?
The actual language. Most of the functionality that in C is considered part of the standard library (like I/O operations and mathematical functions) is implemented in the language itself in Fortran's case.
If the language is not enough for your particular case, you can use a third-party library like BLAS, LAPACK ...
In my experience as a researcher in computational numerics, the “standard library” is going to be defined by the in-house code that I use. That code is going to use other scientific libraries (for example PETSc, ATLAS/BLAS/MKL, MPI) in the background, but the typical researcher isn’t going to be exposed to that, because their focus is on setting up the equation Ax = b and writing the algorithm to solve it for x.
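For a concrete picture, here's a minimal sketch of that Ax = b step done by calling LAPACK's dgesv directly (the sizes and values are purely illustrative; real in-house codes wrap this kind of call):

    program solve_axb
      implicit none
      integer, parameter :: n = 3
      real(8) :: a(n,n), b(n)
      integer :: ipiv(n), info
      ! A diagonal system, so the answer is easy to eyeball
      a = 0d0
      a(1,1) = 2d0; a(2,2) = 3d0; a(3,3) = 4d0
      b = [2d0, 6d0, 12d0]
      call dgesv(n, 1, a, n, ipiv, b, n, info)  ! b is overwritten with x
      if (info /= 0) stop 'dgesv failed'
      print *, b  ! expect 1 2 3
    end program solve_axb

Built with something like gfortran solve_axb.f90 -llapack.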
The language has a basic string type (a character array, really) and basic file access. People use NetCDF or HDF libraries for reading and writing big numeric datasets. As for algorithms, people write their own when needed. For linear algebra, BLAS and LAPACK (and sometimes Intel MKL) are the de facto standard; it's just not called a standard library.
It's probably worth pointing out that LAPACK/BLAS is a strictly Fortran77 interface, which isn't ideal. Unfortunately there's no de facto standard modern interface available.
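To see what that F77-style interface means in practice, here's a minimal sketch of a dgemm call, with every leading dimension passed by hand and nothing checked for you (values are illustrative; link against any BLAS):

    program f77_iface
      implicit none
      integer, parameter :: m = 2, n = 2, k = 2
      real(8) :: a(m,k), b(k,n), c(m,n)
      a = 1d0; b = 2d0; c = 0d0
      ! C := alpha*A*B + beta*C, all dimensions spelled out explicitly
      call dgemm('N', 'N', m, n, k, 1d0, a, m, b, k, 0d0, c, m)
      print *, c  ! every element is 4
    end program f77_iface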
Fortran has some aspects of a standard library already, though compared to other languages the shelves are rather bare, let's say. It wasn't until Fortran 2003 that there was a standard way to get command line arguments. Before then different compilers had different ways to do that.
Most users of Fortran are only looking to crunch numbers. That's their application, and for that application Fortran already comes with everything you need, including a large standard library of mathematical functions and operations. I would argue that many users do not have much of a need for anything beyond multidimensional arrays, to be honest. I only know two people who have implemented linked lists in Fortran, for example. That said, it would be nice if Fortran had a good standard library. I would appreciate it if Fortran had assertions, for example.
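For what it's worth, F2008's error stop lets you roll a poor man's assertion in a few lines. A minimal sketch (the asserts module and assert name are my own invention, nothing standard):

    module asserts
      implicit none
    contains
      subroutine assert(cond, msg)
        logical, intent(in) :: cond
        character(*), intent(in) :: msg
        if (.not. cond) then
          print *, 'Assertion failed: ', msg
          error stop
        end if
      end subroutine assert
    end module asserts

Then call assert(size(x) == size(y), 'length mismatch') wherever needed; what's still missing is a standard way to compile assertions out of release builds, NDEBUG-style.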
How many people are writing new Fortran code? I know it has its niches, and is used in some popular numerical Python packages, but what about outside that? I've seen it mentioned in a few job postings for government contractors or "scientific programmer" roles (more of a PhD scientist who happens to program), but typically listed in a way that makes it look like an HR person took a list of the tech the company had used in the last few decades and made it all requirements.
I'm writing new Fortran right now, actually. I know a lot of people who still write Fortran, though all of them work in scientific or engineering fields.
I view Fortran as only good for number crunching, but for that purpose it is difficult to beat in many respects. Fortran is very common in supercomputing applications, though C and C++ are also very common. I know a lot of people who view Fortran as antiquated, and I understand their criticisms. I prefer other programming languages in general, but all programming languages are just tools and Fortran is a good tool in some circumstances.
The vast majority of time on research supercomputers is used running Fortran codes. (Click on 'languages' at https://www.archer.ac.uk/status/codes/ ; this is the UK national facility.)
And yes, as you've noticed almost all linear algebra is written in Fortran (BLAS, LAPACK, ScaLAPACK), so no matter what language you've written something to do some useful computation in, it's likely that most of the cycles are spent in Fortran!
When I did a PhD in theoretical/numerical physics, I wrote most of my code in Fortran. I started that ~5 years ago.
Basically, I was starting to port some Matlab code I had written to C/C++ to speed it up, but the most convenient solver for the differential equations I was dealing with outside of Matlab turned out to be a Fortran library. So I started learning basic Fortran in order to write a small C++ wrapper around that library, but ended up liking Fortran much better than C++, so I kept using it throughout my PhD.
Note that modern Fortran (2008+) is not at all like the infamous Fortran 77. There's no reason to write GOTO spaghetti; you have object orientation, built-in support for complex numbers and matrices, array slicing, pointers, polymorphism, pure functions, vectorized ("elemental") functions and subroutines, and other modern features. There are even some unique features, like the "associate" construct, that I sorely miss when doing mathematical programming in any other language (mostly Python these days). And because e.g. the built-in arrays "know how big they are", error messages are typically more helpful than when debugging numerical C++.
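A tiny sketch of the flavor, all standard Fortran 2008 (the names here are made up):

    program modern_demo
      implicit none
      real(8) :: v(5) = [1d0, 2d0, 3d0, 4d0, 5d0]
      v = sigmoid(v)               ! elemental: applied elementwise
      associate (head => v(1:2))   ! associate names a subobject
        head = 0d0
      end associate
      print *, v
    contains
      elemental function sigmoid(x) result(y)
        real(8), intent(in) :: x
        real(8) :: y
        y = 1d0 / (1d0 + exp(-x))
      end function sigmoid
    end program modern_demo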
It's used quite widely in bioinformatics, at least in Australia. Over the last ten years I've written numerous web apps for scientists to provide user-friendly interfaces to their Fortran command-line programs.
People in the meteorological community still pretty much use Fortran for everything, from input data processing, to the atmospheric models themselves, to visualization. They were already doing distributed computing with Fortran decades before Kubernetes was a thing. Same with air pollution modeling (a subset of meteorology, but handled by environmental engineers). I used to dabble with those models when I was studying environmental engineering.
I have not kept up with recent developments, but Fortran has historically been noticeably faster than even C/C++ for numerical computations. This difference used to be quite stark due to limitations in C itself: http://www.ibiblio.org/pub/languages/fortran/ch1-2.html (see " b) The design of FORTRAN allows maximal speed of execution:") I think things have gotten better in C, e.g. with C99's "restrict" keyword, along with general improvements in C compilers, but for a long while Fortran had a clear upper hand when it came to allocating and manipulating large arrays of floats. Outside of the language specification, I would wager that many supercomputers have better-optimized Fortran compilers than C compilers for this reason. And even if in principle C and Fortran can execute equally performant numeric code, Fortran's historic superiority has created a great deal of high-quality legacy Fortran code that nobody feels an urgent need to port into C (which would be tedious, expensive, and risky work).
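To make the aliasing point concrete: a Fortran compiler may assume dummy arguments don't overlap, which in C you'd have to promise explicitly with restrict. A minimal sketch:

    ! The compiler is entitled to assume x and y do not alias,
    ! so this vectorizes freely; the C equivalent needs 'restrict'.
    subroutine axpy(n, a, x, y)
      integer, intent(in) :: n
      real(8), intent(in) :: a, x(n)
      real(8), intent(inout) :: y(n)
      y = y + a*x   ! whole-array expression
    end subroutine axpy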
In the modern day: R uses a standard Fortran BLAS implementation, as do many other libraries and platforms (NumPy, Numerics.NET, etc.). LAPACK is also widely used for low-level numerical linear algebra: https://github.com/Reference-LAPACK/lapack Intel also maintains a BLAS implementation. So there's still a healthy need for a (small) community of Fortran programmers; it's not all about maintaining 70s legacy code. Very different from the situation with COBOL.
If you use the reference BLAS routines that ship with R, you're losing an order of magnitude serial performance on relevant operations. (The majority of Intel MKL surely isn't written in Fortran, like OpenBLAS and BLIS.)
> Fortran has historically been noticeably faster than even C/C++ for numerical computations
Nowadays: faster - maybe, noticeably - I doubt it.
> supercomputers have better-optimized Fortran compilers than C compilers
I don't know about supercomputer compilers but mainstream compilers usually have the same backend for FORTRAN and C (as well as other implemented languages).
> created a great deal of high-quality legacy Fortran code that nobody feels an urgent need to port into C
Optimized and tested FORTRAN code, maybe, but not high-quality. I've seen some of it; FORTRAN makes it difficult to write readable, maintainable code. For this reason even scientists are rewriting their tools and libraries (that also require good performance) in C++: for example, see Pythia, GEANT, CERN ROOT.
The thing is, you can write perfectly normal Fortran code and instantly gain speedups (CUDA, distributed computing with OpenMP, etc.) just by enabling some compiler flags. You can't do this in C/C++, where you have to deliberately write your program to use those technologies. Also, vector/matrix operations are first class in Fortran, so you don't need to rely on 3rd-party libs.
> The thing is, you can write perfectly normal Fortran code and instantly gain speedups (CUDA, distributed computing with OpenMP, etc.) just by enabling some compiler flags.
I'm not sure I understand you correctly. Can you give examples of such flags?
> Also, vector/matrix operations are first class in Fortran, so you don't need to rely on 3rd-party libs.
It may be useful as long as you're hell-bent on not using libraries (which is somewhat contrary to one of the pro-FORTRAN arguments that FORTRAN has lots of libraries that are tested and ready to use).
This is a weak consolation though, since anything complex enough deals with custom matrix/vector types for sparse matrices or data types used in parallel computations.
Not sure about gfortran, but commercial Fortran compilers support automatic parallelization (e.g. the Intel Fortran compiler's -parallel flag [1]). You can even go as far as parallelizing your program across a cluster of machines via OpenMP by simply sprinkling some directives through your program to mark the code that must be parallelized. I remembered incorrectly about CUDA: the PGI Fortran compiler supports CUDA, but you still need to use it deliberately in your code, though there are projects that attempt to make this automatic (not sure if they ever really took off).
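As a concrete example of the directive style: compiled with OpenMP enabled (gfortran -fopenmp, ifort -qopenmp), the directive below parallelizes the loop; without the flag it is just a comment. A minimal sketch:

    program omp_demo
      implicit none
      integer, parameter :: n = 1000000
      real(8), allocatable :: a(:), b(:), c(:)
      integer :: i
      allocate (a(n), b(n), c(n))
      a = 1d0; b = 2d0
      !$omp parallel do
      do i = 1, n
        c(i) = a(i) + b(i)
      end do
      !$omp end parallel do
      print *, c(1), c(n)
    end program omp_demo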
> It may be useful as long as you're hell-bent on not using libraries (which is somewhat contrary to one of the pro-FORTRAN arguments that FORTRAN has lots of libraries that are tested and ready to use).
Yes, libraries are still used, but typically only for data input/output. For example, NetCDF is a popular data format, and many Fortran projects support the format via a 3rd-party library. But complex matrix computation is essentially what Fortran was made for, so it's not typical to use a 3rd-party library for that. Most big Fortran projects in the area I was involved with (meteorology and air pollution) use a minimal amount of 3rd-party code and mostly rely on built-in Fortran functionality, with optimization left to the compiler (typically Intel or PGI Fortran). There is definitely code reuse, but it's in the form of scientists collecting snippets of useful algorithms over the years and copying them into a project when needed.
On a side note: having (semi)automatic parallelization with code generation for GPGPU would be very nice.
> There is definitely code reuse, but it's in the form of scientists collecting snippets of useful algorithms over the years and copying them into a project when needed.
Well, doing complex matrix calculations yourself in C/C++ without a 3rd-party library is hard. Unless you write everything yourself or specifically use the Intel MKL library, the benefit of enabling automatic parallelization in C/C++ won't be as impactful as in Fortran, where it's common to do all calculations without any 3rd-party math library.
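To make that concrete, matrix and array math is intrinsic to the language, so "no 3rd-party math library" really is the normal state of affairs. A minimal sketch:

    program intrinsic_linalg
      implicit none
      real(8) :: a(3,3), b(3,3), c(3,3), v(3)
      call random_number(a)
      call random_number(b)
      c = matmul(a, b)        ! intrinsic matrix multiply
      v = sum(a, dim=2)       ! row sums, no library call
      print *, dot_product(v, v), maxval(c)
    end program intrinsic_linalg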
Could be; at least C++ has the tools to implement a better design.
I've seen one guy's Python code that looks worse than his FORTRAN code.
There's a Russian saying: "A true FORTRAN programmer can write FORTRAN code in any language."
One fun way to evaluate potential applicants for working on a C/C++/Fortran compiler backend is to ask them about Fortran. If they say that they appreciate how it makes it easier to optimize things, then you're probably talking to an experienced engineer.
We're writing it for the ground processing of data from a new space telescope. Personally, I wish I were writing C++ or Python, but Fortran does allow scientists to start writing code much more easily than C/C++ does, though Python is now more familiar to younger scientists than Fortran. Python has the downside of a lack of stability, however.
Fortran is pretty painful to write in. The lack of an STL means writing algorithms yourself. There is no good set of libraries, so you have to write more stuff yourself.
The string support is horrendous. Most of the language is based around clumsy fixed-size strings. I wish they would add a new flexible string type.
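To be fair, Fortran 2003's deferred-length allocatable character does get you part of the way there, even if the library support around it stays thin. A minimal sketch:

    program strings_demo
      implicit none
      character(len=:), allocatable :: s
      s = 'hello'
      s = s // ', world'   ! reallocates to the new length on assignment
      print *, len(s), s
    end program strings_demo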
The compiler quality is not up to C++ standards. My code currently has some nasty workarounds for problems in gfortran. It's not good to keep discovering compiler bugs.
I also find its much-vaunted numerical capabilities overstated. Vectorization only works in simple cases, and you can't drop down to using SIMD wrappers to fix this. It's nice to be able to pass arrays around more easily than in C, however. A good C++ array library would be better, but that would require everyone to agree on one.
When I learned Fortran years ago, I quickly gave up on string manipulation and custom file parsing. It's just not worth it to struggle with string and file manipulation in Fortran. Instead, I simply read or dump arrays directly into raw files or NetCDF files, and wrote some Python scripts to prepare input files for the Fortran program or to read and visualize the output.
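For anyone taking the same route: stream-access unformatted output gives a headerless binary that numpy.fromfile can read back directly (the file name here is made up). A minimal sketch:

    program dump_array
      implicit none
      real(8) :: field(100,100)
      integer :: u
      call random_number(field)
      ! 'stream' access: raw float64, column-major, no record markers
      open (newunit=u, file='field.bin', form='unformatted', access='stream')
      write (u) field
      close (u)
    end program dump_array

On the Python side, np.fromfile('field.bin').reshape((100, 100), order='F') recovers the array.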
Nowadays Intel is not a reference for C++; the versions found on HPC systems are neither the fastest nor the best at following the latest standards.
I don't think commercial vs. open source is the key here: on the Fortran side we have long been suffering from a lot of bugs/regressions in the current versions of Intel Fortran (with respect to the "latest" features like OOP). I would even say there could be more than we are finding in gfortran.
An explanation could be that they are investing time in supporting their big customers, which likely use ancient code bases.
Exactly, that is my point: we are mainly still in the land of Fortran 95, which is not designed to take advantage of the latest features of the hardware.
Do they have libraries for JSON, network functions, strings, data types (e.g. dict/map, resizeable vector, sorting, linked list, kd-tree...), databases, and all the things that make a programming language useful? Please point me to the library. I'm only aware of numerics.
I guess that's a no then. Well, I'd better give up on Fortran if I need to load configuration files, sort data, expand an array, take command-line parameters, associate some text with the data, or use a hash table, if those aren't reasonable features. Apparently it's only acceptable to use Fortran to load some matrix, process it, then spit something out into a file.
Most of the Fortran codes I'm working on would be much simpler with common libraries for things like that.
There is a cottage industry that does just that: create wrapper programs with nice GUIs to prepare the data-file input and read/visualize the data-file output of an open source Fortran model. For example, AERMOD is an open source air pollutant dispersion model developed by the EPA, and there are a bunch of commercial GUI wrappers for it because (as you note) you can't expect scientists without a computer science background to compile, prepare data for, and parse the output of a Fortran program. Those things are hard to do in Fortran as a programmer, let alone as a scientist without a computer science background.
See for example MESA (http://mesa.sourceforge.net/index.html), an open source set of modules for software experiments in stellar astrophysics, which is still actively developed.
I wonder about the same thing. When I graduated in 1985 Fortran was already described more or less like you just did. However, when I recently dug down through the NumPy/SciPy stack to understand some details I saw a lot of Fortran code. Some popular algorithms seem to exist only as Fortran with various language interfaces on top.
So I don't think there are very many new projects being started but the presence of so much critical code in Fortran will extend its lifetime some more.
I am one of the authors. We will pre-generate the files into a tarball, so that end users do not have to worry about generating them or having Python installed. Also, we'll probably rewrite the preprocessing tool in C++ for robustness and speed. The reason we use it is that we need some easy way to generate the subroutines to work with any array dimension, and there are not many other good options.
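To show why per-dimension generation is needed at all: the language has no rank-generic procedures, so a library exposes one generic name backed by one body per rank, and those bodies are what get generated. A hypothetical hand-written sketch:

    module means
      implicit none
      interface mean                ! one name, one body per rank
        module procedure mean_1d, mean_2d
      end interface
    contains
      pure function mean_1d(x) result(m)
        real(8), intent(in) :: x(:)
        real(8) :: m
        m = sum(x) / size(x)
      end function mean_1d
      pure function mean_2d(x) result(m)
        real(8), intent(in) :: x(:,:)
        real(8) :: m
        m = sum(x) / size(x)
      end function mean_2d
    end module means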
Reading this comment, one has to wonder what its actual relevance to the content is. Is the code of conduct somehow more important than the actual library itself?
I would say burfog thinks so. It is basically another license on the code, and we have seen decades of debate on those, never mind patent grants in licenses or as separate documents. A CoC is another social contract and will probably be of increasing importance as cases of their use arise.
Could you explain what a cancel-culture Code of Conduct is? I looked at it and didn't find anything that would allow railroading people out of the project, so I'm curious what is wrong with it.
It would be fine if all people were reasonable and uninterested in causing political commotion to score a win. Of course, in that case there would be no conduct issues to deal with.
So when you read it, don't think "Of course it wouldn't be used that way" (the same as when a law is passed). For a person with malice, that Code of Conduct is a very effective weapon.
People will claim offense, sometimes even third-party offense, over things said long ago or outside the project. Others will follow the CoC maliciously or, at best, robotically. The political fight takes over the mailing lists, factions form, and actual productive contributors leave or are tossed out.