How many people are writing new Fortran code? I know it has its niches, and is used in some popular numerical Python packages, but what about outside that? I've seen it mentioned in a few job postings for government contractors or "scientific programmer" (more of a PhD scientist who happens to program) roles, but typically listed in a way that suggests an HR person took a list of all the tech the company has used over the last few decades and made every item a requirement.
I'm writing new Fortran right now, actually. I know a lot of people who still write Fortran, though all of them work in scientific or engineering fields.
I view Fortran as only good for number crunching, but for that purpose it is difficult to beat in many respects. Fortran is very common in supercomputing applications, though C and C++ are also very common. I know a lot of people who view Fortran as antiquated, and I understand their criticisms. I prefer other programming languages in general, but all programming languages are just tools and Fortran is a good tool in some circumstances.
The vast majority of time on research supercomputers is used in running Fortran codes. (Click on 'languages' https://www.archer.ac.uk/status/codes/ , this is the UK national facility.)
And yes, as you've noticed almost all linear algebra is written in Fortran (BLAS, LAPACK, ScaLAPACK), so no matter what language you've written something to do some useful computation in, it's likely that most of the cycles are spent in Fortran!
When I did a PhD in theoretical/numerical physics, I wrote most of my code in Fortran. I started that ~5 years ago.
Basically, I was starting to port some Matlab code I had written to C/C++ to speed it up, but the most convenient solver for the differential equations I was dealing with outside of Matlab turned out to be a Fortran library. So I started learning basic Fortran in order to write a small C++ wrapper around that library, but ended up liking Fortran much better than C++, so I kept using it throughout my PhD.
Note that modern Fortran (2008+) is not at all like the infamous Fortran 77. There's no reason to write GOTO spaghetti; you have object orientation, built-in support for complex numbers and matrices, array slicing, pointers, polymorphism, pure functions, vectorized ("elemental") functions and subroutines, and other modern features. There are even some unique features, like the "associate" keyword, that I sorely miss when doing mathematical programming in any other language (mostly Python these days). And because e.g. the built-in arrays "know how big they are", runtime error messages are typically more helpful than what you get when debugging numerical C++.
It's used quite widely in bioinformatics, at least in Australia. Over the last ten years I've written numerous web apps to give scientists user-friendly interfaces to their Fortran command-line programs.
People in the meteorological community still use Fortran for pretty much everything, from input data processing, to the atmospheric models themselves, to visualization. They were already doing distributed computing with Fortran decades before Kubernetes was a thing. The same goes for air pollution modeling (a subset of meteorology, but handled by environmental engineers). I used to dabble with those models when I was studying environmental engineering.
I have not kept up with recent developments, but Fortran has historically been noticeably faster than even C/C++ for numerical computations. This difference used to be quite stark due to limitations in C itself: http://www.ibiblio.org/pub/languages/fortran/ch1-2.html (see " b) The design of FORTRAN allows maximal speed of execution:") I think things have gotten better in C, e.g. with C99's "restrict" keyword, along with general improvements in C compilers, but for a long while Fortran had a clear upper hand when it came to allocating and manipulating large arrays of floats. Outside of the language specification, I would wager that many supercomputers have better-optimized Fortran compilers than C compilers for this reason. And even if in principle C and Fortran can execute equally performant numeric code, Fortran's historic superiority has created a great deal of high-quality legacy Fortran code that nobody feels an urgent need to port into C (which would be tedious, expensive, and risky work).
In the modern day: R uses a standard Fortran BLAS implementation, as do many other libraries and platforms (NumPy, Numerics.NET, etc.). LAPACK is also widely used for low-level numerical linear algebra: https://github.com/Reference-LAPACK/lapack Intel also maintains a BLAS implementation. So there's still a healthy need for a (small) community of Fortran programmers; it's not all about maintaining '70s legacy code. Very different from the situation with COBOL.
If you use the reference BLAS routines that ship with R, you're losing an order of magnitude of serial performance on the relevant operations. (And the majority of Intel MKL surely isn't written in Fortran, just as OpenBLAS and BLIS aren't.)
> Fortran has historically been noticeably faster than even C/C++ for numerical computations
Nowadays: faster - maybe, noticeably - I doubt it.
> supercomputers have better-optimized Fortran compilers than C compilers
I don't know about supercomputer compilers but mainstream compilers usually have the same backend for FORTRAN and C (as well as other implemented languages).
> created a great deal of high-quality legacy Fortran code that nobody feels an urgent need to port into C
Optimized and tested FORTRAN code - maybe, but not high-quality. I've seen some of it; FORTRAN makes it difficult to write readable, maintainable code. For this reason even scientists are rewriting their tools and libraries (ones that also require good performance) in C++: see for example Pythia, GEANT, and CERN ROOT.
The thing is, you can write perfectly normal Fortran code and instantly gain a speedup (CUDA, distributed computing with OpenMP, etc.) just by enabling some compiler flags. You can't do this in C/C++, where you have to deliberately write your program to use those technologies. Also, vector/matrix operations are first-class in Fortran, so you don't need to rely on 3rd-party libs.
> The thing is, you can write perfectly normal Fortran code and instantly gain a speedup (CUDA, distributed computing with OpenMP, etc.) just by enabling some compiler flags.
I'm not sure I understand you correctly. Can you give examples of such flags?
> Also, vector/matrix operations are first-class in Fortran, so you don't need to rely on 3rd-party libs.
It may be useful as long as you're hell-bent on not using libraries (which is somewhat contrary to one of the pro-FORTRAN arguments, that FORTRAN has lots of libraries that are tested and ready to use).
This is a weak consolation though, since anything complex enough deals with custom matrix/vector types for sparse matrices or data types used in parallel computations.
Not sure about gfortran, but commercial Fortran compilers support automatic parallelization (e.g. the Intel Fortran compiler's -parallel flag [1]). You can even go as far as parallelizing your program across a cluster of machines via OpenMP, by simply sprinkling some directives over the code that must be parallelized. I remembered incorrectly about CUDA: the PGI Fortran compiler supports CUDA, but you still need to use it deliberately in your code, though there are projects that attempt to make this automatic (I'm not sure they ever really took off).
> It may be useful as long as you're hell-bent on not using libraries (which is somewhat contrary to one of the pro-FORTRAN arguments that FORTRAN has lots of libraries that are tested and ready to use).
Yes, libraries are still used, but typically only for data input/output. For example, NetCDF is a popular data format, and many Fortran projects support it via a 3rd-party library. But complex matrix computation is essentially what Fortran was made for, so it's not typical to use a 3rd-party library for that. Most big Fortran projects in the area I was involved with (meteorology and air pollution) use a minimal amount of 3rd-party libraries and mostly rely on built-in Fortran functionality, with optimization left to the compiler (typically Intel or PGI Fortran). There is definitely code reuse, but it's in the form of scientists collecting snippets of useful algorithms over the years and copying them into a project when needed.
On a side note: having (semi)automatic parallelization with code generation for GPGPU would be very nice.
> There is definitely code reuse, but it's in the form of scientists collecting snippets of useful algorithms over the years and copying them into a project when needed.
Well, doing complex matrix calculations yourself in C/C++ without a 3rd-party library is hard. Unless you write everything yourself or specifically use Intel's MKL, the benefit of enabling automatic parallelization in C/C++ won't be as impactful as in Fortran, where it's common to do all calculations without any 3rd-party math library.
Could be, at least C++ has tools to implement better design.
I've seen one guy's Python code that looks worse than his FORTRAN code.
There's a Russian saying: "A true FORTRAN programmer can write FORTRAN code in any language."
One fun way to evaluate potential applicants for working on a C/C++/Fortran compiler backend is to ask them about Fortran. If they say that they appreciate how it makes it easier to optimize things, then you're probably talking to an experienced engineer.
We're writing it for the ground processing of data from a new space telescope. Personally, I wish I were writing C++ or Python, but Fortran does let scientists start writing code much more easily than C/C++ does, though Python is now more familiar to younger scientists than Fortran. Python has the downside of a lack of stability, however.
Fortran is pretty painful to write in. The lack of an STL equivalent means writing algorithms yourself. There is no good set of libraries, so you have to write even more stuff yourself.
The string support is horrendous. Most of the language is built around clumsy fixed-size strings. I wish they would add a new flexible string type.
The compiler quality is not up to C++ standards. My code currently has some nasty workarounds for problems in gfortran. It's not good to keep discovering compiler bugs.
I also find its much-vaunted numerical capabilities overstated. Vectorization only works in simple cases, and you can't drop down to SIMD wrappers to fix it. It's nice to be able to pass arrays around more easily than in C, however. A good C++ array library would be better, but that would require everyone to agree on one.
When I learned Fortran years ago, I quickly gave up on string manipulation and custom file parsing. It's just not worth the struggle in Fortran. Instead, I simply read or dumped arrays directly to a raw file or NetCDF file, and wrote some Python scripts to prepare input files for the Fortran program and to read and visualize the output.
Nowadays Intel is not the reference for C++; the versions found on HPC systems are neither the fastest nor the best at following the latest standards.
I don't think commercial vs. open source is the key here: on the Fortran side we have long been suffering from a lot of bugs/regressions in current versions of Intel Fortran (with respect to the "latest" features like OOP) -- I would even say there may be more than we are finding in gfortran.
An explanation could be that they are investing time in supporting their big customers that likely use ancient code bases.
Exactly, that is my point: we are mainly still in the land of Fortran 95, which is not designed to take advantage of the latest features of the hardware.
Do they have libraries for JSON, network functions, strings, data types (e.g. dict/map, resizable vectors, sorting, linked lists, kd-trees...), databases, and all the things that make a programming language useful? Please point me to them. I'm only aware of numerics.
I guess that's a no then. Well, I better give up on Fortran if I need to load configuration files, sort data, expand an array, take command line parameters, associate some text with the data, or use a hash table, if those aren't reasonable features. Apparently, it's only acceptable to use fortran to load some matrix, process it, then spit something out into a file.
Most of the Fortran codes I'm working on would be much simpler with common libraries for things like that.
There is a cottage industry that does just that: create wrapper programs with a nice GUI to prepare the input data files and read/visualize the output data files of an open-source Fortran model. For example, AERMOD is an open-source air pollutant dispersion model developed by the EPA, and there are a bunch of commercial GUI wrappers for it, because (as you note) you can't expect scientists without a computer science background to compile, prepare data for, and parse the output of a Fortran program. Those things are hard to do in Fortran as a programmer, let alone as a scientist without a computer science background.
See for example MESA http://mesa.sourceforge.net/index.html an open source set of modules for software experiments in stellar astrophysics, which is still actively developed.
I wonder about the same thing. When I graduated in 1985 Fortran was already described more or less like you just did. However, when I recently dug down through the NumPy/SciPy stack to understand some details I saw a lot of Fortran code. Some popular algorithms seem to exist only as Fortran with various language interfaces on top.
So I don't think there are very many new projects being started but the presence of so much critical code in Fortran will extend its lifetime some more.