Hacker News

This is only partly true, and in a limited way. Fortran is still king for simulations or anything that needs to be fast; the other real options are C and assembler, and that's about it. Python + numpy has become very popular very fast for exploring data and the results of simulations, and it's a great environment for that. Note that when you use numpy you're calling compiled C and Fortran routines. The Fortran language and its compilers continue to evolve, and Fortran will probably remain the language of choice for large-scale computation for some time. Other languages, such as Java and Lisps, are sometimes used for big simulations, but Python itself is just too slow to be used for anything but prototyping.



This depends very strongly on the nature of the simulation. Lots of simulations are handled perfectly well by modelling your system as arrays. Consider, for example, nonlinear waves: the simulation typically consists of FFT -> k-domain operator -> IFFT -> x-domain operator, repeated for i = 0...T/dt.

No reason whatsoever not to do this in Python, though of course the FFT is just FFTW/FFTPACK and the x/k-domain operators are numpy ufuncs (all written in C/Fortran).
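To make the loop above concrete, here is a minimal split-step sketch in plain numpy, assuming (as an illustration, not from the parent comment) the focusing 1D nonlinear Schrödinger equation with a sech pulse; all parameter values are arbitrary:

```python
import numpy as np

# Split-step sketch for i*u_t + (1/2)*u_xx + |u|^2*u = 0 (focusing NLS).
# Each time step is exactly the loop in the comment above:
# FFT -> k-domain operator -> IFFT -> x-domain operator.
N, L = 256, 40.0
dx = L / N
x = (np.arange(N) - N // 2) * dx
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)   # angular wavenumbers

dt, steps = 0.01, 100
u = 1.0 / np.cosh(x)                       # sech pulse (a soliton for this equation)

for _ in range(steps):
    # k-domain: linear dispersion, multiply by exp(-i k^2 dt / 2)
    u = np.fft.ifft(np.exp(-0.5j * k**2 * dt) * np.fft.fft(u))
    # x-domain: nonlinear phase rotation, multiply by exp(i |u|^2 dt)
    u = u * np.exp(1j * np.abs(u)**2 * dt)

# Both substeps are unitary, so the L2 norm (~2 for a sech pulse) is preserved
print(np.sum(np.abs(u)**2) * dx)
```

Every heavy operation here is a vectorized numpy/FFT call; the Python interpreter only steers the loop.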

On the other hand, particle simulations need more complicated logic and data structures to handle multipole methods, so the numpy array model might not work so well.
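The contrast is easy to see: direct O(N^2) pairwise summation vectorizes cleanly with broadcasting, but the branchy tree traversal of a Barnes-Hut or fast multipole method does not. A toy sketch of the part that *does* fit the array model (softening value and sizes are arbitrary):

```python
import numpy as np

# Direct O(N^2) gravitational accelerations, vectorized with broadcasting.
# The tree logic of a multipole method has no such clean array form.
rng = np.random.default_rng(0)
N = 100
pos = rng.standard_normal((N, 3))
mass = np.ones(N)

d = pos[None, :, :] - pos[:, None, :]     # (N, N, 3): d[i, j] = pos_j - pos_i
r2 = (d**2).sum(-1) + 1e-3                # softened squared distances
np.fill_diagonal(r2, np.inf)              # exclude self-interaction
acc = (d * (mass[None, :, None] / r2[..., None]**1.5)).sum(axis=1)
```

By Newton's third law the total momentum change sums to zero, which is a handy sanity check; the price is O(N^2) memory for the intermediate arrays, which is exactly why trees exist.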


Yes, it seems that there are some types of simulations that could be structured as numpy operations steered by a little Python code. But can this kind of code be run effectively on large multi-processor machines?


Array operations tend to be highly parallelizable by nature. Numpy operations can certainly be parallelized, and even distributed. Take a look at Blaze.

https://github.com/ContinuumIO/blaze
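Even without an extra library, many numpy inner loops release the GIL on large arrays, so a plain thread pool can already parallelize chunked ufunc application. A minimal sketch (the `parallel_apply` helper is my own name, not Blaze's API):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Split an array into chunks and apply a ufunc across threads.
# Useful only when the per-chunk work is large enough to amortize
# thread overhead, and when the operation releases the GIL.
def parallel_apply(fn, arr, workers=4):
    chunks = np.array_split(arr, workers)
    with ThreadPoolExecutor(workers) as pool:
        return np.concatenate(list(pool.map(fn, chunks)))

x = np.linspace(0, 10, 1_000_000)
y = parallel_apply(np.sin, x)
```

Distribution across machines is a different problem, which is where projects like Blaze come in.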


That looks interesting. I know that numpy calculations should be straightforwardly parallelizable; my question, out of curiosity, was whether they were in practice.


Note also that a common bottleneck for array processing is memory bandwidth. Multithreading something that is memory-bound will not get you much of a speedup.

There are tons of optimisations and new representations that could be experimented with for arrays. While NumPy is already reasonably fast, I'm convinced you could get much faster by extending it (within or outside the project). String/object arrays are also nowhere near as useful as they could be.


numexpr (a tool for optimizing the performance of numpy code) supports parallelizing operations. See http://code.google.com/p/numexpr/wiki/MultiThreadVM
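For a sense of how it is used: numexpr compiles an expression string into a blocked virtual-machine loop that is evaluated across threads, avoiding the large temporaries that plain numpy expressions create. A minimal sketch (array sizes and thread count are arbitrary):

```python
import numpy as np
import numexpr as ne

# numexpr evaluates the whole expression in cache-sized chunks,
# distributed across the configured number of threads.
a = np.arange(1_000_000, dtype=np.float64)
b = np.ones_like(a)

ne.set_num_threads(2)
c = ne.evaluate("2*a + 3*b")
```

The same expression in plain numpy would allocate full-size intermediates for `2*a` and `3*b` before adding them.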


With the growing use of LLVM to compile numeric Python down to code nearly as fast as C, there should be a lot of new opportunities to replace old Fortran code, though this will probably play out over the next five years or so.


> Python is just too slow to be used for anything but prototyping.

Hardware time is often cheaper than engineer time, so Python may be faster/cheaper if you consider total time to value.


In the scientific computing environments that I have in mind, hardware is often fixed: you have your several-million-dollar supercomputer on site, or a fixed compute budget at a supercomputer center. Now, do you want your result in one day or 100 days? Because that's the compute time ratio we're talking about.


Not really, no, at least not without some context. Lots of people use Python and numpy on very large computers. Also, running time is not the interesting metric: dev time + runtime is. The tradeoffs depend on your team, the problem, etc.



