I've been doing data analysis in Python for the past 2ish years and I think it's a great choice. Before Python I worked in Matlab, which simplifies matrix operations at the expense of making everything else terrible.
The main other contenders here are R and Mathematica, both of which will fail you when you need to do something that isn't strictly statistical/mathematical. Python gives you predictable, decent performance, and the NumPy ecosystem is awesome for numerical libraries. I've never come across a machine learning library nearly as well designed as scikit-learn, and pandas dataframes are a lot snappier than R's equivalents. My only gripe is the paucity of good plotting libraries (matplotlib is impoverished and ugly compared with R's sexy plotting routines).
Now, I haven't said a word about the faster statically compiled languages: C, C++, Java, C#, F#, OCaml, Haskell, etc.
The trouble with static languages is that they either lack essential libraries or don't allow for rapid prototyping (or in some cases, both).
Now, if you're implementing the heart of a numerically intensive algorithm and your code can't be decomposed into a few already implemented primitives, it makes sense to write it in C. The first thing to do, though, is to wrap that native code with a Python interface and test it from Python.
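To make the "wrap it and test it from Python" step concrete, here's a minimal sketch using the standard library's ctypes. It's only an illustration under some assumptions: a Unix-like system, and the C math library's `hypot` standing in for your own compiled routine. The names `native_hypot` and `check_native_hypot` are made up for this example.

```python
import ctypes
import ctypes.util
import math

# Assumption: a Unix-like system. find_library may return None, in which case
# CDLL(None) falls back to symbols already loaded into the Python process.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double hypot(double x, double y).
# Without this, ctypes would assume int arguments and an int return value.
libm.hypot.argtypes = [ctypes.c_double, ctypes.c_double]
libm.hypot.restype = ctypes.c_double


def native_hypot(x, y):
    """Thin Python wrapper around the native routine (hypothetical name)."""
    return libm.hypot(float(x), float(y))


def check_native_hypot():
    """Compare the native code against a pure-Python reference."""
    for x, y in [(3.0, 4.0), (1.0, 1.0), (0.0, 2.5)]:
        expected = math.sqrt(x * x + y * y)
        assert math.isclose(native_hypot(x, y), expected)


if __name__ == "__main__":
    check_native_hypot()
    print(native_hypot(3.0, 4.0))
```

For your own C code you'd point `ctypes.CDLL` at the shared library you built instead of libm. The rest of the pattern (declare the signature, wrap it, compare against a slow reference) stays the same.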
There will be very nice tools rolling out for Haskell early this fall. :-)
(I can't elaborate much because I'm busy writing them presently, but stay tuned and I think y'all will like what you see when the public release lands.)
(One sexy hint though: the value add of these works in progress is enough that I'll be able to hire folks full time to work on it with me starting mid-September or October.)