It's the libraries like NumPy, Pandas, SciPy, Matplotlib, SymPy, and so on that make the difference. There's been a ton of work put into developing those Python frameworks for "data science" which don't exist to such an extent in languages like Lua.



A lot of those actually started as wrappers to existing C/C++/other libs, something that is very easy to achieve in Python and not so much with other runtimes.


To be sure, NumPy is the third implementation (Numeric, numarray, NumPy) of the basic array data type, so getting it right isn't all that easy even in Python.


And when library X and unrelated library Y are wrapped, they often both use NumPy arrays, making them interoperable.

E.g. if you do geographical work with raster data, there is GDAL, a raster library with Python wrappers. Because it exposes the raster data as a NumPy array, you can easily make images from it to overlay on maps with matplotlib, or analyze it using scipy. In C/C++ it would probably be a lot more hassle to combine them like that.


I don't think the libraries are the only reason Python has a strong following; after all, Perl also has a huge collection of libraries. I realize there's a healthy dollop of subjectivity in all this, but here are what I think are the reasons for Python's success.

1. Python uses words where many other languages use symbols. In addition, the words Python uses are simple and clear. Instead of '!', '&&' and '||', Python has 'not', 'and' and 'or'. Constructs that look like function calls in other languages often read more like English in Python. For example: if 'Python' in names: print("Found it!") .

2. Python uses indentation to control scoping. The merits of whitespace sensitivity are of course open for debate. However, I think it pushes the code closer to what pseudocode looks like, and this is probably a good thing.

3. Python has a simple and coherent object model. Unlike Lua or JavaScript, Python has a real object system (i.e. not prototype-based) with inheritance. Although object-oriented programming doesn't get a lot of love on HN, it does have several benefits compared to, for example, functional programming. One advantage is that it is arguably well understood, and knowledge about OO has already been widely disseminated. Another advantage of OO is that the transition from using structs to hold mere data (a la C or Pascal) to OO is a simple matter of adding functions to the struct. With OO, you can start with code that is more procedural, and work your way towards a design that is more OO. This is a path that makes sense in the context of scientific programming, since one often starts with a short program that, say, implements an algorithm in a single function. (Then you build on it in an OO way by adding support for loading input data, etc.)

5. Numerous well-defined, easily understood, and easily accessible customization points. For example, in Python, allowing the user to write "x + y" is as simple as implementing the "__add__" magic method for the class in question. The ability to overload the mathematical operators is critical for a language used in a scientific context, and this, for example, essentially eliminates JavaScript. It also eliminates Lisp in all its variations due to its lack of infix syntax (short of writing custom reader macros). The story is similar with decorators.

6. Python is easy to extend and embed. For some reason, everyone seems to rave about how great Lua's C API is. Frankly, having written C++ extensions for Lua, Python, and Java, the one I found easiest was Python's (especially -- but not necessarily -- using Boost.Python), followed by Java (especially using JNA). Lua I found quite painful due to the explicit manipulation of the VM's stack.

7. Others have mentioned it, but the libraries are obviously a huge part of Python's success. As others have talked about it, I won't say anything more about it here.
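To make points 1, 3 and 5 a bit more concrete, here is a minimal sketch using only the standard library. The Vec2 class and the names list are made up for illustration; the idea is just that a plain record can grow methods and operator support without changing how callers use it.

```python
from dataclasses import dataclass


@dataclass
class Vec2:
    """Hypothetical example class: starts life as a plain record of two numbers."""
    x: float
    y: float

    # Point 3: going from "struct" to object is a matter of adding methods.
    def scaled(self, factor: float) -> "Vec2":
        return Vec2(self.x * factor, self.y * factor)

    # Point 5: implementing __add__ is all it takes to support "a + b".
    def __add__(self, other: "Vec2") -> "Vec2":
        if not isinstance(other, Vec2):
            return NotImplemented  # let Python try the other operand or raise TypeError
        return Vec2(self.x + other.x, self.y + other.y)


a = Vec2(1.0, 2.0)
b = Vec2(3.0, 4.0)
total = a + b             # dispatches to Vec2.__add__
doubled = a.scaled(2.0)   # ordinary method call on the same object

# Point 1: word operators ('in', 'and', 'not in') instead of symbols.
names = ["Perl", "Python", "Lua"]
found = "Python" in names and "Ruby" not in names
```

The dataclass decorator also generates __eq__, so two Vec2 values with the same fields compare equal, which is handy when checking results.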


6 is a very good point - Lua has advantages for embedding, but its API is not one of its better points.

Of course, when coding in C, your options are limited - Python's reference-counting isn't much fun either! But code using the Python API tends to be relatively readable, even though you run the risk of forgetting to put a decref in. Lua API code on the other hand, for all that it doesn't need anything like Python's incref/decref, has a tendency to be rather inscrutable.


But why were these written in Python as opposed to Ruby?


Timing and focus.

People in scientific computing started using Python in the mid-1990s. This included Numeric (ancestor to NumPy), which Jim Fulton, Jim Hugunin and others started in 1995. Ruby 0.95 wasn't released until the end of that year. Moreover, van Rossum tweaked Python so it would be a better fit for matrix computing, for example by adding multi-dimensional slices.

Then there was PyFort by Paul Dubois at Lawrence Livermore National Lab (1999), and SWIG by Dave Beazley at Los Alamos National Lab, which had Python support by 1998 (see https://web.archive.org/web/19981212033200/http://www.swig.o... ). Those made it much easier to access existing scientific libraries through Python modules. (A phrase at the time was that Python would 'steer' the low-level high performance code.)

At this time, Ruby was just becoming known in the English-speaking world, and it didn't really hit the mainstream until Ruby on Rails in 2005. This means Python had a 5-10 year head start, and Ruby hasn't caught up.

http://programmers.stackexchange.com/questions/138643/why-is... covers a reasonable and diverse set of explanations.
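As a footnote on the multi-dimensional slices mentioned above, here is a small sketch of what the language hands to an object when you index it with several slices. SliceProbe is a made-up helper that simply returns the index expression it receives; a real array library would interpret that tuple itself.

```python
class SliceProbe:
    """Hypothetical helper that returns whatever index expression it receives."""

    def __getitem__(self, key):
        return key


probe = SliceProbe()

# A comma-separated index arrives as a single tuple, one entry per dimension.
rows_and_cols = probe[1:3, ::2]

# "..." is the built-in Ellipsis object, usable as a placeholder for
# "all the dimensions in between".
first_and_last = probe[0, ..., -1]
```

Because the syntax is part of the core language, any class can accept this kind of index by implementing __getitem__.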


Lua has Torch.


Ok thanks for pointing that out, I stand corrected. Regardless I think Python has reached a sort of critical mass of data science packages.


NumPy having existed for more than a decade helped with getting mind share.


OK, now compare that to "literally everything else" in python.



