
What's wrong with matplotlib? I might be living under a rock...



I always find it easier to produce publication-quality figures using gnuplot (but not with its default settings, mind you) than with Matplotlib. Check out http://gnuplotting.org/

Also, it's hard to beat gnuplot's speed refreshing a live scatter plot with many thousands of points using the x11 terminal.


I'm using gnuplot for plotting too (the actual gnuplot application, not a library that uses gnuplot as its backend).

And I usually keep computation and plotting separate. Computation produces data files, and a gnuplot script generates plots. This separation of computation and plotting allows updating charts later if needed, collected data can be reused in other plots, and additional data analysis can be performed and charts can be augmented.

So I personally don't see many advantages in integrating chart generation into the computational pipeline itself (except for monitoring a computation, or maybe when user input is needed to direct it). Because of that, libraries that encourage generating charts from a computed array instead of dumping that data into persisted files feel like an anti-pattern to me.
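
For anyone who wants the same separation while staying in Python, here is a minimal sketch. It assumes a CSV file as the persisted format and Matplotlib as the renderer; the function names (`compute`, `load`, `plot`) and the column names are made up for illustration.

```python
import csv
import math
from pathlib import Path


def compute(path: Path, n: int = 200) -> None:
    """Computation step: write (x, sin x) samples on [0, 10] to a CSV file."""
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])
        for i in range(n):
            x = 10.0 * i / (n - 1)
            writer.writerow([x, math.sin(x)])


def load(path: Path) -> tuple[list[float], list[float]]:
    """Plotting step, part 1: read the persisted data back."""
    with path.open(newline="") as f:
        rows = list(csv.DictReader(f))
    return [float(r["x"]) for r in rows], [float(r["y"]) for r in rows]


def plot(data_path: Path, figure_path: Path) -> None:
    """Plotting step, part 2: render the chart from the file alone."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file, no GUI window needed
    import matplotlib.pyplot as plt

    xs, ys = load(data_path)
    fig, ax = plt.subplots()
    ax.plot(xs, ys)
    ax.set_xlabel("x")
    ax.set_ylabel("sin(x)")
    fig.savefig(figure_path)
    plt.close(fig)
```

Since `plot` only ever sees the file, the chart can be restyled or regenerated later without rerunning `compute`.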


Completely agree. I keep computation steps (which create csv files) separate from charting steps. I use make to orchestrate pipelines. I also keep everything under source control, and insert git commit ids into every chart. This ensures that all the analysis and charts can be linked directly to the code used to produce them.
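
One way the commit-id stamping could look, as a rough sketch only: it assumes the charts are Matplotlib figures and that `git` is on the PATH. The helper names and the label format are hypothetical, not a description of the actual setup.

```python
import subprocess


def git_commit_id(repo_dir: str = ".") -> str:
    """Return the short hash of HEAD, or "unknown" if git cannot provide one."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return "unknown"
    return result.stdout.strip()


def stamp_text(commit: str, source: str) -> str:
    """Build the provenance label that goes on the chart."""
    return f"commit {commit} | data: {source}"


def stamp_figure(fig, text: str) -> None:
    """Write the label in the bottom-right corner of a Matplotlib figure."""
    fig.text(0.99, 0.01, text, ha="right", va="bottom", fontsize=6, color="gray")
```

Calling `stamp_figure(fig, stamp_text(git_commit_id(), "results.csv"))` just before saving would put the label on the image itself, so it survives being pasted into a report.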


Somewhat agree, but sometimes there is a need to change or filter the data that goes into the chart, and that is only realized after plotting it. Combining the data and the figures into one "pipeline" makes it easy to iterate, especially in exploratory data analysis. Regardless, this comment made me think about my general workflow, which is usually combined. Appreciate this comment.


Matplotlib isn't very friendly to casual users.

For even the simplest possible plot, I have to create a subplot and axis.

Sometimes I'd like to just plot a function. I don't want to initialize arrays for that.

It's easy to forget that I have to `import matplotlib.pyplot`

I don't need to plot things often, but whenever I use matplotlib, I always have to spend a few minutes to look up how to use it.
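
To make the "just plot a function" wish concrete, this is roughly the helper I end up writing. It is a sketch, not anything Matplotlib ships: `sample` and `plot_function` are hypothetical names, and the default range and point count are arbitrary choices.

```python
import math
from typing import Callable


def sample(f: Callable[[float], float], lo: float, hi: float, n: int = 200):
    """Evaluate f at n evenly spaced points on [lo, hi]."""
    step = (hi - lo) / (n - 1)
    xs = [lo + i * step for i in range(n)]
    return xs, [f(x) for x in xs]


def plot_function(f, lo=-10.0, hi=10.0, n=200):
    """Hypothetical one-call helper: plot f over [lo, hi] and show it."""
    import matplotlib.pyplot as plt  # imported here so the caller never has to

    xs, ys = sample(f, lo, hi, n)
    plt.plot(xs, ys)
    plt.show()
```

With that in place, `plot_function(math.sin)` is the whole program, but it is still something I have to carry around myself.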


>For even the simplest possible plot, I have to create a subplot and axis.

So? Can't that be abstracted away once in a custom lib, of the 3-4 plots you use 99% of the time, and be done with it?

In which case, you just need to pass in your data and labels, in a specific format, and that's it.
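
Something along these lines, as a sketch. The function name `line_chart` and its keyword arguments are made up for illustration; the Matplotlib calls inside are the standard Axes methods.

```python
import matplotlib.pyplot as plt


def line_chart(x, y, *, title="", xlabel="", ylabel="", path=None):
    """Hypothetical wrapper: hides the figure/axes setup behind one call."""
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    if path is not None:
        fig.savefig(path)  # only write a file when the caller asks for one
    return fig, ax
```

Returning `fig` and `ax` keeps the escape hatch open for the 1% of plots that need more than the wrapper offers.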


> So? Can't that be abstracted away once in a custom lib, of the 3-4 plots you use 99% of the time, and be done with it?

This does not beat gnuplot's simplicity where you don't even need to define that.

The following line is a complete gnuplot program to plot the sine function:

    plot sin(x)

Every parameter of the plot has reasonable defaults, and you can redefine all of them as you wish.


"that can be abstracted away in a custom lib"

Yeah, and that's really not helpful for sharing code or doing exploratory charting. It's never, ever as simple as just "being done with it".

vega-lite-api is my charting library of choice these days. Much simpler than gnuplot, d3, matplotlib, etc.


    import numpy as np
    import pylab as pb

    x = np.arange(0, 10, 0.01)
    pb.plot(x, np.sin(x))
    pb.show()

what do you mean hard?


This is a minor source of confusion. Pylab and Pyplot are packages within Matplotlib. They are what most casual users experience when they say that they're using Matplotlib. I use them, they're convenient.

A minor headache is when you have to break out of Pyplot to use some of the more detailed behaviors of Matplotlib, and now you're interacting with both Pyplot and the lower level calls. For instance, plt.title('foo') and gca().set_title('foo') do the same thing.
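
A small side-by-side of the two styles, as a sketch. The `Agg` backend is only there so the snippet runs without a display; the variable names are my own.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend, so no window is needed
import matplotlib.pyplot as plt

# Pyplot style: each call acts on whatever Axes is "current".
plt.figure()
plt.plot([0, 1, 2], [0, 2, 4])
plt.title("foo")
title_via_pyplot = plt.gca().get_title()

# Lower-level style: the same result through an explicit Axes handle.
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 2, 4])
ax.set_title("foo")
title_via_axes = ax.get_title()

print(title_via_pyplot == title_via_axes)
```

Both figures end up with the same title; the difference is only whether the Axes is implicit or named.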

If you're a fluent programmer, you fly past those seeming inconsistencies with barely any notice. Explaining them to a novice programmer is harder.


Matplotlib has been working fine for me. Some caveats. I'm a physicist working in industry, and don't publish in academic journals. But I do a huge amount of data visualization, for my own use, and to produce graphs for internal reports.

The graphs look as good to my eye as what I see in papers, but I have no idea what extra steps are needed to satisfy each journal's style guide.

Before Matplotlib, I created graphs in Excel.

A possible question is whether Matplotlib deserves the status of being the default for teaching scientific programming in Python, or if a different tool would make it easier for beginners.


Nothing, really. I have been using matplotlib for years and it's... fine. The only problem I have is that it has a number of minor annoyances that never get fixed, despite being well known and the project being actively maintained. Off the top of my head: the Tk backend not supporting DPI scaling on GNU/Linux; aspect="equal" not working on 3D plots; covered parts of 3D objects appearing in front of the objects covering them; twin axes not having the origin aligned; etc.


Matplotlib is what is wrong with matplotlib..

Broadly, the problem is that its syntax was meant to reflect that of Matlab, which, I guess, makes it intuitive for Matlab users. For the rest of us, it's mostly unintuitive and inconsistent.


Can you give an example of its inconsistency?

I would be genuinely interested in what the inconsistencies are from a Python perspective.

I am a former Matlab/Octave user. To me the Julia interface to matplotlib is actually quite nice to use, but unfortunately the installation is a bit brittle.


In some places you can use c="k" for colour, but in others you cannot and must use color="k".

Some settings are only exposed on the figure class and not on the ax class. And when you are working with the ax class, you must write

ax.set_ylim(0,3)

instead of

ax.ylim(0,3)

Matplotlib.pyplot is known for its nonsensical api.
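
To show the naming mismatch, here is a short sketch. It uses the `Agg` backend so it runs without a display, and the variable names are mine.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 2, 4])

# Axes method: the setter carries a set_ prefix.
ax.set_ylim(0, 3)
limits_from_axes = ax.get_ylim()

# Pyplot function: no prefix, and it targets the current Axes.
plt.ylim(0, 3)
limits_from_pyplot = plt.ylim()  # with no arguments it returns the limits

# There is no plain ax.ylim on the Axes object.
has_plain_ylim = hasattr(ax, "ylim")
```

So the same setting is spelled `ylim` in one namespace and `set_ylim`/`get_ylim` in the other.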


Off the top of my head, the one that annoys me regularly is the difference between setting a title or an x or y range on a plot and on a subplot. So plt.title() versus ax.set_title().

By the way, who came up with the idea that an axis object is a great handle to handle subplot settings..?


I think matplotlib feels natural to people who use(d) Matlab, but not necessarily to others.


    from matplotlib import pyplot as plt

WTF? Broken from right there. What's a plt? "o" key broken? Why didn't they just call it pyplot? Why not just

    import pyplot
    pyplot.plot(lambda x: math.sin(x))
    pyplot.plot(x=[0,1,2],y=[0,2,4])


People who are making plots in the terminal don't want to type out the fully qualified library name. The majority of plots are written and read only once, during data exploration and analysis.


The library is not forcing you to use plt as a shorthand. Nothing is stopping you from calling

from matplotlib import pyplot

pyplot.plot([0,1,2], [0,2,4])


Yeah I know. It's just that they've created an ecosystem of "plt" and that makes me want to use the entire library less.


It's a convention of the math-heavy Python libraries; numpy and pandas also often get imported as np and pd.



