Show HN: Pandashells – Bringing the Python data stack to the shell prompt

robdmc · on Aug 2, 2015

Pandashells is a set of CLI tools that opens up the power of the Python data stack to the bash prompt. It brings dataframe manipulation, visualization, and statistical modeling to the Unix pipeline workflow.

sciurus · on Aug 2, 2015

Since you created this tool, you can put "Show HN:" at the beginning of your submission's title. That will make it show up on https://news.ycombinator.com/show

robdmc · on Aug 2, 2015

I'm a total HN noob. Don't see an option for editing the post. Am I just blind?

eevilspock · on Aug 3, 2015

A lot of people will miss this since it's posted on a weekend. If this doesn't make the Monday morning HN front page, I recommend waiting a week or so and reposting as a Show HN some weekday morning.

danso · on Aug 3, 2015

No...after the first X minutes, the option to change the title is lost. Maybe a mod will change it though.

Cogito · on Aug 3, 2015

...and you can always send an email to the mods to ask them to do this - hn@ycombinator.com

They tend to reply quickly and are very helpful, in my experience!

jackmaney · on Aug 2, 2015

> There is no requirements file with pandashells because some of the tools only require the standard library, and there's no sense installing unnecessary packages if you only want to use that subset of tools.

This is a terrible idea. One of the main features of package managers is dependency management. If you want to be as minimalistic as possible, separate the tools into further packages (some of which have their requirements laid out properly, and some of which only require the standard library).

mixmastamyk · on Aug 2, 2015

Try the `extras_require` parameter in setup.py, so the user can selectively install what they'd like:

    extras_require={
        'param_name': ['pkg1', 'pkg2'],
    },

They then can be used like so:

    pip install package[param_name]
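
To make that concrete, here is a rough sketch of how the extras might be grouped for a package like this. The group names (`plot`, `stats`) and the packages listed under them are purely illustrative assumptions, not Pandashells' actual dependency list, and `setup_kwargs` is a hypothetical helper.

```python
# Sketch of optional dependency groups for a setup.py.
# The group names and package lists are hypothetical examples.
EXTRAS = {
    'plot': ['matplotlib', 'seaborn'],
    'stats': ['scipy', 'statsmodels'],
}


def setup_kwargs(name, version, extras=EXTRAS):
    """Build keyword arguments for setuptools.setup().

    Adds an 'all' extra that is the sorted union of every group, so
    users can opt in to everything with one install command.
    """
    extras_require = {group: list(pkgs) for group, pkgs in extras.items()}
    extras_require['all'] = sorted(
        {pkg for pkgs in extras.values() for pkg in pkgs}
    )
    return {
        'name': name,
        'version': version,
        'install_requires': [],  # tools that only need the standard library
        'extras_require': extras_require,
    }


# In setup.py this would be used roughly as:
#     from setuptools import setup
#     setup(**setup_kwargs('package', '0.1.0'))
```

With a layout like this, `pip install package[plot]` would pull in only the plotting group, and `pip install package[all]` would pull in everything.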

robdmc · on Aug 2, 2015

thanks mixmastamyk. That's a good suggestion.

robdmc · on Aug 2, 2015

Thanks for the feedback. As you may know, getting matplotlib to work properly with backends like TkAgg can be quite a chore when installing with pip (non-Python libraries are needed). The Pandashells install process could be improved. I intend to create a conda package for Pandashells that should handle the dependency issue you raise, but I just haven't had time to do so yet. I am hoping the detailed examples in the docs will suffice until I do.

jackmaney · on Aug 3, 2015

Why focus on TkAgg support? Just write graphics to a user-specified file.

robdmc · on Aug 3, 2015

A native-ish backend like TkAgg is valuable because it instantly pops up an interactive plot, which I find really useful for the quick-n-dirty data exploration tasks I usually use Pandashells for. You don't need the interactive backend and can use the --savefig option on plots to save images or html, but that takes time -- and I usually want my results now!! :)

jackmaney · on Aug 3, 2015

It should be possible to pipe to file and then fire off a subprocess to open the resulting png.

In Windows, you can do:

    os.startfile(output_file_name)

to open the output file with the default Windows program for that file type (although if there is no registered program for files of that extension, you'll get the usual prompt asking which program you want to use to open the file).

Similarly, you can use `open` in OS X and `gnome-open` in at least some flavors of Linux (including Ubuntu).

Of course, you could always allow this command to be user configurable, with the defaults above depending on the OS in which the code is running.
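
Something along these lines might work as a starting point. It is only a sketch: `viewer_argv` and `open_file` are hypothetical names, and I've assumed `xdg-open` as the Linux default rather than `gnome-open`, since it isn't tied to one desktop environment.

```python
import os
import subprocess
import sys


def viewer_argv(path, platform=sys.platform, override=None):
    """Return the argv list used to open `path` in a viewer.

    Returns None on Windows to signal that os.startfile should be
    used instead of spawning a command. `override` is a user-supplied
    program name that takes precedence on every platform.
    """
    if override:
        return [override, path]
    if platform.startswith('win'):
        return None
    if platform == 'darwin':
        return ['open', path]
    # Assumption: xdg-open is installed on the Linux desktop in use.
    return ['xdg-open', path]


def open_file(path, override=None):
    """Open `path` with the configured or default viewer, without blocking."""
    argv = viewer_argv(path, override=override)
    if argv is None:
        os.startfile(path)  # only exists on Windows
    else:
        subprocess.Popen(argv)
```

Passing the path as its own argv element instead of formatting it into a shell string means file names with spaces don't need extra quoting.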

et2o · on Aug 2, 2015

This looks amazing! The amount of time I spend writing short python scripts to manipulate data in simple ways is way too big, and this looks like it might fit my needs precisely. Thank you for creating this.

undergrowth54 · on Aug 3, 2015

This is great! Do you want to make a Conda package for this or shall I?

robdmc · on Aug 3, 2015

I would love help making a conda package. I've never done that before. I didn't spend a lot of time looking, but I didn't find conda packages for some of the dependencies (e.g. gatspy). If you know how to work around that, I'd welcome the help.

wakatana · on Aug 3, 2015

Thank you very much for your work. One question: I've never used PowerShell because I'm a UNIX user, but isn't PowerShell capable of doing something similar (piping objects)?

robdmc · on Aug 3, 2015

I'm not sure. I've never used PowerShell either.