Hacker News
Embedded Python/NumPy in MonetDB (monetdb.org)
61 points by espeed on Jan 23, 2017 | 10 comments



Pushing calculations to the database is a great idea, and it's good to see it gaining more traction. Too often people pull back "all" the data just to iterate over it locally.
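To make the difference concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere; it is not MonetDB, and the `trades` table, its columns, and the function names are made up for illustration. The same idea should carry over to any SQL database.

```python
import sqlite3


def build_db(n_rows=1000):
    """Create an in-memory table of hypothetical (sym, qty) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trades (sym TEXT, qty INTEGER)")
    rows = [("A" if i % 2 == 0 else "B", i) for i in range(n_rows)]
    conn.executemany("INSERT INTO trades VALUES (?, ?)", rows)
    return conn


def totals_pulled_locally(conn):
    # Anti-pattern: fetch every row, then aggregate in client code.
    totals = {}
    for sym, qty in conn.execute("SELECT sym, qty FROM trades"):
        totals[sym] = totals.get(sym, 0) + qty
    return totals


def totals_pushed_down(conn):
    # Let the database aggregate; only one row per group is returned.
    query = "SELECT sym, SUM(qty) FROM trades GROUP BY sym"
    return dict(conn.execute(query).fetchall())


conn = build_db()
# Both approaches agree; they differ in how much data leaves the database.
assert totals_pulled_locally(conn) == totals_pushed_down(conn)
```

With a real client/server database the first version ships every row over the network, while the second ships one row per group.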

I looked at MonetDB as part of researching column databases in general. It was as fast as the commercial databases on some queries, which really impressed me. The main thing that put me off was that you couldn't seem to nudge the data layout or the query optimizer to do exactly what you specify. They seemed to have the idea of an all-knowing system that would optimize for you. That may work as a research idea, but for commercial use it worried me too much.

For anyone interested in column databases in general I put this list of comparisons together: http://www.timestored.com/time-series-data/column-oriented-d...


The list seems out of date: InfiniDB has been open-sourced since Calpont went bankrupt, and Greenplum is open-sourced under the Apache 2 license[1]. There's also MariaDB ColumnStore, a fork of InfiniDB[2].

1. http://greenplum.org/

2. https://mariadb.com/products/mariadb-columnstore


Is anyone using MonetDB in production? How does it compare to other column stores?


You can do this in Greenplum too.

I've always liked the idea of moving general compute to the database, but usually in an organisation the databases are pretty tightly locked away and it becomes a pain to get the right versions of your libraries installed.


I've never heard of this database, but it looks sort of interesting.


See this classic Google Talk by Peter Boncz, the creator of MonetDB (https://en.wikipedia.org/wiki/MonetDB)...

MonetDB/X100: a (very) fast column-store https://www.youtube.com/watch?v=yrLd-3lnZ58


It's nice if you're looking for an alternative to a paid columnar database (e.g. Vertica, Redshift). I used it on a side project and it performed surprisingly well for large queries (think sum/group by across tens of millions of rows).

Not sure how it works on a production system.


It's over 15 years old.


Still, it suffers from a lack of visibility. I'd only heard about it because one of our DB vendors in the past had mentioned integrating with it.

That said, it's had a lot of interesting development in the last few years (e.g. Ocelot, which tries to hardware-accelerate operations via OpenCL).


Yeah, I noticed that when I looked at the site. I've still never heard of it.



