This looks like a really useful library. A few questions: 1. What size limits/pr...

pudo · on Nov 12, 2013

Author here.

As for 1.: CSV files are encoded as a stream, so they can be as large as needed. JSON is dumped as a whole from memory; I'd be keen to see if someone has written a streaming JSON encoder.
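For what it's worth, here is a minimal sketch of what streaming JSON output could look like using only the standard library's `json.JSONEncoder.iterencode`. It is not how dataset exports JSON; `stream_json_array` and `fake_rows` are hypothetical names made up for illustration, and the generator stands in for a lazily read result set.

```python
import io
import json


def stream_json_array(rows, out):
    """Write an iterable of rows to `out` as one JSON array, one row at a time."""
    encoder = json.JSONEncoder()
    out.write("[")
    for index, row in enumerate(rows):
        if index:
            out.write(",")
        # iterencode yields the encoded row in chunks instead of one big string,
        # so only the current row is ever held in memory.
        for chunk in encoder.iterencode(row):
            out.write(chunk)
    out.write("]")


def fake_rows():
    # Hypothetical stand-in for a database cursor that yields rows lazily.
    yield {"id": 1, "name": "a"}
    yield {"id": 2, "name": "b"}


buffer = io.StringIO()
stream_json_array(fake_rows(), buffer)
result = buffer.getvalue()
```

In practice `out` would be an open file or an HTTP response body rather than a `StringIO`; the point is only that the full list never needs to exist in memory.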

2.: Consuming, no. I normally load them in a browser with D3 or jQuery to feed them into a graphic or other interface.

3.: I'd argue this is out of scope for dataset, but simpler REST API makers would definitely be cool. Check https://github.com/okfn/webstore - this is what dataset came out of, and it makes somewhat RESTish APIs.

yesbabyyes · on Nov 12, 2013

I'd be keen to see if someone has written a streaming JSON encoder.

This looks interesting: https://gist.github.com/akaihola/1415730

Edit: dataset looks like a really interesting library!

aet · on Nov 12, 2013

I'm curious about the relative advantages/disadvantages over something like SQLAlchemy.

piqufoh · on Nov 12, 2013

From an end-user point of view, SQLAlchemy relies on you first defining your models in the ORM (object relational mapping) and then SQLAlchemy will take control of issuing the SQL to create, update and drop tables depending on your interactions with your Python ORM models.

From what I can read, it seems that this cool-looking tool allows you to use SQL as a kind of object-free data store, maybe not unlike a Python wrapper for a NoSQL DB (freeing you from first defining your models, and then ensuring that the SQLAlchemy functions have updated your DB).
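To make the "no models up front" idea concrete, here is a rough sketch of schema-on-insert written against the standard library's `sqlite3` module. It only illustrates the general technique and is not dataset's actual implementation; `insert_row` is a hypothetical helper, and the table and column names are invented.

```python
import sqlite3


def insert_row(conn, table, row):
    """Insert `row` (a dict), creating the table and any missing columns first."""
    # Identifiers are interpolated into the SQL text, so this sketch is only
    # suitable for trusted table and column names.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (id INTEGER PRIMARY KEY)')
    # Column names are the second field of each PRAGMA table_info result row.
    existing = {info[1] for info in conn.execute(f'PRAGMA table_info("{table}")')}
    for column in row:
        if column not in existing:
            conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}"')
    names = ", ".join(f'"{c}"' for c in row)
    marks = ", ".join("?" for _ in row)
    conn.execute(
        f'INSERT INTO "{table}" ({names}) VALUES ({marks})', list(row.values())
    )


conn = sqlite3.connect(":memory:")
insert_row(conn, "people", {"name": "Ada"})
insert_row(conn, "people", {"name": "Linus", "age": 43})  # adds the "age" column
rows = conn.execute("SELECT name, age FROM people ORDER BY id").fetchall()
# rows == [('Ada', None), ('Linus', 43)]
```

The first row gets `None` for `age` because that column did not exist when it was inserted.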

dragonwriter · on Nov 12, 2013

Since tables are created and modified on insert commands, there doesn't seem to be any possibility of maintaining integrity at the DB level. That would seem to be the main disadvantage compared to any approach that uses a schema defined in advance. You still get an RDBMS's advantages for ad hoc queries, but not integrity.
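As a small illustration of what DB-level integrity buys you, here is a sketch with a schema declared in advance, again using plain `sqlite3` rather than dataset or SQLAlchemy. The table layout and the sample rows are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A schema defined up front can carry constraints that the database enforces.
conn.execute(
    "CREATE TABLE people ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0))"
)
conn.execute("INSERT INTO people (name, age) VALUES (?, ?)", ("Ada", 36))

rejected = []
for bad_row in [(None, 20), ("Bob", -5)]:
    try:
        conn.execute("INSERT INTO people (name, age) VALUES (?, ?)", bad_row)
    except sqlite3.IntegrityError:
        # NOT NULL and CHECK violations are both refused by the database itself.
        rejected.append(bad_row)

count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

Only the valid row is stored. Columns that are added on the fly at insert time would typically carry no such constraints, so the same bad rows would be accepted.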

MPetitt · on Nov 12, 2013

Well, judging by the "shoulders of giants" comment at the bottom of the page, this seems to build heavily on SQLAlchemy. Whether that is in a philosophical way or a technical way, I haven't looked into it enough to find out, but it would be nice to know how the two compare.