The interesting ideas in Datasette (2018) (simonwillison.net)
86 points by Tomte on April 4, 2022 | 15 comments



Oh wow, I wrote this way back near the start of the project. I've been planning to write an updated version, and maybe incorporate that into Datasette's official documentation.

Most of the ideas in there are still used by the core project, with the exception of one: the "Far-future cache expiration" item.

That's the thing where Datasette rewrites the URL to incorporate a hash of the database contents, then serves them up with far-future cache expiry headers.
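A minimal sketch of that idea, not Datasette's actual implementation: derive a short hash from the database bytes, put it in the URL path, and serve the response with a long `max-age`. The function names, the 7-character hash length, and the one-year expiry are all assumptions chosen for illustration.

```python
import hashlib

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumed "far-future" lifetime


def content_hash(data: bytes, length: int = 7) -> str:
    """Short hex digest of the database contents (length is an assumption)."""
    return hashlib.sha256(data).hexdigest()[:length]


def hashed_path(db_name: str, data: bytes) -> str:
    """Hypothetical URL scheme: /<database>-<hash>. New contents give a new path."""
    return f"/{db_name}-{content_hash(data)}"


def cache_headers(max_age_seconds: int = ONE_YEAR_SECONDS) -> dict[str, str]:
    """Headers telling caches they may keep the response for a long time."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}


path_v1 = hashed_path("mydatabase", b"contents, version 1")
path_v2 = hashed_path("mydatabase", b"contents, version 2")
headers = cache_headers()
```

Because the path changes whenever the bytes change, a cached response can never be stale for an immutable file. That guarantee is what breaks once the database can be written to while it is being served.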

It used to be default behaviour, but I changed that when Datasette started allowing plugins to make writes to databases in February 2020 https://simonwillison.net/2020/Feb/26/weeknotes-datasette-wr...

A few weeks ago I removed the remains of that feature from core entirely, and instead made it available in a Datasette plugin: https://datasette.io/plugins/datasette-hashed-urls

That 2018 blog entry also talks about "Bundling the data with the code" - I've since expanded that idea into something I call the "Baked Data" architectural pattern: https://simonwillison.net/2021/Jul/28/baked-data/


I use Datasette almost daily (thanks simonw!) for ad-hoc queries and for validation.

Often I'll have two APIs from different systems, which I may or may not have access to directly. Typically I'll need to denormalise the data and dump it to some CSVs, use csvs-to-sqlite (from the Datasette ecosystem) to convert them to a SQLite database, and open that up in Datasette. I've discovered errors that had remained hidden for months (different teams etc etc) using this workflow.
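A rough sketch of that validation workflow, using only the Python standard library in place of csvs-to-sqlite and the Datasette UI. The table names, columns, and sample rows are invented for illustration; the `load_csv` helper is hypothetical and stores every value as text.

```python
import csv
import io
import sqlite3


def load_csv(conn: sqlite3.Connection, table: str, csv_text: str) -> None:
    """Create a table from CSV text, using the header row as column names."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)


conn = sqlite3.connect(":memory:")
# Invented exports from two systems that should agree with each other.
load_csv(conn, "system_a", "id,email\n1,a@example.com\n2,b@example.com\n")
load_csv(conn, "system_b", "id,email\n1,a@example.com\n")

# Rows present in system A but missing from system B.
missing = conn.execute(
    "SELECT a.id, a.email FROM system_a AS a "
    "LEFT JOIN system_b AS b ON a.id = b.id "
    "WHERE b.id IS NULL"
).fetchall()
print(missing)
```

The same `LEFT JOIN ... WHERE b.id IS NULL` query could be pasted into Datasette's SQL editor once both CSVs are loaded into one database.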

Also, because the queries are URL-encoded, I'm considering using it as a 'fake' backend for some proof-of-concepts, backed by a couple of cron jobs running the pipeline.
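As a sketch of what "URL-encoded queries" means in practice: Datasette exposes SQL queries through a `.json` endpoint that takes the query in a `sql` parameter, with named parameters passed as extra query-string arguments. The host, database name, table, and the `query_url` helper below are assumptions for illustration.

```python
from urllib.parse import urlencode


def query_url(base_url: str, database: str, sql: str, **params: str) -> str:
    """Build a URL for a Datasette SQL query returning a JSON array of rows."""
    query = {"sql": sql, "_shape": "array", **params}
    return f"{base_url.rstrip('/')}/{database}.json?{urlencode(query)}"


# Hypothetical local instance, database, and table.
url = query_url(
    "http://localhost:8001",
    "mydatabase",
    "select * from orders where status = :status",
    status="open",
)
print(url)
```

A proof-of-concept frontend could fetch that URL directly, which is what makes the 'fake backend' idea workable without writing any server code.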


Related:

The interesting ideas in Datasette - https://news.ycombinator.com/item?id=18141571 - Oct 2018 (11 comments)


I don't get Datasette. I had a small dataset I wanted to use with it, hoping it would be some sort of moderately advanced FOSS BI software, but all it did was give me a table view.

I haven't managed to find a use case for it.


I'd really love to hear more about your use-case and expectations here.

The feature I use most often for exploring new data is facets - I explain those in the tutorial here: https://datasette.io/tutorials/explore#using-facets
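Conceptually, a facet is a count of rows per distinct value in a column. The sketch below shows that idea with plain `sqlite3`; it is not the query Datasette itself runs, and the `facet_counts` helper, the `limit` default, the table, and the sample rows are all invented for illustration.

```python
import sqlite3


def facet_counts(conn: sqlite3.Connection, table: str, column: str, limit: int = 30):
    """Return (value, count) pairs for a column, most common values first."""
    sql = (
        f'SELECT "{column}" AS value, COUNT(*) AS n FROM "{table}" '
        f'GROUP BY "{column}" ORDER BY n DESC, value LIMIT ?'
    )
    return conn.execute(sql, (limit,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plants (name TEXT, fuel TEXT)")
conn.executemany(
    "INSERT INTO plants VALUES (?, ?)",
    [("A", "Solar"), ("B", "Solar"), ("C", "Wind"),
     ("D", "Coal"), ("E", "Solar"), ("F", "Wind")],
)

counts = facet_counts(conn, "plants", "fuel")
print(counts)
```

Selecting one of those values in the interface then corresponds to adding a `WHERE fuel = ?` filter to the table view.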

By default Datasette doesn't include visualizations, but those can be added as extra plugins. Two of the most popular are https://datasette.io/plugins/datasette-vega for charts and https://datasette.io/plugins/datasette-cluster-map for geographical data, e.g. on https://global-power-plants.datasettes.com/global-power-plan...

In terms of use-cases: I've recently started asking people who use it what job they hired it to do. One of the leading answers is for data publishing - they use it as a tool to quickly publish data online for others to use:

    datasette publish cloudrun mydatabase.db --service my-database
https://docs.datasette.io/en/stable/publish.html


Thanks for replying, Simon! I guess my use case was a bit of a toy: I don't have a huge dataset to use facets on. I basically just made a small spreadsheet of blood tests and wanted to see if I could navigate the data more easily than in Excel. I did install Vega, but it had basically one way to display data (I didn't look into it too much, though).

I think my use case was just not suitable for it. From your answer, I guess it's more for people who have huge datasets they want to publish. I imagine it will be much better suited for that purpose, though I tried the demo a few times and couldn't really think of any interesting questions to ask it, so I didn't get it there either...

Datasette is one of those things that sounds very promising, so I want to explore it, but I end up walking away every time. It just sounds like it's not for me, though.


Yeah for that kind of data Excel will definitely be more useful.

Datasette starts getting interesting with data that's too large for Excel - or data with multiple tables where SQL joins start coming into play.


That makes sense, I'll see if I can find a very large dataset I'm interested in to try it, thank you!


I had the same vision last year and after talking with Simon and other contributors, I came up with the Datasette Dashboards plugin: https://github.com/rclement/datasette-dashboards

It is still very alpha but usable if you know Vega syntax.

Looking for some contributors to bring Datasette to the level of Metabase!


I hadn't checked this out in quite a while and WOW - it is looking so good! That demo really is beautiful: https://datasette-dashboards-demo.vercel.app/-/dashboards/jo...



Thank you Simon, this is truly appreciated!



Oh that looks great, thank you!


Maybe it would be interesting to replace SQLite with DuckDB, since both of them are in-process databases, but SQLite is OLTP and DuckDB is OLAP. I wager most interesting queries, like facets, will be better served by an OLAP database: https://simonwillison.net/2018/May/20/datasette-facets/



