Hacker News new | past | comments | ask | show | jobs | submit login
The spreadsheet as a minimum viable CMS (medium.com/jkalven)
122 points by snowwrestler on Oct 16, 2015 | hide | past | favorite | 39 comments



I built the content for an entire news app with just Google Sheets: http://projects.propublica.org/sopa/

I imported the list of Congressmembers from Sunlight Foundation (which is itself a spreadsheet [1]...then I wrote a script to pull campaign finance data from OpenSecrets using the unique identifiers in the Sunlight sheet [2]. I believe I used the NYT's Congress API [3] to get term data and votes-with-party percentage...though at the time I hadn't known about how to get the data from www.govtrack.us (which has bulk downloads of bill/vote data via rsync)

Then for the rest of the data (news items that featured a given Congressmember), I just researched manually and entered it in by hand into another sheet. I eventually imported everything into a database to make it a Rails app because that's the only way I knew how to build an app at the time...but the laborious part, the research and data entry, was made possible through the use of a spreadsheet...I didn't have to spend time building an admin interface that, no matter how well designed, would have almost been certainly klunkier than using Google Sheets.

The tradeoff is that you have be disciplined in your data entry process...i.e. unique identifiers have to be spelled consistently, as you don't have the ability to enforce constraints or enumeration (well, not without writing a lot of custom JS to run inside of Google Sheets) the same way you do with databases. This isn't too hard if you're working by yourself and you're proficient with keyboard shortcuts (Cmd-C, Cmd-V, Cmd-Tab, particularly)...but it's not easy to bring other people into the project, ad-hoc.

[1] https://sunlightlabs.github.io/congress/#legislator-spreadsh...

[2] https://www.opensecrets.org/resources/create/apis.php

[3] http://developer.nytimes.com/docs/congress_api

Today I would most definitely not have moved it to Rails, as it was a relatively small dataset and the app didn't need anything besides static pages. I now do almost all my medium-to-small apps in Middleman, which is a slightly more complicated version of Jekyll (basically, you can execute Erb instead of being restricted to Liquid). Usually I start off with a spreadsheet, but if the dataset is small enough, I'll record the content in YAML.

https://middlemanapp.com/

edit: added mention of the NYT API


"The tradeoff is that you have be disciplined in your data entry process...i.e. unique identifiers have to be spelled consistently, as you don't have the ability to enforce constraints or enumeration"

This is a really good point and one I should have included (I wrote the piece above). There is no easy way to validate syntax and spelling in the spreadsheet and we would often find ourselves hunting around, trying to find the typo that was breaking a particular project. We obviously got better at avoiding this situation as time went on, but it made the learning curve all the more difficult for new users.


Let me point out that Excel has had data validation in spreadsheets for years. It is trivial to enforce basic data validation without writing any code.


Pivot tables can help there. If you pivot on color, you will more easily see that lone "grey" between the "gray"'s.

You can also copy-paste-as-values a known-good one, and compare what you have know against what you had last week.

And of course, you can enforce an enumeration in Excel. See https://support.office.com/en-us/article/Apply-data-validati...


When I attempted a similar setup (Middleman + Google Sheets) I was quickly cornered into using fuzzy string matching to load data successfully. I ended up adding unplanned data validations that enabled the app to suggest corrections, but in retrospect I wish I'd done more validation in the spreadsheet itself for a tighter feedback loop early on.


Have you looked at Webhook?

http://www.webhook.com/


Been doing that for some small sites with Google Sheets, Jekyll and Netlify (https://www.netlify.com).

Wrote this plugin for Jekyll that'll let you use a Sheet as a data-source when building the site:

https://github.com/netlify/jekyll-gdrive

Netlify can run Jekyll builds with custom plugins (unlike GitHub pages) and you can setup an inbound webhook to trigger a build.

Once the webhook for building the site is in place, you can add a script like this to the Google Sheet:

    function triggerBuild() {
      var url = "BUILD_HOOK_URL_HERE";
      UrlFetchApp.fetch(url, {method: "POST"});
      Browser.msgBox('Your site is being updated now. Changes will be live in a minute.');
    }
And assign it to an image of a publish button.

Now content editors can work in the spreadsheet, and press "Publish" to trigger a new build and deploy :)


The first "dynamic" website I ever built, circa 1998, was powered by an Excel spreadsheet. It was for an online gaming league, and I needed a fast way to create HTML pages with the various league tables on them.

The way I built it was to represent my information in spreadsheet cells, then add a final column with a formula that concatenated together a string of HTML using the content of the cells in that row to create a <tr><td>...</td><td>...</td></tr> chunk of HTML.

Then I just had to "fill down" the formula, then copy and paste the resulting lines into the "PASTE HTML HERE" section of my HTML page and FTP it up to the server.

It worked surprisingly well. So much so in fact that I reused the same technique with a Google Sheet for a small internal web page just a few weeks ago.


Not a user, but no mention of AirTable? https://airtable.com/


Wonder why they chose to make individual elements rather than a HTML5 video for their animation.


This is pretty cool. I currently work at a place that builds a similar solution but for mobile apps effectively using Excel, or Google Sheets as the CMS. It's called AppSheet (https://www.appsheet.com) We've managed to make the "translation" from tables to a mobile paradigm quite efficient. but I see the biggest challenge in applying logic (that many people use in the spreadsheet itself) to the app. Very interesting to see how people use similar solutions. We are moving to Expressions to solve for logic limitations which would make it more robust for an advanced spreadsheet user.


Love Appsheet! My wife uses it for her nonprofit New Incentives.


"One thing we valued about that system, for instance, was the birds-eye view it offered of the story and its underlying components."

Possibly of interest as November looms...

http://emmadarwin.typepad.com/thisitchofwriting/2010/05/help...

Not a spreadsheet as such but a similar time-line based planning tool.


Another pattern you can use (and we do for one of our apps) is to make a google sheet with all the information you need and then write a python script which pulls down the information and transforms it into the json/xml/whatever you need for your relevant application. It should validate it at this time as well.

This allows non-technical team members to add/edit information that the technical team can then import in validated batches.


I maintain Grille, a simple Node.js CMS backed in Google Spreadsheets. It's primarily intended for use with games and tabular content, but building a blog with it wouldn't be too hard.

https://www.npmjs.com/package/grille

Data is loaded into memory, and reloaded at the call of a function, so lookups are quite fast.


Nice one, I made something initially very similar to what I see you made with grille in drive-db[1]. However I changed from an 'update the data when you call a function' to 'set a maximum-life for the data' format now as I just found myself making that new version manually in every project. The name will be changed from `timetout` to `cache` though as the name is confusing even for me.

When do you normally update the data? Are you using it in any public-facing project?

Also a couple more of differences I've seen:

- I store it in a file then retrieve from file or remote depending on the last time it was retrieved. This gives a mixed performance locally, but remotely not so much as server-server is quite fast. I might try storing in memory as you though, that should be way faster

- I use a mongodb-like syntax for finding, which allows for (I think) simpler use, but your syntax for sure allows for simpler debugging as you can see the data 'as-is'.

- Grille allows for more flexibility, but it also looks more complex. So our demo files are completely different [2]

So basically every advantage has a disadvantage (:

[1] https://github.com/FranciscoP/drive-db [2] https://docs.google.com/spreadsheets/d/1fvz34wY6phWDJsuIneqv...


I'm on a project at the moment where google spreadsheets is the starting point for all content that goes into our database.

Steps go something like this:

1. Scrape data from various places into CSV 2. Import CSV to google sheets 3. Manual clean / Visually inspection by human. They backfill any missing information 4. User scripts for google sheets (like geocoding addresses) 5. Point python script at sheet URL an import data to Postgres 6. Done

This workflow works extremely well because we outsource some of the data cleaning/backfilling on Upwork. These days everyone understands spreadsheets, so there is very little training involved.


Nice writeup, and congrats on the launch of Stacker.

It seems like any app that needs some flexibility in the data model evolves to contain a spreadsheet-like UI. A lot of CRMs in particular head in this direction; look at RelateIQ or Streak.

The app I'm working on, Fieldbook, is also spreadsheet-inspired. Not a CMS (yet) but it is good as a tracking tool (for tasks, recruiting, investor conversations, etc.) Still in private beta but here's an invite for Hacker News folks: https://fieldbook.com/?bc=HN0816


Nice app. I like the relational model that links two sheets. Simple.


Slightly OT, but it seems like this company has several people on staff (the article mentions at least 3 developers, plus content people, plus the guy writing - so at least 5, maybe even 10 or so).

Is it possible to sustain a business of that size on a tool as niche as this one? You need $1mln revenue a year at least just to stay afloat, and that's if you're in a not very expensive area, which would put you far away from your clients. How many customers could such a product have, and how much would they pay per year? I can't see how the numbers could work.


Marc Canter [1] [2] once pointed out to me in an accidental moment of candor that a Director timeline was just a BASIC program turned 90 degrees counter clockwise, so the line numbers increase from left to right.

[1] http://www.wordyard.com/dmz/canter.html

[2] https://en.wikipedia.org/wiki/Marc_Canter


I heard once about someone setting up a ghetto CMS by hosting a static site on Amazon S3 and then having it pull dynamic content from the Google Docs "Sheets" API somehow. Then you could edit Google spreadsheets to fill it in.

Ghetto as hell but it worked and is almost free to host.


What makes a technology "ghetto"?


The definition I'm aware of is being hacked together (rapidly and cheaply) and flimsy, as opposed to engineered and robust.

Not to be confused with something that is "engineered" (as in, plenty of resources are dedicated to it) yet still flimsy, like so much of the software we know.


In certain circumstances it means an inelegant but effective and cheap workaround, so basically one of the sub-meanings of "hack"


Being inexpensive and fast apparently..


Not quite a CMS but have you guys checked out https://gingkoapp.com/? It looks more like org mode in browser and helps edit lists.


I once used Google Forms and Sheets to let users add data for a website (for which I had no control) and show the data with a userscript. You can even send simple sql in the url and get only the data you need.


Same logic, but many times more productive : http://www.prismic.io


I certainly use Excel to maintain a few wiki pages on my work intranet - generating most of it from sheets using VBA. Works pretty well.


I was always kind of surprised that apps that use MS Office as a front-end for managing online data never really became a thing. Given how many gazillions of people are familiar with Office, and how much you can accomplish behind the scenes with tools like VBA it always felt like something that would inevitably emerge someday. But it never really did.

Maybe the problem was that gazillions of people are familiar with Office, but none of them actually like it? Or as we might term it, the "Lotus Notes Problem."


Excel has that functionality, and has had it for a long time. It is used a lot, at least I've seen it used a lot, for internal functions such as workflow MIS (on wall mounted displays no less, that no one knew was Excel as we'd hidden anything Excel-like, like grids and menus / title bars, etc -fantastic to hack a quick solution, for example a visitor was coming, and we'd switch to a one-off sheet "Welcome XXX visitor" for 20 seconds through the day, that hack took 5 minutes, super easy) and general data functions.

As well as connections to conventional databases, SharePoint integration is a big thing - have a little macro that does a refresh (and a second one that mimics the movement of a mouse so the mandatory screensaver doesn't turn on).


The problem could be the VBA hurdle. It's enough of a hurdle that you would need an actual developer to go beyond basic functionality. At that point, a developer would just recommend you use whichever hammer the developer is holding or is most interested in at the moment.

The exception to this seems to be MS Access. I have come across multiple situations in which business users end up using MS Access for applications. I don't know if it was set up for them by developers initially, but it was clear to me that they continued to enhance it based on their needs without developer assistance.


I used to do a lot of Excel programming, and I found the environment to be extremely unstable. I've had spreadsheets with thousands of entries with data that would just become corrupted or erased for no discernible reason. It really turned me off from using it as a regular tool.


True, but that's not really what I was getting at -- I meant, just using Office as a front end for managing data that's actually stored somewhere else, like in a real database that cares about things like "consistency" and "durability."


I was doing a lot of this 12-14 years ago, building Access VBA front-ends to remote data sources in Oracle or MySQL (not PostgreSQL because its ODBC driver was practically unusable) and using Word and Excel for reports and publishing.

It was a fast and easy way to build a fairly complex GUI, much better than anything we have for the web. It was also very stable—except for COM Automation, which was slow and error-prone (e.g. having Access drive Word for a mail merge.) Many of these applications are still in use to this day.

What killed it for me was having the code stuck in the GUI builder rather than in text files. This meant that source control and testing demanded a lot of painful manual drudgery (and discipline, which not all the developers working on these applications had.)


I failed to find any links in the article to these visual stories that the author talks about creating. Are they still online?


Why not use a wiki for this purpose? It seems to me that it is a more natural fit.


Hey, this is Josh Kalven (I wrote the piece above). One of the benefits of the spreadsheet, over a wiki, is that the numbered row structure could match our numbered slide format. This made it very easy to keep everything organized, particularly when assigning transitions, images, sources, etc. to each individual slide.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: