Hacker News new | past | comments | ask | show | jobs | submit login
Excel as a database (wyorock.com)
162 points by coldtea on April 8, 2013 | hide | past | favorite | 108 comments



I really wish I couldn't say this literally, but I've been witness to that comic strip in real life.

A few years ago I was asked to build a database in Excel which would import spreadsheets that clients submitted, accumilate all that data then spit out a whopping great big CSV file every month. And the most insulting thing of all was that I had to build this abomination using VBA macros so I could guarantee that the CSV output was clean enough to be exported into an Oracle RDBMS we also ran. I frequently raised the point that we should be running all of our databases on Oracle, but I was refused every time (I could write a whole essay on their dumb rational behind this, but that's another rant entirely).

So off I set writing one of the most complicated Excel macros of my life (and wondering what I'd done in a past life to be lumbered with such moronic management). I wrote every piece of data cleansing I could imagine knowing full well that the spreadsheets our clients would be submitting are bound to have had their formatting messed around with (and trust me, I tried everything I could to lock that template spreadsheets down). Yet, and without fail, every month at least 10 spreadsheets would be rejected from my macro's import routine because the data was so garbled that it was beyond cleansing.

Usually it was a simple job to fix the spreadsheets - or a simple phone call to the client where I verbally slapped them as much as I dared; vainly hoping that might serve as a warning to them that they shouldn't be cocking about with the layout of the template. Needless to say, it never had any affect.

However this one time I was completely baffled. The spreadsheet look fine at first glance, yet my routine rejected it. Sure, I couldn't click inside any of the cells, but the spreadsheet was locked from editing so that was to expected......or was it?

It turned out that the idiot who submitted that spreadsheet printed out their completed template, then scanned it in on a MFD (a scanner with bells and whistles). Then got that MFD to e-mail them a PDF of the printed page and now scanned page so that they could then embed that back into the fucking template.

All I asked them to do was fill a couple of columns of data then e-mail me the spreadsheet as an attachment. But nothing is too simple when you're dealing with users who think they understand computers.


I remember one user a decade ago who I set up a process for that was basically, "Run this report, verify that it looks OK, then send it there" who was unable to do so. Repeatedly.

Turns out that she was opening a CSV file in Excel, saving it, and then it couldn't be parsed.

It took a surprising amount of training for her to understand that this really was "changing something". When she eventually got it she asked why she was supposed to be able to look at it at all. My reply is that that was the spec, and someone higher up thought she should see it.

She stopped looking at it and the problem went away.


a line of copy I added to a client's website last night

Please do not simply rename the file extension of your file to match the formats we require. They will not open.

Apparently there are plenty of people who thing changing the file extension from .docx to .psd changes to format. Windows (and OSX too, I guess) are partially to blame here as they'll happily change the icon


To give the complete picture though, recent Windows warn you before applying the changes to the file extension.

Although the users would still blindly click "Yes" even if the dialog asked for their firstborn child.


don't get me started...

Now here's an idea for any software developers - grab AutoIt (is it still around?) and make a quick script called "typical_user" that automatically clicks "Yes" (or similar) on any dialogs that appear. Now, run this in the background while testing the UI of your software :)

Edit: Yes, AutoIt still exists! http://www.autoitscript.com/site/autoit/


Or try http://www.sikuli.org/

For automation, it's replaced AutoIt in my toolbox - just in case I get another contract in the future like one of these.


OSX warns you when you change the file extension too, though it doesn't warn that it may not work, it just says it'll open in a different application.


Did you read that EULA then?


You say that as if your users will actually read that. I find that the ones who need the most help don't read anything you've written and expect to follow the process they have in their mind no matter what happens.

I have literally read the message on the screen to someone to solve their problem.


I know it's fun to beat up on people who use their software meant for summing accounting tables as simple databases, but really the alternatives in these small cases usually introduce so much friction to the user and require so much work to get up and running as to not be worth it.

Even worse, most of the "right" solutions eliminate the easy look at dozens of rows that a spreadsheet offers (letting the user do fast unconscious visual comparisons in their own data) vs. showing what's essentially one row out of some table at a time.

Sometimes a simple grid that users can quickly navigate around, enter stuff into and see all of is the best solution.

The problem is that we (the tech industry) seem to only offer two choices, do it a) the wrong Excel way b) do it the huge giant IT initiative way...

I bet there's huge money to be made in a simple spreadsheet-like user interface that can be setup with similar constraints as a database (field types, foreign key constraints etc). That's all it seems both sides really want.


"I bet there's huge money to be made in a simple spreadsheet-like user interface that can be setup with similar constraints as a database (field types, foreign key constraints etc)."

That's exactly what Access does (and Filemaker and literally 1000's of other packages over the last 2 or 3 decades), and still 'real' computer people piss on them and the people using it. And they're too difficult for 'normal' people to use beyond flat tables and forms, so in the end you still end up with a single choking point of one person who knows how these 'applications' work.

The Excel way is in may cases the right way, from a business POV, and the collateral damage here are the developers who have to deal with the quirks of such a workflow.


Companies are in the business of software now whether or not their direct business is making/selling software. Its worth it -most of the time- to have a technical expert on retainer much like they would a maintenance person for any other skilled task they have in times of need.

If we're working in a computer age of technology you can either fight it and fall behind, try to do things, doing them wrong, and fall behind, or suck it up and hire professionals for doing jobs a professional should do.

Sorry for the mini-rant. I'm working for a small company right now as "the software guy" because everyone else here doesn't care about keeping things efficient. I don't blame them, they have their jobs to do and dealing with software sides of things for 2 hours every day isn't ideal nor appealing. But when you have a software expert on staff and you refuse to take his recommendations don't come crying to him when your half-baked workaround backfired.

Maybe the point I'm aiming to make is developing database systems that are full featured yet still usable by people who can't mentally wrangle data and storage is a lost cause. I know we aim to empower users with their data but at times things become too vast and too important for a person without formal experience in large databases to "get their hands dirty" at the risk of blowing everything up.


This is actually something that Access did fairly well. If someone made a nice frontend like it to SQLite you'd end up with a magic bullet for this. Access did very well as a frontend but on the backend it was horrible and unreliable for anything non-trivial. I've actually been thinking about QT's QML for this since it would let you build the interface like access did and has bindings for SQLite. Might be worth doing some real work on rather than just thinking.


Have you seen openoffice base? Not as fancy as access, but it can use a lot of backends. Anything that supports openodbc will work too.


Access can do this too.


Yeah, SQLite is far better technically compared to Access.

It would be an amazing tool.


When I used to be an "IT guy", everything had to be in Excel. I begged to switch to Access. Hell, I even built a few prototypes in Access in my free time to demonstrate how much better it would be (I was still a student).

My point: Over time, I came to appreciate Excel and realize what an amazing program it is. It is accessible to pretty much anyone who knows how to use a computer, and at the same time can basically be used to create mini-programs.

That's the reason you only have the two choices you mentioned: a) do it the wrong Excel way b) do it the huge giant IT initiative way

Also, Excel does provide the ability for both the data type and foreign key constraints you mentioned.


Also, not many people realise, but you can build forms in Excel which makes data entry much nicer


And then someone emails 'the' spreadsheet to someone else, and now you have 2 different versions and no rational way to reconcile them. neither is authoritative anymore, and one probaby calculates the average of a list of phone numbers. (actual example...)

They're great up to a point, and that point is sharing. I'm actually surprised that google docs hasn't picked up more traction for that reason alone. sharing a google docs spreadsheet doesn't cause divergence.


Maybe there should be a book.

Doing the wrong thing the right way - How to use Excel in ways it was never intended.


I am sceptical about your final sentence, are you talking about anything other than the broad brush data validation constraints provided by Excel? If so please give examples.



The problem is that we (the tech industry) seem to only offer two choices, do it a) the wrong Excel way b) do it the huge giant IT initiative way...

And even when b) is chosen, when users log into the multi-million dollar CRM system, the first thing they reach for is still the "export to Excel" button.

I see it as a failure of education. I have no CS background and when I was starting out at my first job, I would love to have known then what I know now about databases and database-driven application programming. I would love to have had a proper class or two about it in school. But I didn't, so all I knew was the bad, wrong, "e-mail the spreadsheet" way. At my old company we had people whose entire job was basically to take a huge Excel file, turn it into a pivot table (basically a simple cross-tab query) and e-mail the result to people periodically.

I love Access forever because it sucked me into programming. I heard that it was a database but I didn't know what that meant and didn't have anyone who could teach me, so I learned it just by poking at it when I was bored at work until it started to make sense. Then I used it to mash two customer data sources together and tie a sales lead tracker to it--basically a quick and dirty replacement for our slow, over-engineered Oracle CRM system that no one actually used. I know now that what I made was also wrong in many ways and would make a real developer recoil in horror, but it was great fun to make and people actually used it. And it got me interested in databases, which got me into making LAMP apps, which got me into Rails, and so on...


In addition to Access, FileMaker might be an alternative here. It's easy to learn, cross platform, has simple and approachable design tools, field types, relations, SQL connectivity and is hands down the best solution when it comes to getting results fast.


See also: YC request for startups, #22...

"22. A web-based Excel/database hybrid. People often use Excel as a lightweight database. I suspect there's an opportunity to create the program such users wish existed, and that there are new things you could do if it were web-based. Like make it easier to get data into it, through forms or scraping.

Don't make it feel like a database. That frightens people. The question to ask is: how much can I let people do without defining structure? You want the database equivalent of a language that makes its easy to keep data in linked lists. (Which means you probably want to write it in one.)"

http://ycombinator.com/ideas.html


That's a good idea, but it's a bit scary.

I've seen people use spreadsheets as word processors. They don't know how to layout a document in a word processor, but this spreadsheet lets them put words wherever they want.

Sometimes people misuse spreadsheets. I guess it'd be really good to have some kind of transition from spreadsheets to more suitable tools. (Perhaps using spreadsheets as examples an metaphors?) This doesn't have to be software, it could be a short course on a website.


I know an accountant who knew how to use word processors but preferred to write essays in Excel. It's definitely peculiar.


I'm still sad dabbledb.com folded up and went to work for twitter:

http://www.youtube.com/watch?v=MCVj5RZOqwY


I just started a customer on Smartsheet.com, it is actually more or less exactly the above. There's webforms for data submission, special column types (contact dropdown lists, checkboxes etc). It's been running for a few weeks only, but so far its been a joy to use compared to the myriad of Excel files that were used before.


There are a lot of really reasonable reasons to use Excel — it empowers the non-technical user, is flexible and extensible, and can easily wrangled into supporting edge cases.

Is it a poor replacement for a real database and application? Frequently, yes. But it frequently isn't realistic to spec & build a new application to support a growing business process.

All in all, I'm not clear why we don't celebrate how empowering Excel can be to many business users, even if it can be frustrating to deal with when it evolves into a core piece of a critical business system.


We don't celebrate how empowering Excel is because it becomes a giant mess in almost no time at all. Then it's dumped on people like us to fix.

Excel is a fantastic tool, no doubt. It's powerful, flexible and quick to knock up. But it scales abysmally. And that's through no fault of it's own because spreadsheets aren't meant to scale, they're meant to convey tabulated data.


Sounds like what you need is a database for inputting data that doesn't require an "application" to be built around it. I guess this is what access was trying to be, but it is to technical for non techies and to non technical for techies.

Perhaps even something that looks like excel but is stricter with it's inputs and some easy way to handle relations.

Even showing somebody rails scaffolding can get you a fair way.


Given the culture of Microsoft, Excel is exactly the tool that its users need it to be. MS isn't going to take chances with a major profit center like Office. I challenge anyone to make Excel as good as it is for all of its users.


I wouldn't expect such a tool to come from Microsoft, could be fertile ground for a startup maybe? The hard part is of course gaining the acceptance of existing excel users which is why I would make it look not too dissimilar from excel.

The complicated technical part is making something that enforces strong schemas and good support for concurrent users without feeling complicated or overly constrained.

It doesn't necessarily have to do everything that excel does, just be good for use cases like "make a customer list that everyone from the CEO to the new temp can use at once"


Unfortunately, computer geeks like us are far more interested in making interesting features than making things discoverable and accessible. It's hard enough to get an OSS dev to make a GUI, let alone a good one.

I think the most prescient point so far has been that Excel quickly gives you a sanity check of you data, because it gives you the full view of days right away. Working on an ERP rewrite right now, this has been our continual issue, making all of the data visible, in ways that the old system never did.


Also known as google wave


>All in all, I'm not clear why we don't celebrate how empowering Excel can be to many business users

Sometimes we do.

Not all times.

This is one of those "not all times". In this instance we lament the misuse of Excel by business guys.


The reasons why people use Excel are actually simple and reasonable:

- it's easy to use

- everyone has it

- it maps nicely to the skills user have (like moving files around to copy data)

- it's flexible

The last thing is important here. The problem with most database applications is that they are inflexible. Users are not allowed to add another column just to put a small note next to a data field. Or add a datum that is needed only in exactly three cases and no-one thought of it before yesterday. Or color up some particular entries and have it retained. Or million other different cases. They can't do this stuff on their own, without having to contact IT department or whoever, which makes the feedback loop much longer, seriously impacting productivity.

Yes, Excel sucks. But everything else sucks much harder, from user's point of view. Give them something as flexible as Excel, and they'll gladly switch.


How to give something as flexible as Excel that is not Excel?

I think that they'll gladly switch is a myth. From casual users I never heard "I sent you a spreadsheet", only "I sent you an Excel". And there is no incentive for them to get better educated in this regard.


> How to give something as flexible as Excel that is not Excel?

This is our problem; if we can't, then obviously Excel is the right tool for the job.

> I think that they'll gladly switch is a myth.

Ok, I think you're right. What I meant by "gladly" was more like "they won't be desperately trying to copy the data from the application to Excel and work the old way".

> Athere is no incentive for them to get better educated in this regard.

AKA. they won't change, because they have no reasonable reason too. So why do we (as developers) seem to expect them to?


Well, buried in your comment is the assumption that the alternative must be equally flexible, and I disagree.

Flexibility is often the cause of data loss at these companies, see "sorting a column but not the whole table" case for example.

People hate constraints, but constraints are exactly what we need to maintain data integrity. If convenience is more important, be my guest -- but I'll quit the moment they're asking me to fix an Excel table. (A bit of an exaggeration, but you get the point.)


There might be hope. Remember the days when people said "I'll Xerox that for you!" or "AIM me!" or when there was Word Perfect and Lotus?

I know it feels like a stretch after MS has dominated for so long, but if I could think of a viable alternative, I would ship it.


Yeah, the flexibility is huge. I actually think the cycle of guys in the trenches building an excel workflow, then having developers come in later and create a "real" db backed app as a replacement is not terrible.

It's similar to an anecdote I read from Larry Wall. At some brand new university, they did not make any paths through the quads. Instead they just waited a year, and paved over the dirt paths that had naturally formed.


Ha ha! Those non-programmers are so dumb! Good thing we're so much smarter and better than them!

(The irony of a three hundred word complaint about medium ignorance being rendered as a comic is the cherry on top.)


The only irony I see here is the medium ignorance you exhibit.

Who said comics are not supposed to be word-heavy? Comics are not just comedic panels or superhero stories.

Especially in the European comics tradition, but in other cultures too, including alternative US comics, all kinds of ratios between words/pictures have been used to great effect.


Yes. But just talking heads is silly.


Well, he's no professional. He got the point across nicely.


Comics not allowed to have words as the primary focus? Try reading Subnormality at some point. http://www.viruscomix.com/subnormality.html


Or Dinosaur comics, http://www.qwantz.com/index.php. Definitely words as primary focus, since the artwork never changes.


MSFT guy here who relates. As a small adjunct project to http://pytools.codeplex.com, we developed some bindings so you can sidestep VBA and use Python with Excel. It works well for what it does. Here's a short video overview: http://www.youtube.com/watch?v=Oi3QKuFugWk . It's OSS/Apache. You don't need VS to use it.


Hello smortaz,

Your post reminded me DataNitro[1] (previously IronSpread), a python-in-excel solution that was featured on HN nearly a year ago[2], and a YCombinator-backed company.

As far as I'm aware it seems closed-source and costly, while your solution is open-source and free. But how does it compare functionality-wise?

At my company we're using using Excel pretty often and python is growing, so using such a tool could be a really convenient.

1: https://datanitro.com/

2: https://news.ycombinator.com/item?id=4090337


Your view of the marketing department expressed in this strip is why they are likely to avoid involving you in their decisions. People can tell when others feel contempt towards them, and it is not a trait that attracts people. There is a reason Jimmy Fallon had a tech-guy skit on SNL for so long.


No, most marketing people would rather not involve anyone else in their decisions. And as we're using comedians as our oracles, check out what Bill Hicks had to say about people in marketing and advertising. It may surprise you, but that is far from an uncommon sentiment. Marketing is, in too many cases, indistinguishable from manipulation, and IMO, worthy of contempt.


Excel as a database, comic strip as a vehicle for 400 words of prose. It's amazing how tools can be adapted.


I believe you mean "abused" not adapted.

I spent most of last week cleaning up and replacing an email-the-Excel-file-around "solution" with one out-of-the-box sharepoint list. Fortunately, the change was mandated by someone with the knowledge to see that the Excel based "solution" was a horrible idea, and the power to enforce the replacement's use. Unfortunately, the data is now at the mercy of sharepoint, at least that problem rests on others.


I don't get why Excel specifically gets so much bashing. It is an excellent tool, a very versatile tool. The problem lies with the users, as always. This hasn't changed. It's just that the customer base is enormous compare to other tools. Excel can do marvels in the right hands. I could use something else because I'm literate with a larger set of tools than my peers in the actuarial world, but again, to what point if I'm the only one who can reproduce what I did. In Excel, it's pretty easy to track back what was done. Excel strength lies in its ability to empower me with tool to produce a reproducible report in a matter of minutes from database output to presentation. Stop complaining about Excel and start teaching to people how to use it properly.


(rant

My training session begins and ends with "Stop using Excel."

because Excel sucks. Blaming "the user" because they don't how to use it properly is no excuse for this bloated piece of awful software. I've been using and abusing Excel for decades and there are too many things that are wrong with Excel. Here are three of my major beefs...

- cutting/copying and pasting, I think I need to press enter to make that rectangle go away.

- sort. Oops, you only sorted the first column, but you didn't notice that, and a week later you notice something isn't quite right with "the data"

- so many things from a UI experience that could be dead simple, but feel like advanced gymnastics

LibreOffice brings no improvement, it simply duplicates the terrible experience of using Excel. And unfortunately the innovations of spreadsheets from the very very distance past are dead: Visicalc, Lotus 1-2-3, Resolve, Wingz, and many others are long gone.

So no I won't stop complaining about Excel. This is a pain point and maybe there is an opportunity to create a killer spreadsheet app / web app that does not require someone to "train" "the user" "how to use it properly", but I am not holding my breath.

)


You bring up lots of good points, but your training would sound a lot better as "start learning Excel."

There is no software you can not shoot yourself in the foot with. There's no point in switching when people will certainly misuse other applications as well.


You are right. The only way to learn is by making mistakes.


I miss when Rory Blyth (original author, I don't know who is cutting and pasting it to their private website at wyorock.com) used to run a blog called http://www.neopoleon.com/

There was some great stuff on there. He stopped blogging in 2008 but it's archived.

http://web.archive.org/web/20080517080006/http://www.neopole...


The first panel reminded me that Access exists. While I've been forced to dive into it on occasion (usually to do excavate some buried data), I can't say I've ever used it for anything productive (unlike Excel).

What's a good & defensible use case for Access? It always felt like a worst-of-both-worlds package both in terms of functionality as well as UI / UX.


I've seen Access used decently as well as abused.

The good: it is actually not a bad tool if you're familiar with database concepts. It even has some nice features, like being able to see a graph of all the pk relationships between tables (and use that graphical tool to create relationships and rules). You can seamlessly switch between list view (rows)/design view (GUI aid)/SQL when you're working on queries, allowing you to test things quickly and get rapid feedback at different levels of detail. It has a not-completely-insane tool for generating custom reports from tables or queries. It is extensible with VBA (trust me, I know -- that's not really a ringing endorsement). And like many MS products it has wizards for common tasks that make them surprisingly painless. And above all, because Access is part of the office suite, a regular user might actually have the program installed and be allowed to use it in a BigCorp or Government setting.

All that being said... Access is abused just like Excel by people who don't really know what it's for. What's worse, you get monstrosities made by people with just enough knowledge of databases to make them dangerous but without the wisdom to consider normalization, maintainability, or anything like that. I have so often been asked to "take a look at" an Access database that was thrown together 10+ years ago by a mildly computer savvy amateur that is still used in production, and without exception the insides are horrifying. (Side note: usually I have had to fight my instinct to take these on as a pet project because nothing good can possibly come of it, and as soon as you change it you own it).

TL;DR -- You already know shit. You might actually be surprised how not-bad Access is. Sadly you are not the person who is using Access in 99.9999% of cases.


I've been able to make some pretty cool things with Access as the backend, and Excel as the frontend. Add a little VBA/SQL and you're looking like a rockstar. Access and Excel provide a versatile infrastructure to quickly prototype different and effective ways of using/presenting your data.


This is a great point too. Excel is hard to lock down and easy for non-computer people to use. Access is easy to lock down and relatively harder to use.

If you know what you're doing, you can use Access (scary program that people don't really know what it's for) to lock down the data and enforce business rules and that kind of thing, then give coworkers Excel spreadsheets with pivot tables from an ODBC connection to the Access db that they can "do their thing" and mess with and email around.

The true worst of both worlds though is when somebody creates an amateur Access db, locks it down so you have to do everything through a 1995-Visual-Basic-looking switchboard, creates horrendous forms with garish colors and giant bitmap images that have no coherent UI... and ... I can't even go on, these are too painful to remember.


I honestly would never recommend Access as a user frontend. It's amazing how quickly Access userforms become convoluted and confusing. The only "FE" development I've done with Access is simple database maintenance tools usually only used by the developer(me) or a trained maintainer.

Also, most people are comfortable with Excel. Access userforms can scare the crap out of some users, but they're able to manipulate Excel just fine.

I was able to whip up a tool for my boss where his direct reports could log the time they spend on a particular project each day (ridiculous, I know). It's a simple Excel spreadsheet with Excels built in calendar selector, and two columns: Project and hours. Clicking a button writes to an Access database, which my boss can now pull the data straight into Excel with a couple canned reports. No one ever sees anything but Excel. I get that this may not be ideal but: 1. Took a morning to get to production 2. Quick user uptake because they're already comfortable with the system 3. Gets the job done, and my boss can still mess around with the numbers in excel all he likes

So there are use cases.

Another one that I've used successfully is utilizing Access as a middle-man to join two discrete systems within a corporation by using the Import Linked Table feature and building a join query. This way, Access does the heavy lifting of mashing two separate datasets together, allowing users to understand relationships instead of spending time trying to jam lines of data side-by-side.

This comment got long...sorry.


I started cracking-up until I realized that I'm going through that right now. Now I'm just sitting here, silent. I need a drink.


Actually, I would still love to have a setup where Excel is the front end with a database it can query and manipulate.


Excel does this with it's built-in data sources. Typically it can connect to any ODBC data source, which includes pretty much any SQL database and most proprietary databases (although in some cases you'll need to buy ODBC drivers. Yuck)

Then, bingo! You've got real, structured, data available in Excel so you can run Pivot Tables, build charts, and filter data to your hearts content. I use it frequently to build read-only data views for people who want to analyse their data in ways they don't know how to do using more native tools.

This microsoft post is for Office 2007, but applies equally in newer versions: http://office.microsoft.com/en-au/excel-help/connect-to-impo...


I used to work for a hospital that used this strategy for almost all of its internal reporting (I was contracted to ease the transition to Business Objects); it can get out of hand quickly.

Every morning between 8:45 and 9AM, doctors and administrators would come in, fire up Excel and refresh their data, the slowdown on the network was noticeable (pretty much a 'select *' on a massive de-normalised reporting tables hundreds of times throughout the site).

The flexibility that Excel offered some (not-too-technical) power users in those circumstances was fantastic, though, as much as I hate to admit it.


Yep, it's got it's limits - I would only use this in production with carefully crafted queries to try too counter this problem.

My usual use is against MS CRM data, which puts a 10k line limit on by default. First instinct for many people is to disable the limit, but in reality it's a good fail safe!


By the way: in MS SQL Server, one can also do the reverse: use Excel spreadsheets as the backing store for SQL Server queries (http://msdn.microsoft.com/en-us/library/ms141683.aspx).

It is not easy to set up, but way easier than using csv for getting raw data from Excel into SQL Server, or for running a clean-up query on data in Excel.


You sir, are a godsend. This is wonderful. Thank you.


Is it read only or can it be read/write?


Depends on the data source, I believe. I think some can be RW however I never use it - allowing users to update data from Excel isn't something I'm keen on, other than possibly as a sysadmin task.


For a quick and dirty approach you can use .iqy query files.

It's a really basic format, just a text file with a URL to open and options defined by embedding arrays in the params string. e.g. POST Example.com/get_data?first_param=["parameter name","parameter description"] And then a few variables describing the format of the data to return - it'll parse an HTML table and csv's/tab separated values but that's about it, but it's better than the copy paste loop hell.

You can then add the query file as an external data source and pass parameters to queries to a web interface through excel - cell values or user inputs. It's cross platform back to at least osx office 2003. They're a little clunky, but pretty painless to setup and the only mac compatible system I've found. Excel<->Web data integration is something I've been weirdly interested in for awhile - if anyone has questions feel free to email me (its in hn profile)

Shameless plug: I wrote excel-data, a ruby gem that generates iqy files for excel, formats ruby hashes in a format it can interpret, and parses excels bizarre date format. It's pretty basic and my first OSS project, but gives you an idea of the format.


I don't remember exactly how, but yeah, you can totally do that. Set up your DB as a data source or some such.


You can add many different data sources (any ODBC driver), access, sharepoint sites, etc. Its a pretty useful tool for people who are not programmers and rely on PivotTables and such.

Its surprising how effective excel can be for non-programmers.


Made sense until he was supposed to get a Pomeranian in the last panel.


This is probably just a cultural misunderstanding. In many European corporations, it's common for dogs to be given as awards for exceptional work. Pomeranians are, of course, the highest honor to receive.


Could you care to clarify which countries in Europe do that? Never heard of that before. Just to clarify something: the OP appears to be Rory Blyth, ex-Microsoft employee, living in Portland, OR, and American if I'm not mistaken.


(mistercow was joking.)


Haha somewhat ironic that this post is on the front page at the same time as the story about ZeroCater which was started by a guy with no programming experience, using Excel :)


As of Excel 2013 it has more database like capabilities (relationships between tables, combine multiple data sources).

http://office.microsoft.com/en-us/excel-help/powerpivot-powe...


I've always thought that a spreadsheet/db hybrid would be an interesting idea for a startup, but surprisingly none of them ever took off:

DabbleDB had a great UI http://www.eweek.com/c/a/Database/Dabble-DB-Bridges-Database...

http://www.ragic.com https://www.zoho.com/creator

something tells me that the incentives for paying for this kind of software aren't aligned with the management. it's just easier to push crap data down to "IT guys" to fix then look for a real solution. would love to be proven wrong.


I appreciate the humour in this. This is currently my situation in life. Old businesses are horrible when it comes to managing data.

If it wasn't set up for it in the design or a bit further modified later on after realizing how rediculous it may be to change anything then it has failed beyond miserably.

This is the real reason behind the crazy huge amount of data formats there are: xml, obj dump, json etc

And people always think lets fix it by adding a new one!! In general xml and json are ones that made sense and the ones we keep continuing to make aren't really improving on much.


This reminds me of an obviously perfectly logical request a coworker came to me with once.

Even though our CRM automatically emailed our customers a copy of their invoice once their order was taken, she wanted to print out the invoice, scan it back to her PC as a PDF, then email that PDF to the client "just to make sure." Nevermind the fact that the CRM had a mechanism specifically designed for manually emailing an invoice to the customer already built in in the event that they didn't get the invoice the first time.


My last job featured a database written by me, which had to be done because there was no way to pull proper data out of the company database.

The problem was that there was several people over several years who used the company database and they all had their own silly opinions about how to update and enter information. There was a few things that were firmly set by the system. For example, once you entered any information on the UI, the letters auto-capped so the data had some integrity.

The biggest problem was the product entries. The logical way to create the UI would be:

SKU:

Color:

Product Name:

etc.

This is how it actually was:

SKU:

Product Name:

And whilst there was a color field, it was buried so deeply in the morass of entry fields that I didn't even know they were there for the first 6 months I worked there.

So, the data was entered like this:

123-BLUE

123-RED

123-BLACK

123-BLK

123-BLU

etc....

I come in and am asked to compute sales history by SKU, Color, Product Type, and all sorts of ugly things. As you can see, there was no way to even get things by SKU, much less by any other important metric. It didn't help that it was all GUI and no one had access to the raw SQL.

This was the time that Excel came to the rescue in spades. I exported everything off the database, pushed everything into CSV and hand-separated all of the data into something that I could use, then pushed it into Postgres on my machine.

Moral of the story for me: Yeah, Excel isn't the best thing, but a poorly designed database system made by a half-baked company with 15 years of rot creates a nightmare situation as well, and at least in this experience, having the ability to examine the raw data, do find-replaces in wordpad, and other odd ends like that was the only thing that saved me and my job.

After I had left, I was asked to return for a week to encode my job to the best of my ability. There was just no way to ensure that the uploads had the correct integrity considering the size and time-constraints of the project.

I knew it wouldn't go well because my replacement was partly tasked with helping me clean the data for the next importation, which I tasked her to do since she kindly deleted the entire database that I spent the prior year building up. I told her that x, y, z, etc, had to be done manually and she couldn't understand the simple concept that you cannot navigate an Excel document with a mouse, so she kept handing me broken files and blowing up the imports.

I'm sure you can figure out what happened next: first time she uses it, despite stupid-simple step-by-step instructions, she destroyed the database and I sat at home tearing my hair out on how to import all the data back in. I really didn't have time to get this part perfect. I just docked it as a non-feature I knew I'd regret later and hoped against hope.

Companies should never ever hire people to do anything semi-technical that has anything to do with tech.


Why were they able to destroy your database repeatedly? Where there no backups? I can imagine they might not understand, but wouldn't you have sneaked some backups in a corner some where?


Yeah, I had some CSVs backed up in a dark corner they couldn't touch the first time, but I didn't have a full dump because, honestly, I thought they'd never use it again, and it was highly unlikely they'd ever find someone who can do SQL after I left.

The second time was some messy issue with encoding so the backups nor CSVs wouldn't load in. After trying absolutely everything I could find, I just build a script that wrapped all the backed up data in INSERT clauses and sent a raw .sql file over. I couldn't physically go in at the time, so I was trying to be phone support on my own trashy product.


The most important tool is the one that people use.


I'm probably guilty of doing the opposite, using a database when a spreadsheet might have been more appropriate.


If you industry folks knew the stuff we see in scientific software in academia...

It's scary. I complain about this all the time.


Handling bad and dirty values from business users is the bread and butter of the data cleansing industry. Cleaning user inputs is standard, if you expected to build a database application without having to properly clean user submitted records, you have sent yourself on a fools errand.


Unfortunately, the vast amounts of medical research data is collected and stored in Excel "databases". Yes, you heard that right - Excel. The very studies that we use to inform clinical practice that we do when you get sick and come into hospitals are pretty summed up by that comic.


I have a friend who works on some of those medical research studies. They are required to use Excel. It is a giant pain because it is difficult to share the spreadsheet, there's constantly different versions, and data corruption issues.

All of the staff want to switch to an Access database that they can all access within the office (it would not be a complicated DB and they happen to know how to use Access).

But a senior manager on the project believes that the study board only approved Excel, when the study was originally approved years ago, so, therefore, they cannot now use Access.

This costs them tens of hours per week across the team.


Back in the early 90's I was doing medical research and they used Access and/or Excel in almost every study I was involved in. The amount of time spent cleaning up the data before it could be used was insane. I tried to get them to use a real database but it is hard to teach old dogs new tricks. Sounds like things have not changed in over 20 years.


This comic was featured in the volume "Best Software Writing vol. 1" by Joel Spolsky (of Joel on Software/Stack Overflow/bizarro language that compiles to PHP and VB fame), which is a nice collection of developer writing.


I've been working on this. It converts excel spreadsheets to a sqlite file. It can then serve the sqlite file as a flask app.

https://github.com/lucidguppy/conjuring


If you think this is limited to non-technical users, think again. Upon request for the zone file of some domain, some IT department of major telco sent me a screenshot of their windows-based DNS maintenance tool...


Excel is a great flat file database.

It's also a great way of formatting a small amount of data for human consumption.

Its not a database in the modern sense of the word, nor should be used as such.


I'd managed to repress the nightmares that came from living through scenarios just like those depicted in the comic. I guess I'll just stay up all night tonight!


This problem and its implication to the finance sector were the genesis of our startup: Addepar.


We're wrangling with this in our organization. I looked at your site and there's no pricing information.


You can reach me via email in my profile if you need details about Addepar.


Why is the author using grave accents as apostrophes?


Because that's his handwriting. He is not a font.


At my previous job (a .edu), our "enterprise" systems still ran on the trusty, decades old mainframe. Batch jobs typically ran once daily and were written in JCL and getting data out required writing reports in RPG.

Until ~5 years ago, we had a staff member whose job duties involved nothing but using RJE to submit these "report jobs", wait for them to finish, print them off on the huge IBM printers, and deliver them to the appropriate people (who, the majority of the time, never even looked at them).

Coincidentally, it was our marketing department who came up with ideas they wanted to pursue but were unable to due to a "lack of data". The data they needed was actually there but getting it in a form they could work with was the issue. My boss, a man who, it seemed, was incapable of saying "no" to requests, told them we'd come up with something. Thus, the idea for the "Mother Of All Databases" (MOADB) was born.

Fast forward a few months and several meetings with other departments and we had a vision of what this database would look like. My boss, despite having never written a SQL query in his life, had decided to take it upon himself to "design the database". Because I knew the most about databases (which wasn't saying much), I became intimately involved in this project (against my better judgment and recommendations that we ditch the project).

What we ended up with was a database shared amongst tens of users across several different departments. This database consisted of, ultimately, dozens of tables (some stored in Access, some in MS SQL Server), and an MS Access front-end. The boss designed "forms" in Access for the end users to interface with. Remember how I said my boss had never written a SQL query in his life? Well, he also had no idea what normalization is. Several of the tables in the database had in excess of 100 columns (the record, last I remember checking, was 173 columns).

Because I had looked at a report written in RPG once, I was drafted to write a massive report that would pull all of this data out of our mainframe system. The resulting "data files" were CSV files that we had to get into the database. We hit some limit of the system and had to split the job up so that it would only run on a semester's worth of data at a time. The aforementioned staff member would run these jobs every morning, wait for them to finish, and download the data files.

To get the data into the database, CGI scripts were written. The staff member would navigate to a specific URL where he'd upload the CSV data files to the CGI scripts, which would parse the CSV files, and insert/replace the data in the database. An e-mail would then go out to the users of the database, letting them know that the "daily updates" had ran.

This database was ultimately used by many different departments for a variety of tasks: marketing, retention, financial aid, advising, testing, etc. It grew and grew and results to add new data to the database continued to come in (you know how I said my boss didn't like to say "no" to anyone? eventually this changed.). It contained records on tens of thousands of students (including lots of data that shouldn't be laying around in .mdb files on file servers).

Although it was being "phased out" as we had finally upgraded from our old mainframe-based system, the database still existed and was being used whenever I left about two years ago.

(Related note: we dealt with, on an almost daily basis, requests from users for "Excel databases". This comic really strikes a nerve with me!)

Edit: I forgot a step... Staff member would download data (CSV) files, open them in Excel, had to do something to them (can't remember what), save them as .xls files, and upload them to the CGI scripts (Perl, using Spreadsheet::ParseExcel, if memory serves).


nice comic




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: