Am I weird for preferring raw SQL over query abstractions like these ORMs expose? IMHO SQL is a perfectly fine language. I do however like letting a library do all the dirty work of turning SQL query output into objects. There's a good talk from this year's PyCon that shows SQLAlchemy's core library and how it gives a nice closer to raw SQL interface to a database: https://www.youtube.com/watch?v=0PSdzUxRYpA
It's an interesting question but I want to make a meta point.
I came here to read people's thoughts on the article but the top comment shifts the debate onto a wider topic.
This happens all the time on HN and is something I find slightly irritating if I'm actually interested in the topic itself. It's like that person at a dinner party who always changes the subject onto what THEY want to talk about.
I'm not blaming you - it's the fact that your post is the top one and the one most commented on that derails the original topic - which is a group decision.
It's not democracy, the top comment isn't the "only" comment as if the will of the majority chose it and that's all that exists in this thread, effectively suppressing minority influence. You have options to view other comments or start your own discussion if you please.
I think your point of contention is that other people, reading this very same article, approve of a comment that did the exact same thing you did: made a point about a "meta"-topic by forking the conversation into some other concept that was assumed when both of you were compelled to comment. Now we have three (your parent, you, and me) off-topic forks of a conversation that is superfluous to the so-called ubiquitous topic of the thread.
I see where you are coming from, but I find these "meta"-topics at times more interesting than the topic itself, hence why I'm commenting.
Interests are interests, your response is just as much justified IMO as your parent commenter.
So find, upvote, and participate in — or, if necessary, create — a thread that offers the kind of discussion you're hoping to see. Coming into a thread that's not the discussion you wanted, and being all, "But guys! We should be talking about $other_thing!" is an even more annoying dinner party guest, IMO.
In my defence - the $other_thing is the actual article though.
It's also that there are some predictable topics. Every post about a new Google product has someone bringing up the Reader shutdown and every post about ORMs has someone talking about how much they prefer raw SQL...
Unfortunately, a problem is that ORMs are goofy: taking a decent model (relational) and mutilating it into a mis-designed one (mainstream OOP). Because of the split between the programming vs. database worlds, where the programmers have the worse solution.
No, you are not weird. Yours is exactly my approach: Hand written queries with help from libraries for writing and, especially for retrieving data from the query back into the OO world.
My interpretation is that the relational and OO models are incompatible. Not 100% incompatible, but with enough dissimilarity to make writing a complete and efficient ORM a near impossible task.
Every ORM can easily do the simplest of mappings. Classes to tables, rows to instances is the very basic stuff. It is when you have complex data structures that the task of the ORM is close to impossible. Either you have a DSL for describing this mapping, or you fall back into the general language the rest of the program is written in.
I see little value in a DSL for describing relational to OO mappings, so I write the mapping in the general programming language already in use, hence I do not find any added value in ORMs.
This is the basis for iBATIS in the Java world. You provide the custom SQL and iBATIS knows how to wrap the results up into objects (as long as you provide the mapping).
So you're happy to generate the SQL via some language-specific API, and you're happy to have the results turned into objects...
What's the problem with an ORM again?
Half the point of SQLA is that the ORM is built on the core SQL generation. It's the least magical and most RDBMS-friendly ORM I've ever seen. It's designed for people who already grok SQL, not people who'd like to forget that it exists. You can even jam hand-written SQL into the ORM if you so please.
How do you do that without massively violating DRY? I've done tiny little personal projects where I use raw SQL, then eventually I find myself storing parts of queries into variables and concatenating them later, and it doesn't take long to realize I should just use a decent ORM.
It depends on the project. Quite often there simply isn't that much repetition.
On the flipside, how do you feel about slower performance? About diagnosing problems through multiple abstraction layers? Or about having different abstraction layers for different languages?
Stored procedures are great for solving some of that pain. For anything really serious you might want to go all the way and only use stored procedures so there's no ability/risk of your app code ever making raw SQL calls.
Stored procedures are a fucking pain. Now you've managed to split your logic in two completely different locations, and you get to enjoy rolling out new versions of the procedures in their own migration. Not to mention you're tied to your database.
On the other hand, ORMs can be a huge source of inefficiency if you don't look at the generated SQL and use lazy collection thoughtlessly, though they avoid a lot of boilerplate when saving a complex object graph.
A happy(?) medium is probably something with a powerful data retrieval API, which makes for composable queries (SQL query strings compose very badly).
In some environments, having the logic in the database is actually the better choice. When you are dealing with data that is in the million plus rows, there is more to lose from the separation of logic than just letting the database do what its good at while adding your logic.
All these discussions of which is better (ORM or straight SQL) should be put in context of the type of application, the amount of data and the complexity of the schema.
I have worked with systems which constrain all logic to the application server and when the data set is small it works fine, when its not, its a nightmare. Why should you bring down a million rows down to your application layer just so you can update a couple of columns. Doing this in the database as a set operation becomes trivial and fast.
No doubt you occasionally need to push logic down to the database, much in the same way that you sometimes need to uglify your code to make it fast enough, but in my experience, it's a good way to get a painful-to-manage codebase full of fantastic ideas such as "let's build this complex XML structure by concatenating strings".
Can you explain? I have no problem and have used or written my own SQL composition libraries pretty easily. It is code-composition, no doubt, but the BNF for SQL is very easy to grok even given the variations in platform.
Raw SQL is terrible when you need something a bit complicated. Want to do a complex search? Unfortunately, your coworker forgot an " AND " in a little-used function, which is going to blow up on you at runtime. Or maybe you didn't count the placeholders right and you inserted the wrong numbers values. Can you make that work for you with a library? Sure. But on its own, it's just bad.
On the other hand, SQLAlchemy's query API is quite nice, and composes just fine.
I personally share your tendencies. But what I'd say is that SQL Alchemy provides a great substrate API for doing exactly what you are talking about.
In those cases where you need to compose SQL, it can save you a ton of time with a SQL composition API (rather than building your own strings).
That is the basis of how SQL Alchemy builds up its ORM functionality. So if you are working on a project where you can save some time using the ORM, but prefer to work closer to SQL, you've got a great mix.
I prefer SQL as well. My theory is that it's because I started as a developer early enough that I had no choice but to learn SQL very well, and since I did, there's no motivation for me to add extra layers of abstraction/indirection to my code and deal with all the costs.
Personally, it depends. For doing anything somewhat complicated I usually prefer raw SQL - more compact, generally far more flexible, and I can trivially do dangerous crap like string interpolation. Translating the end result to objects is usually pretty simple. With a few exceptions, this has been for specific performance boosting hacks in bottlenecks, and site-wide statistics displays.
For anything simple, I vastly prefer ORMs. Joins and single selects are incredibly verbose and error-prone compared to "user.books" and "user.address" or the equivalent, and the added layer of abstraction means it's easier to modify the table without modifying code.
There's a tipping point. Eventually, you're likely to end up with half of a poorly-specified, bug-ridden implementation of an ORM if you don't use one at the start.
However, sometimes it's much simpler to just write custom SQL.
I'd say it's exactly the other way around: Eventually you end up being constrained by whatever ORM you're using, and you'll wish you had just used SQL from the beginning.
I will say, though, that SQLAlchemy is very nice for an ORM.