Arel, a composable relational algebra for Ruby

xal · on Jan 27, 2010

It's important to point out that Rails 3 is built on top of this. It was started as a Google Summer of Code project. It's an amazing piece of engineering.

dmnd · on Jan 27, 2010

So this is Linq for Ruby. Using this in place of ActiveRecord's querying API will be great.

One nice thing about Linq though is that most (all?) of the methods on IEnumerable have the same semantics as those on IQueryable. Ruby has a pretty nice Enumerable, but the method names in Arel don't match up. Map becomes project, select is where, etc.

It's convenient to be able to use the same syntax to query an in memory object collection that you use to query a database (or any other kind of abstract collection).

davidmathers · on Jan 27, 2010

Map and project aren't the same thing. Map can't be part of the algebra because it breaks the closure property. Project is a single, specific function.

I agree with your sentiment though. It looks like Nick took names from SQL where available ("where") and from relational algebra otherwise ("project"). I would rather the names came from Ruby first.

dmnd · on Jan 27, 2010

I don't know much (anything) about relational algebra or breaking closure properties, so I'm glad my sentiment was clear despite that error.

Is project different from map because it can only output a subset of the attributes that go in, whereas an arbitrary thing can come out of map?

davidmathers · on Jan 28, 2010

Exactly. Project is just an operator that takes a table and gives you a table with a different number of attributes. All the relational operations take and return table values (aka relation values), which is what makes the algebra closed under operations.

SQL select actually combines project and map (with select...as) by letting you define maps for the individual attributes of the table. That's fine since you can't break the table structure. You just can't map on tables or table rows.

pvg · on Jan 27, 2010

It looks interesting and clever (and a practical way to learn about relational algebra) but the reason similar interesting and clever solutions haven't become very popular over the years is that the field of applications where they have a significant advantage is very narrow. Often these are rapid-prototyping development tools and places where there's a need to generate very complex queries in a way compatible with different relational sources. Typically, though, the queries are not that complex and don't need to be generated dynamically so this sort of thing becomes just an exquisitely designed layer for more bugs to hide in.

fizx · on Jan 27, 2010

On the contrary, having a real data structure that represents the query you want your ORM to execute removes potential bugs! The alternative/old method of generating SQL was a bunch of kludgy string concatenation.
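
A rough sketch of what "a real data structure that represents the query" can look like, in plain Ruby. The `Query` struct and its methods are hypothetical and far simpler than Arel; conditions are raw strings here, so quoting and escaping are deliberately left out.

```ruby
# Hypothetical immutable query value; not part of Arel or ActiveRecord.
Query = Struct.new(:table, :conditions, :columns) do
  # Each builder method returns a new Query, leaving the receiver untouched.
  def where(condition)
    Query.new(table, conditions + [condition], columns)
  end

  def project(*cols)
    Query.new(table, conditions, cols)
  end

  # SQL text is produced in exactly one place, from the structure.
  def to_sql
    sql = "SELECT #{columns.empty? ? '*' : columns.join(', ')} FROM #{table}"
    sql += " WHERE #{conditions.join(' AND ')}" unless conditions.empty?
    sql
  end
end

users   = Query.new("users", [], [])
adults  = users.where("age >= 18")
listing = adults.where("active = 1").project("id", "name")

puts listing.to_sql
```

Because `adults` is a value, it can be reused and extended without worrying about whether a WHERE or an AND has already been appended to a string.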

pvg · on Jan 27, 2010

There's also the old/current method of writing SQL by hand. And you're going to have a hard time convincing me that adding another layer of code to anything usually removes bugs. More code = more bugs. Not always, but just about.

raganwald · on Jan 27, 2010

I wonder if that depends on your world view? If you are "thinking in SQL," adding a layer of abstraction forces you to think of SQL and then solve the puzzle of "Which Arel incantation produces the desired result?"

Whereas if you find a way to think in algebra, Arel implements the abstraction and removes the problem of "Which SQL incantation produces the desired result?"

You're right that every abstraction introduces some problems. It can be a win if the abstraction's mental model is congruent to your thinking or to the problem space.

pvg · on Jan 27, 2010

I think it's really more a matter of experience than a worldview. In a previous life, I used to work on a (commercial) product with similar capabilities. I'm also not the sort of database superhero that can just spew optimal SQL effortlessly. In fact, I do tend to think at a level closer to relational algebra.

I've just come to find such abstraction layers less useful in the typical web app. They seem to be more applicable in an enterprise app setting where portability is important, control over the schema might be limited or non-existent, and the performance and scalability requirements are more predictable. In a web app, the structural complexity of the data is often low or the data is not well-representable in relational form (note the rise in popularity of non-relational stores). Performance requirements are harder to predict. Adding a relational algebra abstraction layer in that context often doesn't add enough value to justify the increase in complexity - you still end up having to understand the entire depth of the stack while getting little benefit from the extra capabilities it offers beyond the warm fuzzy feeling of using something pretty neat.

mst · on Jan 27, 2010

Shame they appear to have confused how GROUP BY works. They're correct that -

SELECT users.*, count(photos.id) FROM users LEFT OUTER JOIN photos ON users.id = photos.user_id GROUP BY photos.user_id

won't do what they want. In fact, it isn't even a valid query outside of MySQL that I can think of.

What -will- do the right thing is:

SELECT users.id, users.name, count(photos.id) FROM users LEFT OUTER JOIN photos ON users.id = photos.user_id GROUP BY users.id, users.name

(which in the conceptually similar perl thing I'm working on at the moment you'd represent just by asking for a set of user objects with $user->photos->count eager loaded, but anyway ...)
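
The difference between the two groupings can be simulated in plain Ruby. This is only an in-memory sketch with made-up sample data, assuming the LEFT OUTER JOIN pairs each photoless user with a NULL (here `nil`) photo; no database or Arel call is involved.

```ruby
users = [
  { id: 1, name: "amy" }, { id: 2, name: "bob" },
  { id: 3, name: "cat" }, { id: 4, name: "dan" }
]
photos = [{ id: 10, user_id: 1 }, { id: 11, user_id: 1 }, { id: 12, user_id: 2 }]

# LEFT OUTER JOIN: every user appears at least once; users without photos
# are paired with nil, standing in for the NULL photo columns.
joined = users.flat_map do |user|
  matches = photos.select { |photo| photo[:user_id] == user[:id] }
  matches = [nil] if matches.empty?
  matches.map { |photo| { user: user, photo: photo } }
end

# GROUP BY users.id, users.name: one group per user.
# count(photos.id) skips NULLs, so only non-nil photos are counted.
counts = joined.group_by { |row| row[:user] }.map do |user, rows|
  user.merge(photo_count: rows.count { |row| row[:photo] })
end

# GROUP BY photos.user_id: all photoless users share the NULL key and
# collapse into a single group.
by_photo_owner = joined.group_by { |row| row[:photo] && row[:photo][:user_id] }
```

Grouping on the users columns keeps "cat" and "dan" as separate rows with a count of 0, whereas grouping on `photos.user_id` folds them into one NULL group.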

raganwald · on Jan 27, 2010

Avdi Grimm also pointed me to: http://github.com/dkubb/veritas