Hacker News new | past | comments | ask | show | jobs | submit login

"What do you think should happen if there are multiple foreign keys connecting the two tables? Should this also work for many-to-many relations with an intermediate table?"

If it's not ambiguous, then let me do it. If I rely on ambiguity then throw an exception. In the case of multiple foreign keys, throw an exception, as there's no way to know which one I mean. It'd be nice if I could disambiguate the situation though. Normal SQL allows the `on` clause.

  from TableA
  inner join TableB on <expression>
What if I could specify a foreign key constraint just as easily...

  from TableA
  inner join TableB by ConstraintC
Where ConstraintC is the name of a foreign key constraint between Table A and Table B. It'd be nice to specify the constraint without having to specify the column name details.

The same goes for the many to many relationship with an intermediate table. It could look something like this...

  from TableA
  inner join TableB through TableC
I wouldn't introduce TableC into the scope of the statement. It's not in the FROM clause. It's used in the query but is not available for selecting from. If you want to bring in columns from it, join on it the usual way.

As applications grow, and initially simple lookup table semantics get more nuanced, it might be nice to be able to constrain the join on the lookup table like this...

  from TableA
  inner join TableB through TableC where <expression>
That way if my TableC has some extra columns, such as effective dates, or deleted flags, or that sort of thing, then I can filter out some of the joins that might usually happen.



Unambiguous things can become ambiguous at later points. As soon as you add a second relation between the tables, what once was unambiguous now is, and because of something which may be entirely unrelated to the specifics of the original query.

This is where many conveniences that use implicit data run into problems. A small convenience now for the possibility of accidentally breaking because of mostly unrelated changes later is a poor trade off for anyone that wants to have stable and consistent software.

This is likely one of those cases where you're better off with tooling to help make writing the correct unambiguous code easier (or automated away) than introducing a feature which leads to less stable systems in some cases.

Edit: Along the lines of what you note at the end, I would rather see joins able to use named relations as defined in the schema. Of there's a relation from table movie to table actor specifically names roles in the schema, I would rather be able to join movie on roles and have actors joined correctly using that relation, and aliases to roles which I could then use. Then you're using features that are designed and stable and not implicit and subject to changing how or whether they function based on semi-unrelated changes.

That might look like: "from movie relate roles" which is equivalent to "from movie join actor roles on movie.id = roles.movie_id", but because actor.movie_id has a constraint in the schema named roles which restricts it to a movie.id already.


Agreed! But rather than making the query compiler infer join paths from the schema, wouldn't it make more sense to support defining common join paths (like your 'movie relate roles') in the language, and build a tool, that generates such definitions from the schema, as a separate step?

I don't have a specific syntax in mind yet; for illustrative purposes:

    defjoin r,m,a = %prejoin_roles() -> {     # define a common join path between three relations r,m,a:
      from r=ROLES                            # can hard-code table names or use parameters (which may refer to other parameters)
      join m=MOVIES [r.movie_id = m.movie_id]
      join a=ACTORS [r.actor_id = a.actor_id]
    }

    from r,m,a = %prejoin_roles()
    select m.title, a.character_name
This `defjoin` thing is a limited version of PRQL `table`, which -- unlike a CTE -- remembers which relation each attribute comes from. Perhaps one can instead figure out how to extend `table` to support this.


That sounds like the perfect solution!


One way to avoid constraint name collisions is to include the base table and foreign table names and keys in the constraint name separated by underscores, at which point you don’t save much by using the constraint in a join.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: