Use associated types for `SelectableExpression` #709

sgrif · 2017-02-15T15:57:49Z

Note: This PR rolls up commits from a few others. Review might be easier if it's done one commit at a time.

The SelectableExpression trait serves two purposes for us. The first and most important is to ensure that columns from tables that aren't in the from clause cannot be used. The second is to make columns on the right side of a left outer join nullable.

There were two reasons that we used a type parameter instead of an associated type. The first was to make it so that (Nullable<X>, Nullable<Y>) could be treated as Nullable<(X, Y)>. We did this because the return type of users.left_outer_join(posts) should be (User, Option<Post>), not (User, Post) where every field of Post is an Option.

Since we now provide a .nullable() method in the core DSL, I think we can simply require calling that method explicitly if you want that tuple conversion to occur. I think that the most common time that conversion will even be used is when the default select clause is used, where we can just handle it for our users automatically.

The other reason that we went with a type parameter originally was that it was easier, since we can provide a default value for a type parameter but not an associated type. This turned out to actually be a drawback, as it led to #104. This PR actually brings back aspects of that issue, which I'll get to in a moment.

It's expected that any expression which implements SelectableExpression<QS> have a T: SelectableExpression<QS> bound for each of its parts. The problem is, the missing second parameter is defaulting to T::SqlType, which means we are implicitly saying that this bound only applies for QS which does not change the SQL type (anything except a left outer join). This ultimately led to #621.
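To make the defaulting problem concrete, here is a minimal sketch of the old, type-parameter shape of the trait. It is a toy model, not Diesel's source: `Integer`, `Bool`, `PostsId`, `IsNull`, `LeftOuterJoin` and the two helper functions are simplified stand-ins invented for illustration.

```rust
use std::marker::PhantomData;

// Toy SQL types (stand-ins, not Diesel's real definitions).
pub struct Integer;
pub struct Bool;
pub struct Nullable<T>(pub PhantomData<T>);

pub trait Expression {
    type SqlType;
}

// Old shape: the SQL type seen by the select clause is a second type
// parameter that defaults to the expression's own SQL type.
pub trait SelectableExpression<QS, Type = <Self as Expression>::SqlType>: Expression {}

// Toy query sources.
pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(pub PhantomData<(L, R)>);

// A toy column. On its own table the default applies; on the right side
// of a left outer join the second parameter is spelled out as nullable.
pub struct PostsId;
impl Expression for PostsId {
    type SqlType = Integer;
}
impl SelectableExpression<Posts> for PostsId {}
impl SelectableExpression<LeftOuterJoin<Users, Posts>, Nullable<Integer>> for PostsId {}

// A toy composite expression. The bound `T: SelectableExpression<QS>`
// silently means `T: SelectableExpression<QS, T::SqlType>`, so it only
// holds for sources that leave the SQL type of `T` unchanged.
pub struct IsNull<T>(pub T);
impl<T: Expression> Expression for IsNull<T> {
    type SqlType = Bool;
}
impl<T, QS> SelectableExpression<QS> for IsNull<T> where T: SelectableExpression<QS> {}

// Compiles only when `E` is selectable from `QS` with the defaulted type.
pub fn selectable_with_default<QS, E>() -> bool
where
    E: SelectableExpression<QS>,
{
    true
}

// Compiles only when `E` is selectable from `QS` with the explicit type `ST`.
pub fn selectable_as<QS, ST, E>() -> bool
where
    E: SelectableExpression<QS, ST>,
{
    true
}
```

In this model `selectable_with_default::<Posts, IsNull<PostsId>>()` compiles, but the same call with `LeftOuterJoin<Users, Posts>` as the source is rejected, because `PostsId` is only selectable from the join as `Nullable<Integer>`. That rejection is the implicit "SQL type doesn't change" constraint described above.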

However, with our current structure, it is impossible to fix #621 without re-introducing at least some aspects of #104. In #104 (comment) I said that we didn't need to worry about 1 + NULL, because we didn't implement add for any nullable types. However, I'm not sure I considered joins when I made that statement. The statement applied to joins previously because of that implicit "sql type doesn't change" constraint. This commit removes that constraint, meaning #104 will be back at least when the nullability comes from being on the right side of a left join.

I don't think this is a serious enough issue that we need to immediately address it, as the types of queries which would cause the issue still just don't happen in practice. We should come up with a long-term plan for it, though. Ultimately the nullability of a field really only matters in the select clause. Since any operation on null returns null, and you basically want null to act as false in the where clause, it doesn't matter there.

So one partial step we could take is to break this out into two separate traits. One for the "make sure this is valid given the from clause", and one for the "make this nullable sometimes" case and only constrain on the first one in the where clause. We could then re-add the "sql type doesn't change" constraint on the problem cases, which will bring back aspects of #621, but only for select clauses which is a smaller problem.
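A rough sketch of what that split could look like follows. The trait name `ValidForFromClause` and every other item here are hypothetical, chosen only for illustration; this is not a proposal for the final names or a copy of any Diesel code.

```rust
use std::marker::PhantomData;

// Toy SQL types and query sources (stand-ins for illustration).
pub struct Integer;
pub struct Nullable<T>(pub PhantomData<T>);
pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(pub PhantomData<(L, R)>);
pub type Joined = LeftOuterJoin<Users, Posts>;

pub trait Expression {
    type SqlType;
}

// Hypothetical trait 1: "this is valid given the from clause".
pub trait ValidForFromClause<QS>: Expression {}

// Trait 2: "this can be selected, and here is its possibly nullable type".
pub trait SelectableExpression<QS>: ValidForFromClause<QS> {
    type SqlTypeForSelect;
}

pub struct UsersId;
pub struct PostsId;
impl Expression for UsersId {
    type SqlType = Integer;
}
impl Expression for PostsId {
    type SqlType = Integer;
}
impl ValidForFromClause<Joined> for UsersId {}
impl ValidForFromClause<Joined> for PostsId {}
impl SelectableExpression<Joined> for UsersId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<Joined> for PostsId {
    type SqlTypeForSelect = Nullable<Integer>;
}

// A toy composite expression.
pub struct Add<L, R>(pub L, pub R);
impl<L: Expression, R: Expression> Expression for Add<L, R> {
    type SqlType = L::SqlType;
}

// Usable in a where clause whenever both parts are valid for the source.
impl<L, R, QS> ValidForFromClause<QS> for Add<L, R>
where
    L: ValidForFromClause<QS>,
    R: ValidForFromClause<QS>,
{
}

// Selectable only if neither part changed SQL type on this source
// (the re-added "SQL type doesn't change" constraint).
impl<L, R, QS> SelectableExpression<QS> for Add<L, R>
where
    L: SelectableExpression<QS, SqlTypeForSelect = <L as Expression>::SqlType>,
    R: SelectableExpression<QS, SqlTypeForSelect = <R as Expression>::SqlType>,
{
    type SqlTypeForSelect = <L as Expression>::SqlType;
}

// Stand-ins for `filter` and `select`: each only checks its trait bound.
pub fn usable_in_filter<QS, E: ValidForFromClause<QS>>(_expr: &E) -> bool {
    true
}
pub fn usable_in_select<QS, E: SelectableExpression<QS>>(_expr: &E) -> bool {
    true
}
```

With these impls, `Add(UsersId, PostsId)` passes `usable_in_filter` for the joined source, while passing it to `usable_in_select` fails to compile because `PostsId` became nullable. `Add(UsersId, UsersId)` is accepted by both.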

I'm not sure if I ultimately want to go the two traits route or not. If nothing else, the problem cases are much more obvious with this commit. Anywhere that has type SqlTypeForSelect = Self::SqlType is likely a problem case when joins are involved. This will make it easier to find all the places to apply a solution when I come up with one that I'm happy with.
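As an illustration of why `type SqlTypeForSelect = Self::SqlType` marks a problem case, here is a small self-contained model of the associated-type design. The columns, the `Add` expression and the `select_type_is` helper are assumptions made up for this sketch, not Diesel's actual items.

```rust
use std::any::TypeId;
use std::marker::PhantomData;

// Toy SQL types and query sources (stand-ins for illustration).
pub struct Integer;
pub struct Nullable<T>(pub PhantomData<T>);
pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(pub PhantomData<(L, R)>);

pub trait Expression {
    type SqlType;
}

// New shape: the SQL type seen by the select clause is an associated type.
pub trait SelectableExpression<QS>: Expression {
    type SqlTypeForSelect;
}

pub struct UsersId;
pub struct PostsId;
impl Expression for UsersId {
    type SqlType = Integer;
}
impl Expression for PostsId {
    type SqlType = Integer;
}
impl SelectableExpression<Posts> for PostsId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<LeftOuterJoin<Users, Posts>> for UsersId {
    type SqlTypeForSelect = Integer;
}
// Right side of a left outer join: the column becomes nullable.
impl SelectableExpression<LeftOuterJoin<Users, Posts>> for PostsId {
    type SqlTypeForSelect = Nullable<Integer>;
}

// A toy composite expression that ignores what its parts report.
pub struct Add<L, R>(pub L, pub R);
impl<L: Expression, R> Expression for Add<L, R> {
    type SqlType = L::SqlType;
}
impl<L, R, QS> SelectableExpression<QS> for Add<L, R>
where
    L: SelectableExpression<QS>,
    R: SelectableExpression<QS>,
{
    // The problem case: nullability of the parts is not propagated.
    type SqlTypeForSelect = Self::SqlType;
}

// Reports whether selecting `E` from `QS` yields the SQL type `Expected`.
pub fn select_type_is<QS, E, Expected>() -> bool
where
    E: SelectableExpression<QS>,
    <E as SelectableExpression<QS>>::SqlTypeForSelect: 'static,
    Expected: 'static,
{
    TypeId::of::<<E as SelectableExpression<QS>>::SqlTypeForSelect>()
        == TypeId::of::<Expected>()
}
```

In this model `PostsId` on its own is reported as `Nullable<Integer>` when selected from the join, but `Add<UsersId, PostsId>` is reported as plain `Integer`, which is the `1 + NULL` hole from #104 showing up again.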

One last note about this is that it makes BoxableExpression that much less ergonomic to use, since now two associated types have to be specified instead of one. There's not much we can do about that (it might be helped by the two trait path), and I'm really not sure if that type is even useful since we now have the boxed queries DSL.

Fixes #621.

While working on #621, I noticed that these impls were incorrect and could be used to compile an incorrect query. I've corrected the impls and added the appropriate compile-fail test. I'm not sure if this was just an oversight or if I intentionally did this to avoid nullability somewhere. The latter is no longer relevant since we always make these expressions nullable now.

The phantom data was just plain unnecessary. It was either an oversight or a bug in older rust versions where `SqlType=Array<ST>` didn't count as the type being constrained. The `HasSqlType` constraints are sufficiently covered elsewhere (and frankly, I'm fairly certain that trait is useless and can be removed). It should be noted that we don't have a compile-test covering that case though, as pg is the only backend with additional types.


killercup

Wow, there's a lot of stuff in here. Changing SelectableExpression is a lot of churn, and a lot of compromises, but your reasoning is sound. Changing this now and having #621 while knowing the drawbacks (I trust you when you say these cases are rare) sounds good. By the way, do you think it'd make sense to actually add tests to document/assert the expected, broken behavior?

killercup · 2017-02-15T21:34:02Z

diesel/src/expression/mod.rs

+/// Indicates that an expression can be selected from a source. The associated
+/// type is usually the same as `Expression::SqlType`, but is used to indicate
+/// that a column is always nullable when it appears on the right side of a left
+/// outer join, even if it wasn't nullable to begin with.
 ///
 /// Columns will implement this for their table. Certain special types, like
 /// `CountStar` and `Bound` will implement this for all sources. All other
 /// expressions will inherit this from their children.


inherit this from their children

Only noticed this now. Nicely put, will make people used to classical OOP with inheritance nervous, though 😄

killercup · 2017-02-15T21:35:05Z

diesel_compile_tests/tests/compile-fail/aggregate_expression_requires_column_from_same_table.rs

+    let source = users::table.select(max(posts::id));
+    //~^ ERROR E0277
+    let source = users::table.select(min(posts::id));
+    //~^ ERROR E0277


I'd add `let source = users::table.select(sum(users::id));` (with no error) to prevent false positives.

killercup · 2017-02-15T21:43:04Z

diesel/src/expression/aliased.rs

@@ -20,9 +20,6 @@ impl<'a, Expr> Aliased<'a, Expr> {
    }
 }

-#[derive(Debug, Copy, Clone)]
-pub struct FromEverywhere;


Huh, was this ever used for anything?

It used to be, yeah. I don't know when it stopped being used.

killercup · 2017-02-15T21:43:46Z

diesel/src/expression/aliased.rs

@@ -57,7 +54,9 @@ impl<'a, T> QueryId for Aliased<'a, T> {
 // FIXME This is incorrect, should only be selectable from WithQuerySource


Still true?

Still true.

killercup · 2017-02-15T22:00:36Z

diesel_tests/tests/joins.rs

+        ("Sean".to_string(), Some("Hello".to_string())),
+    ];
+    let source = users::table.left_outer_join(posts::table)
+        .select((users::name, posts::title))


sgrif · 2017-02-15T23:30:45Z

By the way, do you think it'd make sense to actually add tests to document/assert the expected, broken behavior?

An example broken case is `users::table.left_outer_join(posts::table).select(users::id + posts::id)`. I'm not sure that it's worth adding a case to the repo for it.

The change in #709 had the side effect of re-introducing #104. With the design that we have right now, nullability isn't propagating upwards. This puts the issue of "expressions aren't validating that the type of its arguments haven't become nullable, and thus nulls are slipping in where they shouldn't be" at odds with "we can't use complex expressions in filters for joins because the SQL type changed".

This semi-resolves the issue by restricting when we care about nullability. Ultimately the only time it really matters is when we're selecting data, as we need to enforce that the result goes into an `Option`. For places where we don't see the bytes in Rust (filter, order, etc), `NULL` is effectively `false`.

This change goes back to fully fixing #104, but brings back a small piece of #621. I've changed everything that is a composite expression to only be selectable if the SQL type hasn't changed. This means that you won't be able to do things like `users.left_outer_join(posts).select(posts::id + 1)`, but you will be able to use whatever you want in `filter`.

This change is also to support what I think will fix the root of all these issues. The design of "Here's the SQL type on this query source" is just fundamentally not what we need. There is only one case where the type changes, and that is to become null when it is on the right side of a left join, the left side of a right join, or either side of a full join.

One of the changes that #709 made was to require that you explicitly call `.nullable()` on a tuple if you wanted to get `Option<(i32, String)>` instead of `(Option<i32>, Option<String>)`. This has worked out fine, and isn't a major ergonomic pain. The common case is just to use the default select clause anyway.

So I want to go further down this path. The longer term plan is to remove `SqlTypeForSelect` entirely, and *not* implement `SelectableExpression` for columns on the nullable side of a join. We will then provide these two blanket impls:

```rust
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<T> where
    T: SelectableExpression<Right>,
{}

impl<Left, Right, Head, Tail> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<Cons<Head, Tail>> where
    Nullable<Head>: SelectableExpression<LeftOuterJoin<Left, Right>>,
    Nullable<Tail>: SelectableExpression<LeftOuterJoin<Left, Right>>,
{}
```

(Note: Those impls overlap. Providing them as blanket impls would require rust-lang/rust#40097. Providing them as non-blanket impls would require us to mark `Nullable` and possibly `Cons` as `#[fundamental]`)

The end result will be that nullability naturally propagates as we want it to. Given `sql_function!(lower, lower_t, (x: Text) -> Text)`, doing `select(lower(posts::name).nullable())` will work. `lower(posts::name)` will fail because `posts::name` doesn't impl `SelectableExpression`. `lower(posts::name.nullable())` will fail because while `SelectableExpression` will be met, the SQL type of the argument isn't what's expected. Putting `.nullable` at the very top level naturally follows SQL's semantics here.
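The `lower(posts::name)` example can be modelled in a few lines. This sketch uses only the first of the two blanket impls (so there is no overlap to worry about), and every item in it is a toy stand-in: `NullableType` is a made-up name for the SQL-level nullable type, `Nullable` here is the expression wrapper that `.nullable()` would produce, and `selects_as` is a hypothetical helper.

```rust
use std::any::TypeId;
use std::marker::PhantomData;

// Toy SQL types (stand-ins for illustration).
pub struct Text;
pub struct NullableType<T>(pub PhantomData<T>);

// Toy query sources.
pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(pub PhantomData<(L, R)>);

pub trait Expression {
    type SqlType;
}
// No `SqlTypeForSelect` any more: the trait is a plain marker.
pub trait SelectableExpression<QS>: Expression {}

// `posts::name` is selectable from `posts`, but deliberately NOT from
// the join, where it sits on the nullable side.
pub struct PostsName;
impl Expression for PostsName {
    type SqlType = Text;
}
impl SelectableExpression<Posts> for PostsName {}

// Toy `lower`: takes a `Text` argument and returns `Text`.
pub struct Lower<T>(pub T);
impl<T> Expression for Lower<T>
where
    T: Expression<SqlType = Text>,
{
    type SqlType = Text;
}
impl<T, QS> SelectableExpression<QS> for Lower<T> where
    T: SelectableExpression<QS> + Expression<SqlType = Text>
{
}

// Expression wrapper standing in for the result of `.nullable()`.
pub struct Nullable<T>(pub T);
impl<T: Expression> Expression for Nullable<T> {
    type SqlType = NullableType<T::SqlType>;
}
// First blanket impl from the plan: anything selectable from the right
// side becomes selectable from the join once it is wrapped.
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>> for Nullable<T> where
    T: SelectableExpression<Right>
{
}

// Compiles only if `E` is selectable from `QS`; reports whether its SQL
// type is `Expected`.
pub fn selects_as<QS, E, Expected>(_expr: &E) -> bool
where
    E: SelectableExpression<QS>,
    <E as Expression>::SqlType: 'static,
    Expected: 'static,
{
    TypeId::of::<<E as Expression>::SqlType>() == TypeId::of::<Expected>()
}
```

In this model `Nullable(Lower(PostsName))` is selectable from `LeftOuterJoin<Users, Posts>` with SQL type `NullableType<Text>`. `Lower(PostsName)` is rejected for the join because `PostsName` has no impl for it, and `Lower(Nullable(PostsName))` is rejected because the argument's SQL type is no longer `Text`.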

sgrif added 3 commits February 15, 2017 08:32

sgrif requested a review from killercup February 15, 2017 15:57

This was referenced Feb 15, 2017

Remove unused bounds and PhantomData from any and all #707

Closed

Ensure aggregate functions enforce the column is from the right table #706

Closed

killercup approved these changes Feb 15, 2017


sgrif merged commit 37c82db into master Feb 16, 2017

sgrif deleted the sg-selectable-expression-associated-type branch February 16, 2017 17:55

sgrif mentioned this pull request Feb 26, 2017

Split SelectableExpression into two traits #764

Merged

sgrif mentioned this pull request Feb 26, 2017

Merge Expression with SelectableExpression #6

Closed
