Use associated types for SelectableExpression
#709
While working on #621, I noticed that these impls were incorrect and could be used to compile an incorrect query. I've corrected the impls and added the appropriate compile-fail test. I'm not sure if this was just an oversight or if I intentionally did this to avoid nullability somewhere. The latter is no longer relevant since we always make these expressions nullable now.
The phantom data was just plain unnecessary. It was either an oversight or a bug in older Rust versions where `SqlType=Array<ST>` didn't count as the type being constrained. The `HasSqlType` constraints are sufficiently covered elsewhere (and frankly, I'm fairly certain that trait is useless and can be removed). It should be noted that we don't have a compile test covering that case though, as pg is the only backend with additional types.
The `SelectableExpression` trait serves two purposes for us. The first and most important is to ensure that columns from tables that aren't in the from clause cannot be used. The second is to make columns on the right side of a left outer join nullable.

There were two reasons that we used a type parameter instead of an associated type. The first was to make it so that `(Nullable<X>, Nullable<Y>)` could be treated as `Nullable<(X, Y)>`. We did this because the return type of `users.left_outer_join(posts)` should be `(User, Option<Post>)`, not `(User, Post)` where every field of `Post` is an `Option`.

Since we now provide a `.nullable()` method in the core DSL, I think we can simply require calling that method explicitly if you want that tuple conversion to occur. I think the most common place that conversion is even used is the default select clause, where we can just handle it for our users automatically.

The other reason that we went with a type parameter originally was that it was easier, since we can provide a default value for a type parameter but not for an associated type. This actually turned out to be a drawback, as it led to #104. This PR brings back aspects of that issue, which I'll get to in a moment.

It's expected that any expression which implements `SelectableExpression<QS>` have a `T: SelectableExpression<QS>` bound for each of its parts. The problem is that the missing second parameter defaults to `T::SqlType`, which means we are implicitly saying that this bound only applies for a `QS` which does not change the SQL type (anything except a left outer join). This ultimately led to #621.

However, with our current structure, it is impossible to fix #621 without re-introducing at least some aspects of #104. In #104 (comment) I said that we didn't need to worry about `1 + NULL`, because we didn't implement add for any nullable types. However, I'm not sure I considered joins when I made that statement. The statement applied to joins previously because of that implicit "sql type doesn't change" constraint. This commit removes that constraint, meaning #104 will be back, at least when the nullability comes from being on the right side of a left join.

I don't think this is a serious enough issue that we need to address it immediately, as the types of queries which would cause the issue still just don't happen in practice. We should come up with a long term plan for it, though. Ultimately the nullability of a field really only matters in the select clause. Since any operation on null returns null, and you basically want null to act as false in the where clause, it doesn't matter there.

So one partial step we could take is to break this out into two separate traits: one for the "make sure this is valid given the from clause" case, and one for the "make this nullable sometimes" case, and only constrain on the first one in the where clause. We could then re-add the "sql type doesn't change" constraint on the problem cases, which will bring back aspects of #621, but only for select clauses, which is a smaller problem.

I'm not sure if I ultimately want to go the two traits route or not. If nothing else, the problem cases are much more obvious with this commit. Anywhere that has `type SqlTypeForSelect = Self::SqlType` is likely a problem case when joins are involved. This will make it easier to find all the places to apply a solution when I come up with one that I'm happy with.

Fixes #621.
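To make the associated-type design concrete, here is a minimal, self-contained sketch. It is not Diesel's actual code: `Integer`, `Nullable`, `Users`, `Posts`, `LeftOuterJoin`, `UserId`, `PostId`, `Plus`, `SqlKind`, and `nullable_when_selected` are simplified stand-ins invented for illustration. It shows a column on the right side of a left outer join reporting a nullable type, and a composite expression with `type SqlTypeForSelect = Self::SqlType` silently dropping that nullability (the #104-style problem case described above).

```rust
use std::marker::PhantomData;

// Simplified stand-ins for SQL types (assumption: not Diesel's real definitions).
struct Integer;
struct Nullable<ST>(PhantomData<ST>);

// Hypothetical helper trait so the demo can ask whether a SQL type is nullable.
trait SqlKind {
    const NULLABLE: bool;
}
impl SqlKind for Integer {
    const NULLABLE: bool = false;
}
impl<ST> SqlKind for Nullable<ST> {
    const NULLABLE: bool = true;
}

// Simplified stand-ins for query sources.
struct Users;
struct Posts;
struct LeftOuterJoin<L, R>(PhantomData<(L, R)>);

trait Expression {
    type SqlType;
}

// Associated-type form: one SQL type per (expression, query source) pair.
trait SelectableExpression<QS>: Expression {
    type SqlTypeForSelect;
}

// Stand-ins for the columns `users::id` and `posts::id`.
struct UserId;
struct PostId;
impl Expression for UserId {
    type SqlType = Integer;
}
impl Expression for PostId {
    type SqlType = Integer;
}

// Columns are selectable from their own table with an unchanged type.
impl SelectableExpression<Users> for UserId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<Posts> for PostId {
    type SqlTypeForSelect = Integer;
}

// In a left outer join the left column keeps its type,
// while the right column becomes nullable.
impl SelectableExpression<LeftOuterJoin<Users, Posts>> for UserId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<LeftOuterJoin<Users, Posts>> for PostId {
    type SqlTypeForSelect = Nullable<Integer>;
}

// A composite expression standing in for `lhs + rhs`.
struct Plus<L, R>(PhantomData<(L, R)>);
impl<L: Expression, R: Expression> Expression for Plus<L, R> {
    type SqlType = L::SqlType;
}
// The problem pattern: the parts are checked against `QS`, but the
// nullability they may have picked up from the join is not propagated.
impl<L, R, QS> SelectableExpression<QS> for Plus<L, R>
where
    L: SelectableExpression<QS>,
    R: SelectableExpression<QS>,
{
    type SqlTypeForSelect = Self::SqlType;
}

// Reports whether selecting `E` from `QS` yields a nullable SQL type.
fn nullable_when_selected<E, QS>() -> bool
where
    E: SelectableExpression<QS>,
    E::SqlTypeForSelect: SqlKind,
{
    <E::SqlTypeForSelect as SqlKind>::NULLABLE
}
```

In this model, `nullable_when_selected::<PostId, LeftOuterJoin<Users, Posts>>()` is `true`, but the same query source with `Plus<PostId, PostId>` reports `false`, which is exactly the kind of case that searching for `type SqlTypeForSelect = Self::SqlType` is meant to surface.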
Wow, there's a lot of stuff in here. Changing `SelectableExpression` is a lot of churn, and a lot of compromises, but your reasoning is sound. Changing this now and having #621 while knowing the drawbacks (I trust you when you say these cases are rare) sounds good. By the way, do you think it'd make sense to actually add tests to document/assert the expected, broken behavior?
/// Indicates that an expression can be selected from a source. The associated
/// type is usually the same as `Expression::SqlType`, but is used to indicate
/// that a column is always nullable when it appears on the right side of a left
/// outer join, even if it wasn't nullable to begin with.
///
/// Columns will implement this for their table. Certain special types, like
/// `CountStar` and `Bound` will implement this for all sources. All other
/// expressions will inherit this from their children.
> inherit this from their children

Only noticed this now. Nicely put, will make people used to classical OOP with inheritance nervous, though 😄
let source = users::table.select(max(posts::id));
//~^ ERROR E0277
let source = users::table.select(min(posts::id));
//~^ ERROR E0277
I'd add `let source = users::table.select(sum(users::id));` (with no error) to prevent false positives.
@@ -20,9 +20,6 @@ impl<'a, Expr> Aliased<'a, Expr> {
    }
}

#[derive(Debug, Copy, Clone)]
pub struct FromEverywhere;
Huh, was this ever used for anything?
It used to be, yeah. I don't know when it stopped being used.
@@ -57,7 +54,9 @@ impl<'a, T> QueryId for Aliased<'a, T> {
// FIXME This is incorrect, should only be selectable from WithQuerySource
Still true?
Still true.
    ("Sean".to_string(), Some("Hello".to_string())),
];
let source = users::table.left_outer_join(posts::table)
    .select((users::name, posts::title))
🎉
An example broken case is |
The change in #709 had the side effect of re-introducing #104. With the design that we have right now, nullability isn't propagating upwards. This puts the issue of "expressions aren't validating that the types of their arguments haven't become nullable, and thus nulls are slipping in where they shouldn't be" at odds with "we can't use complex expressions in filters for joins because the SQL type changed".

This semi-resolves the issue by restricting when we care about nullability. Ultimately the only time it really matters is when we're selecting data, as we need to enforce that the result goes into an `Option`. For places where we don't see the bytes in Rust (filter, order, etc), `NULL` is effectively `false`.

This change goes back to fully fixing #104, but brings back a small piece of #621. I've changed everything that is a composite expression to only be selectable if the SQL type hasn't changed. This means that you won't be able to do things like `users.left_outer_join(posts).select(posts::id + 1)`, but you will be able to use whatever you want in `filter`.

This change is also to support what I think will fix the root of all these issues. The design of "Here's the SQL type on this query source" is just fundamentally not what we need. There is only one case where the type changes, and that is to become nullable when it is on the right side of a left join, the left side of a right join, or either side of a full join.

One of the changes that #709 made was to require that you explicitly call `.nullable()` on a tuple if you wanted to get `Option<(i32, String)>` instead of `(Option<i32>, Option<String>)`. This has worked out fine, and isn't a major ergonomic pain. The common case is just to use the default select clause anyway.

So I want to go further down this path. The longer term plan is to remove `SqlTypeForSelect` entirely, and *not* implement `SelectableExpression` for columns on the nullable side of a join. We will then provide these two blanket impls:

```rust
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<T>
where
    T: SelectableExpression<Right>,
{}

impl<Left, Right, Head, Tail> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<Cons<Head, Tail>>
where
    Nullable<Head>: SelectableExpression<LeftOuterJoin<Left, Right>>,
    Nullable<Tail>: SelectableExpression<LeftOuterJoin<Left, Right>>,
{}
```

(Note: Those impls overlap. Providing them as blanket impls would require rust-lang/rust#40097. Providing them as non-blanket impls would require us to mark `Nullable` and possibly `Cons` as `#[fundamental]`.)

The end result will be that nullability naturally propagates as we want it to. Given `sql_function!(lower, lower_t, (x: Text) -> Text)`, doing `select(lower(posts::name).nullable())` will work. `lower(posts::name)` will fail because `posts::name` doesn't impl `SelectableExpression`. `lower(posts::name.nullable())` will fail because, while `SelectableExpression` will be met, the SQL type of the argument isn't what's expected. Putting `.nullable()` at the very top level naturally follows SQL's semantics here.
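The longer term plan can be modelled in a few lines of plain Rust. This is only a sketch under assumptions, not Diesel's implementation: `Text`, `NullableType`, `Users`, `Posts`, `LeftOuterJoin`, `UserName`, `PostName`, `Lower`, `SqlKind`, and `selects_nullable` are hypothetical stand-ins, and only the first of the two blanket impls is included because the two overlap on stable Rust.

```rust
use std::marker::PhantomData;

// Simplified stand-ins for SQL types (assumption: not Diesel's real definitions).
struct Text;
struct NullableType<ST>(PhantomData<ST>);

// Hypothetical helper trait so the demo can ask whether a SQL type is nullable.
trait SqlKind {
    const NULLABLE: bool;
}
impl SqlKind for Text {
    const NULLABLE: bool = false;
}
impl<ST> SqlKind for NullableType<ST> {
    const NULLABLE: bool = true;
}

// Simplified stand-ins for query sources.
struct Users;
struct Posts;
struct LeftOuterJoin<L, R>(PhantomData<(L, R)>);

trait Expression {
    type SqlType;
}

// No `SqlTypeForSelect` any more: the trait only answers
// "may this expression be selected from `QS`?".
trait SelectableExpression<QS>: Expression {}

// Stand-ins for the columns `users::name` and `posts::name`.
struct UserName;
struct PostName;
impl Expression for UserName {
    type SqlType = Text;
}
impl Expression for PostName {
    type SqlType = Text;
}
impl SelectableExpression<Users> for UserName {}
impl SelectableExpression<Posts> for PostName {}
// The left side of the join stays selectable as-is.
impl SelectableExpression<LeftOuterJoin<Users, Posts>> for UserName {}
// Deliberately no impl of `SelectableExpression<LeftOuterJoin<Users, Posts>>`
// for `PostName`: columns on the nullable side are not selectable directly.

// Expression wrapper standing in for what `.nullable()` returns.
struct Nullable<T>(T);
impl<T: Expression> Expression for Nullable<T> {
    type SqlType = NullableType<T::SqlType>;
}
// First blanket impl from the plan: anything selectable from the right
// table becomes selectable from the join once wrapped in `Nullable`.
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>> for Nullable<T> where
    T: SelectableExpression<Right>
{
}

// Stand-in for `sql_function!(lower, lower_t, (x: Text) -> Text)`.
struct Lower<T>(T);
impl<T: Expression<SqlType = Text>> Expression for Lower<T> {
    type SqlType = Text;
}
impl<T, QS> SelectableExpression<QS> for Lower<T> where
    T: SelectableExpression<QS> + Expression<SqlType = Text>
{
}
fn lower<T: Expression<SqlType = Text>>(x: T) -> Lower<T> {
    Lower(x)
}

// Compiles only if `E` is selectable from `QS`; reports whether the
// selected SQL type is nullable.
fn selects_nullable<QS, E>(_expr: E) -> bool
where
    E: SelectableExpression<QS>,
    E::SqlType: SqlKind,
{
    <E::SqlType as SqlKind>::NULLABLE
}

// These two calls are rejected by the compiler in this model:
//   selects_nullable::<LeftOuterJoin<Users, Posts>, _>(lower(PostName));
//     (`PostName` is not selectable from the join)
//   lower(Nullable(PostName));
//     (the argument's SQL type is `NullableType<Text>`, not `Text`)
```

With these definitions, `selects_nullable::<LeftOuterJoin<Users, Posts>, _>(Nullable(lower(PostName)))` compiles and returns `true`, mirroring `select(lower(posts::name).nullable())` from the description, while the two commented-out calls correspond to the two failing cases.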
Note: This PR rolls up commits from a few others. Review might be easier if it's done one commit at a time.
One last note about this change: it makes `BoxableExpression` that much less ergonomic to use, since two associated types now have to be specified instead of one. There's not much we can do about that (it might be helped by the two-trait path), and I'm really not sure that type is even useful now that we have the boxed queries DSL.