Use associated types for SelectableExpression #709

Merged: sgrif merged 3 commits into master from sg-selectable-expression-associated-type on Feb 16, 2017

@sgrif sgrif commented Feb 15, 2017

Note: This PR rolls up commits from a few others. Review might be easier if it's done one commit at a time.

The SelectableExpression trait serves two purposes for us. The first and most important is to ensure that columns from tables that aren't in the from clause cannot be used. The second is to make columns which are on the right side of a left outer join nullable.

There were two reasons that we used a type parameter instead of an associated type. The first was to make it so that (Nullable<X>, Nullable<Y>) could be treated as Nullable<(X, Y)>. We did this because the return type of users.left_outer_join(posts) should be (User, Option<Post>), not (User, Post) where every field of Post is an Option.

Since we now provide a .nullable() method in the core DSL, I think we can simply require calling that method explicitly if you want that tuple conversion to occur. I think that the most common time that conversion will even be used is when the default select clause is used, where we can just handle it for our users automatically.
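
As a rough illustration of the tuple conversion discussed above: treating (Nullable<X>, Nullable<Y>) as Nullable<(X, Y)> amounts to collapsing a pair of optional fields into one optional pair. This is plain Rust over `Option`, not Diesel's deserialization code, and the helper name `nullable_tuple` is made up for this sketch.

```rust
/// Hypothetical helper: collapse per-field options into one optional tuple.
/// Assumption of this sketch: if any field is NULL, the whole tuple is `None`.
pub fn nullable_tuple<A, B>(row: (Option<A>, Option<B>)) -> Option<(A, B)> {
    let (a, b) = row;
    // `Option::zip` yields `Some((a, b))` only when both sides are `Some`.
    a.zip(b)
}
```

A row from the right side of a left outer join with no match has every field NULL, so it collapses to `None`, which matches the `(User, Option<Post>)` shape described above.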

The other reason that we went with a type parameter originally was that it was easier, since we can provide a default value for a type parameter but not an associated type. This turned out to actually be a drawback, as it led to #104. This PR actually brings back aspects of that issue, which I'll get to in a moment.

It's expected that any expression which implements SelectableExpression<QS> have a T: SelectableExpression<QS> bound for each of its parts. The problem is, the missing second parameter is defaulting to T::SqlType, which means we are implicitly saying that this bound only applies for QS which does not change the SQL type (anything except a left outer join). This ultimately led to #621.
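
To make the difference concrete, here is a minimal, self-contained sketch of the associated-type shape. It is a toy model rather than Diesel's actual code: `Users`, `Posts`, `PostsId`, `Describe`, and `select_type` are names invented for the example, and the SQL type markers are stand-ins.

```rust
use std::marker::PhantomData;

// Hypothetical SQL type markers; stand-ins, not Diesel's real definitions.
pub struct Integer;
pub struct Nullable<T>(PhantomData<T>);

// Helper used only to print a type name in this sketch.
pub trait Describe {
    fn describe() -> String;
}
impl Describe for Integer {
    fn describe() -> String {
        "Integer".to_string()
    }
}
impl<T: Describe> Describe for Nullable<T> {
    fn describe() -> String {
        format!("Nullable<{}>", T::describe())
    }
}

pub trait Expression {
    type SqlType;
}

// Old shape, roughly: a second type parameter defaulting to `Self::SqlType`.
// New shape: exactly one SQL type per (expression, query source) pair.
pub trait SelectableExpression<QS>: Expression {
    type SqlTypeForSelect;
}

// Query sources.
pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(PhantomData<(L, R)>);

// A column of `posts`. There is no impl for `Users`, so selecting it
// from `users` alone would be a compile error.
pub struct PostsId;
impl Expression for PostsId {
    type SqlType = Integer;
}
impl SelectableExpression<Posts> for PostsId {
    type SqlTypeForSelect = Integer;
}
// On the right side of a left outer join the column becomes nullable.
impl SelectableExpression<LeftOuterJoin<Users, Posts>> for PostsId {
    type SqlTypeForSelect = Nullable<Integer>;
}

/// Name of the SQL type that selecting `E` from `QS` would produce.
pub fn select_type<E, QS>() -> String
where
    E: SelectableExpression<QS>,
    <E as SelectableExpression<QS>>::SqlTypeForSelect: Describe,
{
    <<E as SelectableExpression<QS>>::SqlTypeForSelect as Describe>::describe()
}
```

With this shape a bound like `T: SelectableExpression<QS>` says nothing about the SQL type, so it no longer silently excludes left outer joins.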

However, with our current structure, it is impossible to fix #621 without re-introducing at least some aspects of #104. In #104 (comment) I said that we didn't need to worry about 1 + NULL, because we didn't implement add for any nullable types. However, I'm not sure I considered joins when I made that statement. The statement applied to joins previously because of that implicit "sql type doesn't change" constraint. This commit removes that constraint, meaning #104 will be back at least when the nullability comes from being on the right side of a left join.

I don't think this is a serious enough issue that we need to immediately address it, as the types of queries which would cause the issue still just don't happen in practice. We should come up with a long term plan for it, though. Ultimately the nullability of a field really only matters in the select clause. Since any operation on null returns null, and you basically want null to act as false in the where clause, it doesn't matter there.
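
The point about the where clause can be modeled in plain Rust by representing SQL values as `Option`. This is only a sketch of the semantics for arithmetic and comparison operators; the function names are hypothetical and nothing here is Diesel API.

```rust
/// SQL-style addition: NULL on either side yields NULL.
pub fn sql_add(a: Option<i32>, b: Option<i32>) -> Option<i32> {
    Some(a? + b?)
}

/// SQL-style `>`: comparing with NULL yields NULL (unknown).
pub fn sql_gt(a: Option<i32>, b: Option<i32>) -> Option<bool> {
    Some(a? > b?)
}

/// A where clause keeps a row only when the predicate is definitely true,
/// so an unknown (NULL) predicate behaves like false.
pub fn where_keeps_row(predicate: Option<bool>) -> bool {
    predicate.unwrap_or(false)
}
```

Because the NULL never reaches Rust as a value in this position, the filter does not need the type system to track it.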

So one partial step we could take is to break this out into two separate traits. One for the "make sure this is valid given the from clause", and one for the "make this nullable sometimes" case and only constrain on the first one in the where clause. We could then re-add the "sql type doesn't change" constraint on the problem cases, which will bring back aspects of #621, but only for select clauses which is a smaller problem.
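
A sketch of what that two-trait split could look like, under stated assumptions: `ValidForSource` is a hypothetical name for the "valid given the from clause" trait, `AddExpr` stands in for any composite expression, and the `filter_accepts` / `select_accepts` helpers exist only to exercise the bounds.

```rust
use std::marker::PhantomData;

// Hypothetical SQL type markers.
pub struct Integer;
pub struct Nullable<T>(PhantomData<T>);

pub trait Expression {
    type SqlType;
}

// Hypothetical trait 1: "this expression is valid given the from clause".
pub trait ValidForSource<QS>: Expression {}

// Trait 2: "this is the SQL type when selected from this source".
pub trait SelectableExpression<QS>: ValidForSource<QS> {
    type SqlTypeForSelect;
}

pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(PhantomData<(L, R)>);
pub type Join = LeftOuterJoin<Users, Posts>;

pub struct UsersId;
pub struct PostsId;
impl Expression for UsersId {
    type SqlType = Integer;
}
impl Expression for PostsId {
    type SqlType = Integer;
}
impl ValidForSource<Join> for UsersId {}
impl ValidForSource<Join> for PostsId {}
impl SelectableExpression<Join> for UsersId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<Join> for PostsId {
    type SqlTypeForSelect = Nullable<Integer>;
}

// Composite expression; operand type checks are omitted for brevity.
pub struct AddExpr<L, R>(PhantomData<(L, R)>);
impl<L, R> Expression for AddExpr<L, R> {
    type SqlType = Integer;
}
// Valid wherever both parts are valid: no constraint on nullability.
impl<L, R, QS> ValidForSource<QS> for AddExpr<L, R>
where
    L: ValidForSource<QS>,
    R: ValidForSource<QS>,
{
}
// Selectable only if the source did not change either part's SQL type.
impl<L, R, QS> SelectableExpression<QS> for AddExpr<L, R>
where
    L: SelectableExpression<QS, SqlTypeForSelect = <L as Expression>::SqlType>,
    R: SelectableExpression<QS, SqlTypeForSelect = <R as Expression>::SqlType>,
{
    type SqlTypeForSelect = Integer;
}

// The where clause would constrain only on the first trait.
pub fn filter_accepts<E, QS>() -> bool
where
    E: ValidForSource<QS>,
{
    true
}

// The select clause would constrain on the second.
pub fn select_accepts<E, QS>() -> bool
where
    E: SelectableExpression<QS>,
{
    true
}
```

In this model `filter_accepts::<AddExpr<UsersId, PostsId>, Join>()` compiles, while `select_accepts::<AddExpr<UsersId, PostsId>, Join>()` is rejected because `PostsId` became nullable in the join.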

I'm not sure if I ultimately want to go the two traits route or not. If nothing else, the problem cases are much more obvious with this commit. Anywhere that has type SqlTypeForSelect = Self::SqlType is likely a problem case when joins are involved. This will make it easier to find all the places to apply a solution when I come up with one that I'm happy with.
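
For illustration, here is a toy model of such a problem case. `AddExpr`, `UsersId`, `PostsId`, `Describe`, and `select_type` are hypothetical names for this sketch, not Diesel's types. The composite expression forwards its own SQL type, so the nullability of its right operand is lost.

```rust
use std::marker::PhantomData;

// Hypothetical SQL type markers.
pub struct Integer;
pub struct Nullable<T>(PhantomData<T>);

// Helper used only to print a type name in this sketch.
pub trait Describe {
    fn describe() -> String;
}
impl Describe for Integer {
    fn describe() -> String {
        "Integer".to_string()
    }
}
impl<T: Describe> Describe for Nullable<T> {
    fn describe() -> String {
        format!("Nullable<{}>", T::describe())
    }
}

pub trait Expression {
    type SqlType;
}
pub trait SelectableExpression<QS>: Expression {
    type SqlTypeForSelect;
}

pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(PhantomData<(L, R)>);
pub type Join = LeftOuterJoin<Users, Posts>;

pub struct UsersId;
pub struct PostsId;
impl Expression for UsersId {
    type SqlType = Integer;
}
impl Expression for PostsId {
    type SqlType = Integer;
}
impl SelectableExpression<Join> for UsersId {
    type SqlTypeForSelect = Integer;
}
impl SelectableExpression<Join> for PostsId {
    type SqlTypeForSelect = Nullable<Integer>;
}

// Composite expression; operand type checks are omitted for brevity.
pub struct AddExpr<L, R>(PhantomData<(L, R)>);
impl<L, R> Expression for AddExpr<L, R> {
    type SqlType = Integer;
}
impl<L, R, QS> SelectableExpression<QS> for AddExpr<L, R>
where
    L: SelectableExpression<QS>,
    R: SelectableExpression<QS>,
{
    // The problem case: the parts' select types are ignored here.
    type SqlTypeForSelect = <Self as Expression>::SqlType;
}

/// Name of the SQL type that selecting `E` from `QS` would produce.
pub fn select_type<E, QS>() -> String
where
    E: SelectableExpression<QS>,
    <E as SelectableExpression<QS>>::SqlTypeForSelect: Describe,
{
    <<E as SelectableExpression<QS>>::SqlTypeForSelect as Describe>::describe()
}
```

Here `posts::id` alone is reported as nullable, but `users::id + posts::id` claims to be a plain `Integer`, which is the hole described above.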

One last note about this is that it makes BoxableExpression that much less ergonomic to use, since now two associated types have to be specified instead of one. There's not much we can do about that (it might be helped by the two trait path), and I'm really not sure if that type is even useful since we now have the boxed queries DSL.

Fixes #621.

While working on #621, I noticed that these impls were incorrect and
could be used to compile an incorrect query. I've corrected the impls
and added the appropriate compile-fail test.

I'm not sure if this was just an oversight or if I intentionally did
this to avoid nullability somewhere. The latter is no longer relevant
since we always make these expressions nullable now.

The phantom data was just plain unnecessary. It was either an oversight
or a bug in older rust versions where `SqlType=Array<ST>` didn't count
as the type being constrained. The `HasSqlType` constraints are
sufficiently covered elsewhere (and frankly, I'm fairly certain that
trait is useless and can be removed). It should be noted that we don't
have a compile-test covering that case though, as pg is the only backend
with additional types.

@killercup killercup left a comment

Wow, there's a lot of stuff in here. Changing SelectableExpression is a lot of churn, and a lot of compromises, but your reasoning is sound. Changing this now and having #621 while knowing the drawbacks (I trust you when you say these cases are rare) sounds good. By the way, do you think it'd make sense to actually add tests to document/assert the expected, broken behavior?

/// Indicates that an expression can be selected from a source. The associated
/// type is usually the same as `Expression::SqlType`, but is used to indicate
/// that a column is always nullable when it appears on the right side of a left
/// outer join, even if it wasn't nullable to begin with.
///
/// Columns will implement this for their table. Certain special types, like
/// `CountStar` and `Bound` will implement this for all sources. All other
/// expressions will inherit this from their children.

inherit this from their children

Only noticed this now. Nicely put, will make people used to classical OOP with inheritance nervous, though 😄

let source = users::table.select(max(posts::id));
//~^ ERROR E0277
let source = users::table.select(min(posts::id));
//~^ ERROR E0277

I'd add let source = users::table.select(sum(users::id)); (with no error) to prevent false positives.

@@ -20,9 +20,6 @@ impl<'a, Expr> Aliased<'a, Expr> {
}
}

#[derive(Debug, Copy, Clone)]
pub struct FromEverywhere;

Huh, was this ever used for anything?

Member Author

It used to be, yeah. I don't know when it stopped being used.

@@ -57,7 +54,9 @@ impl<'a, T> QueryId for Aliased<'a, T> {
// FIXME This is incorrect, should only be selectable from WithQuerySource

Still true?

Member Author

Still true.

("Sean".to_string(), Some("Hello".to_string())),
];
let source = users::table.left_outer_join(posts::table)
.select((users::name, posts::title))

🎉


sgrif commented Feb 15, 2017

> By the way, do you think it'd make sense to actually add tests to document/assert the expected, broken behavior?

An example broken case is users::table.left_outer_join(posts::table).select(users::id + posts::id). I'm not sure that it's worth adding a case to the repo for it.

@sgrif sgrif merged commit 37c82db into master Feb 16, 2017
@sgrif sgrif deleted the sg-selectable-expression-associated-type branch February 16, 2017 17:55
sgrif added a commit that referenced this pull request Feb 26, 2017
The change in #709 had the side effect of re-introducing #104.
With the design that we have right now, nullability isn't propagating
upwards. This puts the issue of "expressions aren't validating that the
types of their arguments haven't become nullable, and thus nulls are
slipping in where they shouldn't be" at odds with "we can't use complex
expressions in filters for joins because the SQL type changed".

This semi-resolves the issue by restricting when we care about
nullability. Ultimately the only time it really matters is when we're
selecting data, as we need to enforce that the result goes into an
`Option`. For places where we don't see the bytes in Rust (filter,
order, etc), `NULL` is effectively `false`.

This change goes back to fully fixing #104, but brings back a small
piece of #621. I've changed everything that is a composite expression to
only be selectable if the SQL type hasn't changed. This means that you
won't be able to do things like
`users.left_outer_join(posts).select(posts::id + 1)`, but you will be
able to use whatever you want in `filter`.

This change is also to support what I think will fix the root of all
these issues. The design of "Here's the SQL type on this query source"
is just fundamentally not what we need. There is only one case where the
type changes, and that is to become nullable when it is on the right side of
a left join, the left side of a right join, or either side of a full
join.

One of the changes that #709 made was to require that you explicitly
call `.nullable()` on a tuple if you wanted to get `Option<(i32,
String)>` instead of `(Option<i32>, Option<String>)`. This has worked
out fine, and isn't a major ergonomic pain. The common case is just to
use the default select clause anyway. So I want to go further down this
path.

The longer term plan is to remove `SqlTypeForSelect` entirely, and *not*
implement `SelectableExpression` for columns on the nullable side of a
join. We will then provide these two blanket impls:

```rust
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<T> where T: SelectableExpression<Right>,
{}

impl<Left, Right, Head, Tail> SelectableExpression<LeftOuterJoin<Left, Right>>
    for Nullable<Cons<Head, Tail>> where
        Nullable<Head>: SelectableExpression<LeftOuterJoin<Left, Right>>,
        Nullable<Tail>: SelectableExpression<LeftOuterJoin<Left, Right>>,
{}
```

(Note: Those impls overlap. Providing them as blanket impls would
require rust-lang/rust#40097. Providing them as
non-blanket impls would require us to mark `Nullable` and possibly
`Cons` as `#[fundamental]`)

The end result will be that nullability naturally propagates as we want
it to. Given `sql_function!(lower, lower_t, (x: Text) -> Text)`, doing
`select(lower(posts::name).nullable())` will work. `lower(posts::name)`
will fail because `posts::name` doesn't impl `SelectableExpression`.
`lower(posts::name.nullable())` will fail because while
`SelectableExpression` will be met, the SQL type of the argument isn't
what's expected. Putting `.nullable()` at the very top level naturally
follows SQL's semantics here.
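
A self-contained sketch of how that planned design could behave. It models only the first of the two impls above (the `Cons` impl is left out, so no overlap arises), and everything except the `SelectableExpression`, `Nullable`, and `LeftOuterJoin` names is an assumption of the sketch: `NullableSql`, `Describe`, `PostsName`, `Lower`, and `select_type` are hypothetical.

```rust
use std::marker::PhantomData;

// Hypothetical SQL type markers.
pub struct Text;
pub struct NullableSql<T>(PhantomData<T>);

// Helper used only to print a type name in this sketch.
pub trait Describe {
    fn describe() -> String;
}
impl Describe for Text {
    fn describe() -> String {
        "Text".to_string()
    }
}
impl<T: Describe> Describe for NullableSql<T> {
    fn describe() -> String {
        format!("Nullable<{}>", T::describe())
    }
}

pub trait Expression: Sized {
    type SqlType;
    fn nullable(self) -> Nullable<Self> {
        Nullable(self)
    }
}
// Marker only: no `SqlTypeForSelect` in this design.
pub trait SelectableExpression<QS>: Expression {}

// Expression wrapper that makes the SQL type nullable.
pub struct Nullable<T>(pub T);
impl<T: Expression> Expression for Nullable<T> {
    type SqlType = NullableSql<T::SqlType>;
}

pub struct Users;
pub struct Posts;
pub struct LeftOuterJoin<L, R>(PhantomData<(L, R)>);

// Column on the nullable side: selectable from `Posts`, and deliberately
// NOT from `LeftOuterJoin<Users, Posts>`.
pub struct PostsName;
impl Expression for PostsName {
    type SqlType = Text;
}
impl SelectableExpression<Posts> for PostsName {}

// Stand-in for the `lower` SQL function: takes Text, returns Text.
pub struct Lower<T>(pub T);
impl<T: Expression<SqlType = Text>> Expression for Lower<T> {
    type SqlType = Text;
}
impl<T, QS> SelectableExpression<QS> for Lower<T> where
    T: SelectableExpression<QS> + Expression<SqlType = Text>
{
}
pub fn lower<T: Expression<SqlType = Text>>(x: T) -> Lower<T> {
    Lower(x)
}

// Anything selectable from the right table is selectable from the join
// once it has been wrapped in `Nullable` at the top level.
impl<Left, Right, T> SelectableExpression<LeftOuterJoin<Left, Right>> for Nullable<T> where
    T: SelectableExpression<Right>
{
}

/// Name of the SQL type produced by selecting `expr` from `QS`.
pub fn select_type<QS, E>(_expr: E) -> String
where
    E: SelectableExpression<QS>,
    <E as Expression>::SqlType: Describe,
{
    <<E as Expression>::SqlType as Describe>::describe()
}
```

In this model `select_type::<LeftOuterJoin<Users, Posts>, _>(lower(PostsName).nullable())` is accepted, `lower(PostsName)` alone is rejected for the join source, and `lower(PostsName.nullable())` is rejected because the argument's SQL type is no longer `Text`.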
sgrif added a commit that referenced this pull request Feb 26, 2017
sgrif added a commit that referenced this pull request Mar 3, 2017
Successfully merging this pull request may close these issues.

Predicates that reference a column on the right side of a left outer join cannot be used