Skip to content

Conversation

@amaliujia
Copy link
Contributor

What changes were proposed in this pull request?

Support SELECT * in an explicit way by connect proto.

Why are the changes needed?

Current proto uses empty project list for SELECT *. However, this is an implicit way that it is hard to differentiate not set and set but empty (the latter is invalid plan). For longer term proto compatibility, we should always use explicit fields for passing through information.

Does this PR introduce any user-facing change?

No

How was this patch tested?

UT

@amaliujia amaliujia changed the title [SPARK-40587][CONNECT] Support SELECT * in an explicit way by connect proto [SPARK-40587][CONNECT] Support SELECT * in an explicit way in connect proto Sep 27, 2022

// represent * (e.g. SELECT *)
message UnresolvedStar {
}
Copy link
Contributor Author

@amaliujia amaliujia Sep 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to expand this message later to support full fields of the star:

case class UnresolvedStar(target: Option[Seq[String]]) extends Star with Unevaluable {

which will be used to support Struct.

Copy link
Contributor Author

@amaliujia amaliujia Sep 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one example is structDf.groupBy($"a").agg(min(struct($"record.*"))).sort("a").first()

@amaliujia
Copy link
Contributor Author

Copy link
Contributor

@grundprinzip grundprinzip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a nit on comment.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@amaliujia
Copy link
Contributor Author

@cloud-fan friendly ping

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in ccb837f Oct 5, 2022
@amaliujia amaliujia deleted the SPARK-40587 branch October 5, 2022 06:21
Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM2

cloud-fan pushed a commit that referenced this pull request Oct 19, 2022
…ect *

### What changes were proposed in this pull request?

1. Sync newest proto to python client.
2. Update Aggregate to match proto change.
3. Change `select` to have it accept both `column` and `str`
4. Make sure `*` pass through the entire path which has been implemented on the server side #38023

### Why are the changes needed?

Update python client side to match the change in connect proto.

### Does this PR introduce _any_ user-facing change?

No
### How was this patch tested?

UT

Closes #38218 from amaliujia/select_start_in_python.

Authored-by: Rui Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
…ect *

### What changes were proposed in this pull request?

1. Sync newest proto to python client.
2. Update Aggregate to match proto change.
3. Change `select` to have it accept both `column` and `str`
4. Make sure `*` pass through the entire path which has been implemented on the server side apache#38023

### Why are the changes needed?

Update python client side to match the change in connect proto.

### Does this PR introduce _any_ user-facing change?

No
### How was this patch tested?

UT

Closes apache#38218 from amaliujia/select_start_in_python.

Authored-by: Rui Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants