-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-40774][CONNECT] Add Sample to proto and Connect DSL #38227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
R: @cloud-fan |
connector/connect/src/main/protobuf/spark/connect/relations.proto
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of these parameters required? Or are some optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are all required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are all required for the Sample query plan, but df.sample API has multiple overloads which provide default values for some parameters. Should the proto plan match query plan or DF API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can go with the query plan for now and if we want to introduce optional behavior it's going to be easy to do because the proto supports optionality by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe one more sentence of what the sampling does?
connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
Outdated
Show resolved
Hide resolved
|
Can one of the admins verify this patch? |
connector/connect/src/main/protobuf/spark/connect/relations.proto
Outdated
Show resolved
Hide resolved
|
Are you good with this @grundprinzip ? |
4f3b1a6 to
33176d0
Compare
33176d0 to
a3279eb
Compare
|
thanks, merging to master! |
### What changes were proposed in this pull request? 1. Add sample to proto and connect DSL. 2. Add the missing plan comparison for column alias test case. ### Why are the changes needed? Improve API coverage on the connect proto. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? UT Closes apache#38227 from amaliujia/support_sample. Authored-by: Rui Wang <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
Why are the changes needed?
Improve API coverage on the connect proto.
Does this PR introduce any user-facing change?
No
How was this patch tested?
UT