@amaliujia (Contributor) commented Oct 12, 2022

What changes were proposed in this pull request?

  1. Add sample to proto and connect DSL.
  2. Add the missing plan comparison for column alias test case.

Why are the changes needed?

Improve API coverage on the connect proto.

Does this PR introduce any user-facing change?

No

How was this patch tested?

UT

@amaliujia (Contributor, Author)

R: @cloud-fan

Contributor

Are all of these parameters required? Or are some optional?

Contributor

They are all required.

Contributor

They are all required for the Sample query plan, but the df.sample API has multiple overloads that provide default values for some parameters. Should the proto plan match the query plan or the DataFrame API?
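The trade-off in this question can be sketched in a few lines of Python. This is an illustration only, not code from the PR: `SamplePlan` and `sample` are hypothetical names, and the field names (`lower_bound`, `upper_bound`, `with_replacement`, `seed`) are assumptions modeled on the parameters of a Sample plan node. The point is that a user-facing API can offer defaults while the plan it produces still carries every field explicitly.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplePlan:
    """Hypothetical stand-in for a Sample plan node: every field is required."""
    lower_bound: float
    upper_bound: float
    with_replacement: bool
    seed: int


def sample(fraction: float,
           with_replacement: bool = False,
           seed: Optional[int] = None) -> SamplePlan:
    """Hypothetical DataFrame-style entry point that fills in defaults."""
    if fraction < 0.0:
        raise ValueError("fraction must be non-negative")
    if seed is None:
        # Assumption: an unspecified seed is replaced by a random one.
        seed = random.getrandbits(63)
    # The plan always receives all four values, whatever the caller omitted.
    return SamplePlan(0.0, fraction, with_replacement, seed)


plan = sample(0.1)                 # caller relies on defaults
explicit = sample(0.1, True, 42)   # caller supplies everything
```

With this split, the plan message can keep all fields required, and the defaults live in the client-side API.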

Contributor

I think we can go with the query plan for now. If we want to introduce optional behavior later, it will be easy to do, because the proto supports optionality by default.

Contributor

Maybe add one more sentence on what the sampling does?
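As a rough illustration of what sampling without replacement over a bound range can look like, here is a minimal Python sketch. It is an assumption about the general technique (Bernoulli-style sampling), not the implementation from this PR: `bernoulli_sample` and its parameters are hypothetical names. Each row draws a uniform random number and is kept when the draw falls in `[lower_bound, upper_bound)`, so the expected fraction kept is `upper_bound - lower_bound`, and a fixed seed makes the result reproducible.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def bernoulli_sample(rows: Iterable[T],
                     lower_bound: float,
                     upper_bound: float,
                     seed: int) -> List[T]:
    """Keep each row whose uniform draw lands in [lower_bound, upper_bound)."""
    rng = random.Random(seed)  # seeded generator, so the sample is repeatable
    return [row for row in rows if lower_bound <= rng.random() < upper_bound]


rows = list(range(1000))
first = bernoulli_sample(rows, 0.0, 0.1, seed=7)
second = bernoulli_sample(rows, 0.0, 0.1, seed=7)
```

Sampling with replacement needs a different scheme (a row may appear more than once), which this sketch does not cover.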

@AmplabJenkins

Can one of the admins verify this patch?

@HyukjinKwon (Member)

Are you good with this, @grundprinzip?

@cloud-fan (Contributor)

Thanks, merging to master!

@cloud-fan cloud-fan closed this in d0ab83c Oct 18, 2022
SandishKumarHN pushed a commit to SandishKumarHN/spark that referenced this pull request Dec 12, 2022
Closes apache#38227 from amaliujia/support_sample.

Authored-by: Rui Wang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>

5 participants