-
Notifications
You must be signed in to change notification settings - Fork 29.1k
[SPARK-23260][SPARK-23262][SQL] several data source v2 naming cleanup #20427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| * case-insensitive string-to-string map. | ||
| */ | ||
| DataSourceV2Reader createReader(DataSourceV2Options options); | ||
| DataSourceReader createReader(DataSourceV2Options options); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not rename options as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on @rdblue's comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, let me rename it too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we only need V2 in the root interface: DataSourceV2
| */ | ||
| @InterfaceStability.Evolving | ||
| public interface SessionConfigSupport { | ||
| public interface SessionConfigSupport extends DataSourceV2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need to extend DataSourceV2? Why add this in a commit that appears to be nothing more than a rename?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for @rdblue 's question.
@cloud-fan . Could you add more explanation about this change at PR description? According to the following, do we need to update Apache JIRA issue types from IMPROVEMENT to BUG?
Also this PR fixes some places that the mix-in interface doesn't extend the interface it aimed to mix in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something we left behind. For a mix-in interface, it should extend the interface it aimed to mix in, so that we can guarantee, for example, classes implement SessionConfigSupport must also implement DataSourceV2.
We already did it in some mix-in interfaces, e.g. the streaming read support, SupportsScanUnsafeRow, SupportsScanColumnarBatch, etc. In this PR I just wanna fix the remaining ones, to make them consistent.
If you feel strongly about it, I can create a JIRA ticket for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's best to keep commits clean and focused. I'd say create a new JIRA for it and do all of the mix-ins at once.
+1 when this is separated to its own PR. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created https://issues.apache.org/jira/browse/SPARK-23262 .
I'd like to keep the changes in this PR, as it's just several lines. It's not ideal but we are rushing for the 2.3 release, I wanna speed this up a little, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixing large migration commits like this one with unrelated changes makes it harder to pick or revert changes without unintended side-effects. What happens if we realize that this rename was a bad idea? Reverting this commit would also revert the constraint that SessionConfigSupport extends DataSourceV2. Similarly, if we realize that these mix-ins don't need to extend DataSourceV2, then we would have to find and remove them all instead of reverting a commit. That might even sound okay, but when you're picking commits deliberately to patch branches, you need to make as few changes as possible and cherry-pick conflicts make that much harder.
The fact that you're rushing to get commits into 2.3 is even more concerning and reason to be careful, not a reason to relax our standards. Please move this to its own PR and fix all of the interfaces at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping me on the new PR. I'm happy to review it (though it is non-binding).
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM (with some minor comments)
|
Test build #86779 has finished for PR 20427 at commit
|
|
Test build #86789 has finished for PR 20427 at commit
|
|
Retest this please |
gatorsmile
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| @@ -23,7 +23,7 @@ import org.apache.spark.sql.sources.v2.reader._ | |||
|
|
|||
| case class DataSourceV2Relation( | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider removing V2 in DataSourceV2Relation and StreamingDataSourceV2Relation as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is internal, we can clean it up at any time. I wanna focus on public APIs in this PR.
|
Test build #86793 has finished for PR 20427 at commit
|
|
retest this please. |
|
LGTM |
gengliangwang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Test build #86805 has finished for PR 20427 at commit
|
|
thanks, merging to master/2.3! |
## What changes were proposed in this pull request? All other classes in the reader/writer package doesn't have `V2` in their names, and the streaming reader/writer don't have `V2` either. It's more consistent to remove `V2` from `DataSourceV2Reader` and `DataSourceVWriter`. Also rename `DataSourceV2Option` to remote the `V2`, we should only have `V2` in the root interface: `DataSourceV2`. This PR also fixes some places that the mix-in interface doesn't extend the interface it aimed to mix in. ## How was this patch tested? existing tests. Author: Wenchen Fan <[email protected]> Closes #20427 from cloud-fan/ds-v2. (cherry picked from commit 0a9ac02) Signed-off-by: Wenchen Fan <[email protected]>
All other classes in the reader/writer package doesn't have `V2` in their names, and the streaming reader/writer don't have `V2` either. It's more consistent to remove `V2` from `DataSourceV2Reader` and `DataSourceVWriter`. Also rename `DataSourceV2Option` to remote the `V2`, we should only have `V2` in the root interface: `DataSourceV2`. This PR also fixes some places that the mix-in interface doesn't extend the interface it aimed to mix in. existing tests. Author: Wenchen Fan <[email protected]> Closes apache#20427 from cloud-fan/ds-v2.
What changes were proposed in this pull request?
All other classes in the reader/writer package doesn't have
V2in their names, and the streaming reader/writer don't haveV2either. It's more consistent to removeV2fromDataSourceV2ReaderandDataSourceVWriter.Also rename
DataSourceV2Optionto remote theV2, we should only haveV2in the root interface:DataSourceV2.This PR also fixes some places that the mix-in interface doesn't extend the interface it aimed to mix in.
How was this patch tested?
existing tests.