-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-14596][SQL] Remove not used SqlNewHadoopRDD and some more unused imports #12354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #55707 has finished for PR 12354 at commit
|
|
cc @cloud-fan |
|
There is a |
|
Thanks! I just renamed. |
|
Test build #55708 has finished for PR 12354 at commit
|
|
Test build #55709 has finished for PR 12354 at commit
|
| split.inputSplit.value match { | ||
| case fs: FileSplit => SqlNewHadoopRDDState.setInputFileName(fs.getPath.toString) | ||
| case _ => SqlNewHadoopRDDState.unsetInputFileName() | ||
| case fs: FileSplit => FileScanRDDState.setInputFileName(fs.getPath.toString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liancheng @yhuai Do you still use HadoopRDD to read data source relation? If not, I think we don't need to update file name here anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is used by HiveTableScan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan I see. This is used in https://github.com/apache/spark/blob/d6dc12ef0146ae409834c78737c116050961f350/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/InputFileName.scala during code generation.
Should I remove the change of file name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see, how about renaming it to InputFileNameHolder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Thanks.
| * TODO: Move/Combine this with org.apache.spark.sql.datasources.SqlNewHadoopRDD | ||
| * State for FileScanRDD objects. This is split this way because of the package splits. | ||
| */ | ||
| private[spark] object SqlNewHadoopRDDState { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SqlNewHadoopRDDState is definitely not a good name here, how about InputFileNameHolder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Thanks!
|
@cloud-fan The commits I just submitted include the changes for MiMa tests and some comments. |
|
Test build #55772 has finished for PR 12354 at commit
|
|
Test build #55769 has finished for PR 12354 at commit
|
|
retest this please |
|
Test build #55784 has finished for PR 12354 at commit
|
|
thanks! merging to master! |
What changes were proposed in this pull request?
Old
HadoopFsRelationAPI includesbuildInternalScan()which usesSqlNewHadoopRDDinParquetRelation.Because now the old API is removed,
SqlNewHadoopRDDis not used anymore.So, this PR removes
SqlNewHadoopRDDand several unused imports.This was discussed in #12326.
How was this patch tested?
Several related existing unit tests and
sbt scalastyle.