[SPARK-8121] [SQL] Fixes InsertIntoHadoopFsRelation job initialization for Hadoop 1.x #6669
cc @yhuai
|
Test build #34264 has finished for PR 6669 at commit
|
|
Test build #34267 has finished for PR 6669 at commit
|
It seems we have `protected def configuration = sparkContext.hadoopConfiguration` in `SQLTestUtils`, and we are not cloning it here?
Oops, thanks!
|
Test build #34344 has finished for PR 6669 at commit
|
|
Test build #34355 has finished for PR 6669 at commit
|
|
Test build #34390 has finished for PR 6669 at commit
|
|
Should we log the output committer used by Parquet?
This one doesn't seem right?
Maybe we can get the instance of outputCommitter first and then log its class?
val outputCommitter = ...
logInfo(...)
outputCommitter
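The suggestion above (bind the committer to a `val`, log its class, then return it) could look roughly like the following self-contained sketch. Everything here is a hypothetical stand-in: `Committer`, `DefaultCommitter`, `CommitterLogging`, and this local `logInfo` are illustrative names only and are not taken from the PR's diff or from Spark's actual logging trait.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins for an output committer hierarchy (not Hadoop classes).
trait Committer
class DefaultCommitter extends Committer

object CommitterLogging {
  // Stand-in for a logger: collects messages so they can be inspected later.
  val messages: ArrayBuffer[String] = ArrayBuffer.empty[String]
  def logInfo(msg: => String): Unit = messages += msg

  def newOutputCommitter(): Committer = {
    // 1. Obtain the instance first...
    val outputCommitter: Committer = new DefaultCommitter
    // 2. ...log the concrete class that was actually instantiated...
    logInfo(s"Using output committer class ${outputCommitter.getClass.getName}")
    // 3. ...and return the instance as the value of the block.
    outputCommitter
  }
}
```

Logging from the instance rather than from the configured class name reports what was really created, which is the point of binding it to a `val` before returning it.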
Oops, thanks!
|
Test build #34407 has finished for PR 6669 at commit
|
|
Test build #34431 has finished for PR 6669 at commit
|
|
LGTM. I am merging it to master.
…n for Hadoop 1.x (branch 1.4 backport based on #6669)
…n for Hadoop 1.x

For Hadoop 1.x, `TaskAttemptContext` constructor clones the `Configuration` argument, thus configurations done in `HadoopFsRelation.prepareForWriteJob()` are not populated to *driver* side `TaskAttemptContext` (executor side configurations are properly populated). Currently this should only affect Parquet output committer class configuration.

Author: Cheng Lian <[email protected]>

Closes apache#6669 from liancheng/spark-8121 and squashes the following commits:

73819e8 [Cheng Lian] Minor logging fix
fce089c [Cheng Lian] Adds more logging
b6f78a6 [Cheng Lian] Fixes compilation error introduced while rebasing
963a1aa [Cheng Lian] Addresses @yhuai's comment
c3a0b1a [Cheng Lian] Fixes InsertIntoHadoopFsRelation job initialization
For Hadoop 1.x, the `TaskAttemptContext` constructor clones the `Configuration` argument, so configuration changes made in `HadoopFsRelation.prepareForWriteJob()` are not propagated to the driver-side `TaskAttemptContext` (executor-side configurations are properly populated). Currently this should only affect the Parquet output committer class configuration.
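To illustrate why a cloning constructor hides later configuration changes, here is a toy model that does not depend on Hadoop. `ToyConf`, `CloningContext`, `SharingContext`, `CloneDemo`, and the key `example.committer.class` are all hypothetical; they only mimic the behaviour described above and say nothing about how the PR actually fixes the problem.

```scala
import scala.collection.mutable

// Toy stand-in for a Hadoop-style Configuration (not the real class).
class ToyConf(initial: Map[String, String] = Map.empty) {
  private val entries = mutable.Map(initial.toSeq: _*)
  def set(key: String, value: String): Unit = entries(key) = value
  def get(key: String): Option[String] = entries.get(key)
  def copy(): ToyConf = new ToyConf(entries.toMap)
}

// Mimics the behaviour described for Hadoop 1.x: the constructor clones its argument.
class CloningContext(conf: ToyConf) {
  val configuration: ToyConf = conf.copy()
}

// Contrast case (hypothetical): the context keeps a reference to the caller's object.
class SharingContext(conf: ToyConf) {
  val configuration: ToyConf = conf
}

object CloneDemo {
  val key = "example.committer.class" // hypothetical key

  // Context is built first, the key is set afterwards: the clone never sees it.
  def setAfterCloning(): Option[String] = {
    val conf = new ToyConf()
    val context = new CloningContext(conf)
    conf.set(key, "ExampleCommitter")
    context.configuration.get(key)
  }

  // Key is set before the context is built: the clone includes it.
  def setBeforeCloning(): Option[String] = {
    val conf = new ToyConf()
    conf.set(key, "ExampleCommitter")
    new CloningContext(conf).configuration.get(key)
  }

  // With a shared reference, the ordering does not matter.
  def setAfterSharing(): Option[String] = {
    val conf = new ToyConf()
    val context = new SharingContext(conf)
    conf.set(key, "ExampleCommitter")
    context.configuration.get(key)
  }
}
```

In this model, any setting applied after a cloning context is constructed is invisible to that context, which is the kind of ordering problem the description attributes to the driver-side `TaskAttemptContext`.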