[SPARK-20594][SQL]The staging directory should be a child directory starts with "." to avoid being deleted if we set hive.exec.stagingdir under the table directory. #17858
Changes from 3 commits
c154f3a
de938ed
6b22d3e
6b1b153
2a542e4
9f41436
bf1b4ec
4e1b6a0
639d63a
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -97,12 +97,23 @@ case class InsertIntoHiveTable( | |
| val inputPathUri: URI = inputPath.toUri | ||
| val inputPathName: String = inputPathUri.getPath | ||
| val fs: FileSystem = inputPath.getFileSystem(hadoopConf) | ||
| val stagingPathName: String = | ||
| var stagingPathName: String = | ||
| if (inputPathName.indexOf(stagingDir) == -1) { | ||
| new Path(inputPathName, stagingDir).toString | ||
| } else { | ||
| inputPathName.substring(0, inputPathName.indexOf(stagingDir) + stagingDir.length) | ||
| } | ||
|
|
||
| // SPARK-20594: The staging directory should be a child directory starts with "." to avoid | ||
| // being deleted if we set hive.exec.stagingdir under the table directory. | ||
| if (FileUtils.isSubDir(new Path(stagingPathName), inputPath, fs) | ||
| && !stagingPathName.stripPrefix(inputPathName).startsWith(".")) { | ||
|
Member
This is just to hide the issue and make the test cases pass, right? We need to drop the created staging directory no matter what value users set.
Author
Sorry, I do not follow your logic. Correct me if I'm wrong, but wasn't the logic of dropping the created staging directory already there before with
|
||
| logDebug(s"The staging dir '$stagingPathName' should be a child directory starts " + | ||
| s"with '.' to avoid being deleted if we set hive.exec.stagingdir under the table " + | ||
| s"directory.") | ||
|
Member
Nit: please remove the string interpolation (the `s` prefix) from the last two strings above; they contain nothing to interpolate.
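To illustrate the nit, here is a minimal sketch of the same message with the `s` prefix kept only on the fragment that actually references `$stagingPathName`. The object and method names are hypothetical and exist only for this example; the message text is copied from the diff.

```scala
object LogMessageSketch {
  // Only the first fragment interpolates a value, so only it needs the `s` prefix.
  // The other two are plain string literals.
  def message(stagingPathName: String): String =
    s"The staging dir '$stagingPathName' should be a child directory starts " +
      "with '.' to avoid being deleted if we set hive.exec.stagingdir under the table " +
      "directory."
}
```

The resulting string is identical to the one built with three interpolated fragments; the change is purely cosmetic.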
||
| stagingPathName = new Path(inputPathName, ".hive-staging").toString | ||
| } | ||
|
|
||
| val dir: Path = | ||
| fs.makeQualified( | ||
| new Path(stagingPathName + "_" + executionId + "-" + TaskRunner.getTaskRunnerID)) | ||
|
|
||
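The path decision in the hunk above can be sketched without Hadoop. This is a simplified stand-in, not the code in the PR: it uses `java.nio.file` in place of Hadoop's `Path`, `FileSystem`, and `FileUtils.isSubDir`, assumes Unix-style paths, and the names `StagingDirSketch` and `stagingPath` are hypothetical. It also tests the first path component under the table directory for a leading dot, which is a choice made for this sketch; the diff instead checks the result of `stripPrefix` directly.

```scala
import java.nio.file.{Path, Paths}

object StagingDirSketch {
  // Fallback name used when the configured directory would be a visible child.
  val DefaultStaging = ".hive-staging"

  def stagingPath(inputPathName: String, stagingDir: String): String = {
    val input: Path = Paths.get(inputPathName).normalize()
    val idx = inputPathName.indexOf(stagingDir)

    // Same two branches as the diff: append the staging dir to the input path,
    // or cut the input path right after an existing staging-dir component.
    val candidate: Path =
      if (idx == -1) input.resolve(stagingDir).normalize()
      else Paths.get(inputPathName.substring(0, idx + stagingDir.length)).normalize()

    // Rough analogue of the isSubDir check: candidate lies strictly under input.
    val underInput = candidate != input && candidate.startsWith(input)

    // A non-hidden child of the table directory is replaced by the hidden default.
    val visibleChild =
      underInput && !input.relativize(candidate).getName(0).toString.startsWith(".")

    if (visibleChild) input.resolve(DefaultStaging).toString
    else candidate.toString
  }
}
```

With `hive.exec.stagingdir=./test` and a table at the hypothetical location `/warehouse/test_table1`, the sketch yields `/warehouse/test_table1/.hive-staging`, while a staging directory outside the table directory, or one that already starts with a dot, is returned unchanged.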
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -494,4 +494,45 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef | |
| spark.table("t").write.insertInto(tableName) | ||
| } | ||
| } | ||
|
|
||
| private def dropTables(tableNames: String*): Unit = { | ||
| tableNames.foreach { name => | ||
| sql(s"DROP TABLE IF EXISTS $name") | ||
| } | ||
| } | ||
|
Member
This is not needed. You can call
Author
OK.
||
|
|
||
| test( | ||
| """SPARK-20594: The staging directory should be appended with ".hive-staging" | ||
| |to avoid being deleted if we set hive.exec.stagingdir under the table directory | ||
| |without start with "."""".stripMargin) { | ||
|
|
||
| dropTables("test_table", "test_table1") | ||
|
Member
withTable("test_table", "test_table1") {
...
}
Author
Yes, you are right. :)
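The `withTable` suggestion is an instance of the loan pattern: run a block, then clean up the named tables even if the block throws. The sketch below shows the idea with a hypothetical in-memory `catalog` standing in for the Hive metastore. It is not Spark's test helper, which is assumed here to issue `DROP TABLE IF EXISTS` for each name in a `finally` block.

```scala
import scala.collection.mutable

object WithTableSketch {
  // Hypothetical stand-in for the set of tables known to the metastore.
  val catalog: mutable.Set[String] = mutable.Set.empty

  // Runs `body`, then removes the named tables whether or not `body` failed.
  def withTable(tableNames: String*)(body: => Unit): Unit = {
    try body
    finally tableNames.foreach(name => catalog -= name)
  }
}

object WithTableDemo {
  def run(): Boolean = {
    WithTableSketch.withTable("test_table", "test_table1") {
      WithTableSketch.catalog += "test_table"
      WithTableSketch.catalog += "test_table1"
    }
    // Both tables are gone once the block has finished.
    WithTableSketch.catalog.isEmpty
  }
}
```

Compared with calling a `dropTables` helper at the start and end of the test, this form still cleans up when an assertion in the middle of the test fails.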
||
|
|
||
| sql("CREATE TABLE test_table (key int, value string)") | ||
|
|
||
| // Add some data. | ||
| testData.write.mode(SaveMode.Append).insertInto("test_table") | ||
|
Member
You can simplify the above two lines by
Author
No. As I tested, we must create the table rather than simplify the above two lines by
||
|
|
||
| // Make sure the table has also been updated. | ||
| checkAnswer( | ||
| sql("SELECT * FROM test_table"), | ||
| testData.collect().toSeq | ||
| ) | ||
|
|
||
| sql("CREATE TABLE test_table1 (key int, value string)") | ||
|
|
||
| // Set hive.exec.stagingdir under the table directory without start with ".". | ||
| sql("set hive.exec.stagingdir=./test") | ||
|
|
||
| // Now overwrite. | ||
| sql("INSERT OVERWRITE TABLE test_table1 SELECT * FROM test_table") | ||
|
|
||
| // Make sure the table has also been updated. | ||
| checkAnswer( | ||
| sql("SELECT * FROM test_table1"), | ||
| testData.collect().toSeq | ||
| ) | ||
|
|
||
| dropTables("test_table", "test_table1") | ||
| } | ||
| } | ||
How about?
OK, I will update it. Thanks!