Commit 4243bb6

josephsu authored and marmbrus committed
[SPARK-3011][SQL] _temporary directory should be filtered out by sqlContext.parquetFile
fix compile error on hadoop 0.23 for the pull request apache#1924.

Author: Chia-Yung Su <[email protected]>

Closes apache#1959 from joesu/bugfix-spark3011 and squashes the following commits:

be30793 [Chia-Yung Su] remove .* and _* except _metadata
8fe2398 [Chia-Yung Su] add note to explain
40ea9bd [Chia-Yung Su] fix hadoop-0.23 compile error
c7e44f2 [Chia-Yung Su] match syntax
f8fc32a [Chia-Yung Su] filter out tmp dir
1 parent 507a1b5 commit 4243bb6

1 file changed, +1 -1 lines changed

sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala

Lines changed: 1 addition & 1 deletion
@@ -378,7 +378,7 @@ private[parquet] object ParquetTypesConverter extends Logging {

     val children = fs.listStatus(path).filterNot { status =>
       val name = status.getPath.getName
-      name(0) == '.' || name == FileOutputCommitter.SUCCEEDED_FILE_NAME
+      (name(0) == '.' || name(0) == '_') && name != ParquetFileWriter.PARQUET_METADATA_FILE
     }

     // NOTE (lian): Parquet "_metadata" file can be very slow if the file consists of lots of row
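
Note: the changed predicate can be exercised on its own, without Hadoop or Parquet on the classpath. The following is a minimal sketch, not the code from this commit: the object name `HiddenFileFilter`, the method `isFilteredOut`, and the sample file names are hypothetical, and the constant is assumed to carry the same value as `ParquetFileWriter.PARQUET_METADATA_FILE` (`"_metadata"`, per the commit message). It uses `startsWith` where the patch uses `name(0)`, which behaves the same for non-empty names and avoids an exception on an empty one.

```scala
// Standalone sketch of the filtering rule in this change; no Hadoop or Parquet dependency.
object HiddenFileFilter {
  // Assumption: same value as ParquetFileWriter.PARQUET_METADATA_FILE.
  val ParquetMetadataFile: String = "_metadata"

  // True when a directory entry should be skipped while listing Parquet part files:
  // anything starting with '.' or '_', except the Parquet summary file itself.
  def isFilteredOut(name: String): Boolean =
    (name.startsWith(".") || name.startsWith("_")) && name != ParquetMetadataFile

  def main(args: Array[String]): Unit = {
    // Hypothetical directory listing for illustration.
    val entries = Seq(
      "_temporary",            // left behind by an in-progress or failed job
      "_SUCCESS",              // job-completion marker
      "_metadata",             // Parquet summary file, must be kept
      ".part-r-1.parquet.crc", // checksum file
      "part-r-1.parquet"       // actual data
    )
    val kept = entries.filterNot(isFilteredOut)
    println(kept.mkString(", ")) // prints "_metadata, part-r-1.parquet"
  }
}
```

Under the old predicate, `_temporary` would have passed through because only dot-files and the `_SUCCESS` marker were excluded; the new rule drops every underscore-prefixed entry and whitelists `_metadata` explicitly.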
