Commit 73da9c2

ueshin authored and marmbrus committed
[SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection to prevent from breaking binary-compatibility.
The original problem is [SPARK-3764](https://issues.apache.org/jira/browse/SPARK-3764). `AppendingParquetOutputFormat` calls `context.getTaskAttemptID`, which is not binary-compatible across Hadoop versions. This makes Spark itself binary-incompatible: if Spark is built against hadoop-1, the resulting artifact works only with hadoop-1, and vice versa.

Author: Takuya UESHIN <[email protected]>

Closes #2638 from ueshin/issues/SPARK-3771 and squashes the following commits:

efd3784 [Takuya UESHIN] Add a comment to explain the reason to use reflection.
ec213c1 [Takuya UESHIN] Use reflection to prevent breaking binary-compatibility.
1 parent d3cdf91 commit 73da9c2

1 file changed: +9 -1 lines changed

sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableOperations.scala

Lines changed: 9 additions & 1 deletion
@@ -331,13 +331,21 @@ private[parquet] class AppendingParquetOutputFormat(offset: Int)
 
   // override to choose output filename so not overwrite existing ones
   override def getDefaultWorkFile(context: TaskAttemptContext, extension: String): Path = {
-    val taskId: TaskID = context.getTaskAttemptID.getTaskID
+    val taskId: TaskID = getTaskAttemptID(context).getTaskID
     val partition: Int = taskId.getId
     val filename = s"part-r-${partition + offset}.parquet"
     val committer: FileOutputCommitter =
       getOutputCommitter(context).asInstanceOf[FileOutputCommitter]
     new Path(committer.getWorkPath, filename)
   }
+
+  // The TaskAttemptContext is a class in hadoop-1 but is an interface in hadoop-2.
+  // The signatures of the method TaskAttemptContext.getTaskAttemptID for the both versions
+  // are the same, so the method calls are source-compatible but NOT binary-compatible because
+  // the opcode of method call for class is INVOKEVIRTUAL and for interface is INVOKEINTERFACE.
+  private def getTaskAttemptID(context: TaskAttemptContext): TaskAttemptID = {
+    context.getClass.getMethod("getTaskAttemptID").invoke(context).asInstanceOf[TaskAttemptID]
+  }
 }
 
 /**
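
The reflective lookup used in this change can be tried in isolation. The following is a minimal sketch, not the Spark or Hadoop code: `FakeAttemptId`, `ClassStyleContext`, `InterfaceStyleContext`, `InterfaceStyleContextImpl`, and `ReflectiveCall.attemptIdOf` are hypothetical stand-ins invented for illustration. One context is modeled as a concrete class (the hadoop-1 shape) and the other as an interface (the hadoop-2 shape). Because the method is resolved by name at runtime, the caller contains no compiled call instruction tied to either shape, so the same helper serves both.

```scala
// Hypothetical stand-ins for illustration only; these are NOT the real Hadoop types.
final case class FakeAttemptId(taskId: Int)

// hadoop-1 shape: the context is a concrete class.
class ClassStyleContext(id: FakeAttemptId) {
  def getTaskAttemptID: FakeAttemptId = id
}

// hadoop-2 shape: the context is an interface
// (a trait with only abstract members compiles to a JVM interface).
trait InterfaceStyleContext {
  def getTaskAttemptID: FakeAttemptId
}

class InterfaceStyleContextImpl(id: FakeAttemptId) extends InterfaceStyleContext {
  override def getTaskAttemptID: FakeAttemptId = id
}

object ReflectiveCall {
  // Resolve the zero-argument method by name on the runtime class and invoke it.
  // No direct call to getTaskAttemptID is compiled into this method, so it does
  // not depend on whether the declaring type is a class or an interface.
  def attemptIdOf(context: AnyRef): FakeAttemptId =
    context.getClass
      .getMethod("getTaskAttemptID")
      .invoke(context)
      .asInstanceOf[FakeAttemptId]

  def main(args: Array[String]): Unit = {
    println(attemptIdOf(new ClassStyleContext(FakeAttemptId(3))))
    println(attemptIdOf(new InterfaceStyleContextImpl(FakeAttemptId(7))))
  }
}
```

The sketch takes `AnyRef` so that one helper can accept both stand-in types; the patch itself keeps the parameter typed as `TaskAttemptContext`. The trade-off is that a misspelled method name or a wrong return type is only detected at runtime, not at compile time.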
