
Conversation

@ueshin
Member

@ueshin ueshin commented Oct 2, 2014

Original problem is SPARK-3764.

AppendingParquetOutputFormat calls context.getTaskAttemptID, a method that is not binary-compatible across Hadoop versions.
This makes Spark itself binary-incompatible: if Spark is built against hadoop-1, the artifact works only with hadoop-1, and vice versa.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21231/

@ueshin ueshin changed the title [SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection to prevent breaking binary-compatibility. [SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection to prevent from breaking binary-compatibility. Oct 3, 2014
@SparkQA

SparkQA commented Oct 3, 2014

QA tests have started for PR 2638 at commit ec213c1.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 3, 2014

QA tests have finished for PR 2638 at commit ec213c1.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • println(s"Failed to load main class $childMainClass.")
    • case class GetPeers(blockManagerId: BlockManagerId) extends ToBlockManagerMaster

@srowen
Member

srowen commented Oct 3, 2014

A particular instance of Spark will be built for a particular version of Hadoop and/or YARN. It is not at this point a universal binary anyway, and so, I do not think it is necessary to add this indirection via reflection. That is, if you are deploying on Hadoop 1, you need to build Spark for Hadoop 1, and similarly for Hadoop 2.

@ueshin
Member Author

ueshin commented Oct 3, 2014

@srowen, Thank you for your comment.
Indeed, once a completed app is deployed to a Spark cluster, there is a particular Spark instance.
But Spark app developers use the artifacts in Maven Central while developing and unit-testing. Those artifacts seem to be built for Hadoop 2, so testing against Hadoop 1 won't work.
What do you think?

@marmbrus
Contributor

marmbrus commented Oct 9, 2014

@ueshin I'm not sure I fully understand. What are the two method signatures in question, such that the code compiles but then fails at runtime? Can you perhaps include these details in a comment?

@srowen are you satisfied with that explanation?

@AmplabJenkins

Can one of the admins verify this patch?

@ueshin
Member Author

ueshin commented Oct 9, 2014

@marmbrus, Thank you for your comment.
TaskAttemptContext is a class in hadoop-1 but an interface in hadoop-2.
The signature of TaskAttemptContext.getTaskAttemptID is the same in both versions, so the call is source-compatible but NOT binary-compatible: a call to a class method compiles to the INVOKEVIRTUAL opcode, while a call to an interface method compiles to INVOKEINTERFACE.
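To make the failure mode concrete, here is a minimal, self-contained sketch in plain Java. The types TaskAttemptContextV1, TaskAttemptContextV2, V2Impl, and the trivial TaskAttemptID are stand-ins for illustration, not Hadoop's real classes: V1 plays the hadoop-1 role (a concrete class), V2 plays the hadoop-2 role (an interface). A direct call site bakes either INVOKEVIRTUAL or INVOKEINTERFACE into the caller's bytecode at compile time; looking the method up reflectively, as this patch's title suggests, defers that decision to runtime so one artifact can work against either shape.

```java
import java.lang.reflect.Method;

public class ReflectiveCallDemo {

    // Stand-in for Hadoop's TaskAttemptID.
    static class TaskAttemptID {
        final int id;
        TaskAttemptID(int id) { this.id = id; }
    }

    // hadoop-1 style: TaskAttemptContext is a concrete class, so a direct
    // call to getTaskAttemptID() compiles to INVOKEVIRTUAL.
    static class TaskAttemptContextV1 {
        public TaskAttemptID getTaskAttemptID() { return new TaskAttemptID(1); }
    }

    // hadoop-2 style: TaskAttemptContext is an interface, so the same source
    // compiles to INVOKEINTERFACE. Bytecode built against one version throws
    // IncompatibleClassChangeError when run against the other.
    interface TaskAttemptContextV2 {
        TaskAttemptID getTaskAttemptID();
    }

    static class V2Impl implements TaskAttemptContextV2 {
        public TaskAttemptID getTaskAttemptID() { return new TaskAttemptID(2); }
    }

    // Reflection-based call: the method is resolved at runtime from the
    // receiver's actual class, so neither opcode is baked into this caller.
    static TaskAttemptID getTaskAttemptID(Object context) throws Exception {
        Method m = context.getClass().getMethod("getTaskAttemptID");
        return (TaskAttemptID) m.invoke(context);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(getTaskAttemptID(new TaskAttemptContextV1()).id); // prints 1
        System.out.println(getTaskAttemptID(new V2Impl()).id);               // prints 2
    }
}
```

The reflective lookup costs a little per call, but in this code path the call happens once per task attempt, so the overhead is negligible compared to breaking binary compatibility.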

@SparkQA

SparkQA commented Oct 10, 2014

QA tests have started for PR 2638 at commit efd3784.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 10, 2014

QA tests have finished for PR 2638 at commit efd3784.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21566/

@marmbrus
Contributor

Thanks! Merged.

@asfgit asfgit closed this in 73da9c2 Oct 13, 2014