
Conversation

@imarkowitz imarkowitz commented Sep 5, 2025

What changes were proposed in this pull request?

Adds breaking-change metadata to error messages.

Each breaking change includes a migration message explaining how the user should update their code. It can also include a Spark config value that can be used to mitigate the breaking change.

The migration message is appended to the error message. In Scala, we also include the breaking-change info in the structured error message when the STANDARD error format is used.

We also include breaking-change info in PySpark errors.

Why are the changes needed?

By tagging breaking changes with metadata and a Spark config flag, we can build tools that automatically retry Spark jobs with the breaking change disabled.
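As an illustration of the kind of retry tooling this enables, here is a minimal sketch. The helper names and the dict shape are assumptions for illustration only (loosely based on the `migrationMessage` and `mitigationSparkConfig` fields discussed later in this thread), not an API added by this PR:

```python
# Hypothetical sketch of an automatic-retry wrapper built on top of
# breaking-change metadata. The dict keys used here are assumptions,
# not a confirmed schema.

def retry_with_mitigation(run_job, get_breaking_change_info):
    """Run a job; on failure, retry once with the mitigation config applied."""
    try:
        return run_job({})
    except Exception as e:  # in practice, a SparkThrowable-like error
        info = get_breaking_change_info(e)
        if info and info.get("mitigationSparkConfig"):
            conf = info["mitigationSparkConfig"]
            # Retry with the breaking change disabled via the suggested config.
            return run_job({conf["key"]: conf["value"]})
        raise
```

The key point is that the mitigation config is machine-readable, so no human has to parse the error text to find the right flag.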

Does this PR introduce any user-facing change?

This PR only adds a framework for creating breaking-change errors; it does not define any breaking-change errors yet. It adds new methods, for example `getBreakingChangeInfo` on `SparkThrowable`. For existing errors, this method returns `None`.

How was this patch tested?

Tests are added in `SparkThrowableSuite`, `test_connect_errors_conversion.py`, `test_errors.py`, and `FetchErrorDetailsHandlerSuite`.

Was this patch authored or co-authored using generative AI tooling?

No

@imarkowitz imarkowitz force-pushed the ian/breaking-changes branch 2 times, most recently from dc8434d to 7ff851c Compare September 9, 2025 17:54
@imarkowitz imarkowitz changed the title [WIP] [SPARK-53507]Add breaking change info to errors [SPARK-53507]Add breaking change info to errors Sep 9, 2025
Contributor:

nit: shall we add a `\n` between the main error message and the breaking change message?

Contributor Author:

I think it makes sense to use a space.

Contributor:

But this may look weird, as the migration message itself can span multiple lines.

Contributor Author:

I based this code on the existing logic for joining the subclass message:

`errorInfo.messageTemplate + " " + errorSubInfo.messageTemplate`

That logic uses a space so I think it makes sense to match that for consistency.

In the common case where the message is a single line, I think a newline is more confusing than a space.
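The joining behavior discussed in this thread can be sketched in a few lines (Python used for brevity; the actual logic is Scala, as quoted above — migration-message lines joined with newlines, then appended to the main message with a single space):

```python
# Illustrative sketch of the message-joining behavior discussed in this
# thread; not the actual Spark implementation.

def join_messages(main_template: str, migration_lines: list) -> str:
    """Append the migration message to the main error message with a space."""
    # The migration message is stored as a list of lines in the JSON file,
    # so it is joined with newlines first.
    migration = "\n".join(migration_lines)
    return main_template + " " + migration
```

With a single-line migration message the result reads as one sentence after another, which is the common case the space was chosen for.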

Contributor:

let's exclude these code style only changes from the PR

Contributor Author:

Done. These were auto-generated by running `./dev/scalafmt` -- is there a better workflow for formatting these changes?

Comment on lines 186 to 187
private case class ErrorSubInfo(message: Seq[String],
breakingChangeInfo: Option[BreakingChangeInfo] = None) {
Contributor:

Suggested change
private case class ErrorSubInfo(message: Seq[String],
breakingChangeInfo: Option[BreakingChangeInfo] = None) {
private case class ErrorSubInfo(
message: Seq[String],
breakingChangeInfo: Option[BreakingChangeInfo] = None) {

Contributor Author:

Done

* mitigated manually.
*/
case class BreakingChangeInfo(
migrationMessage: Seq[String],
Contributor:

nit: 4 spaces indentation

Contributor Author:

Done

* @param key The spark config key.
* @param value The spark config value that mitigates the breaking change.
*/
case class MitigationSparkConfig(key: String, value: String)
Contributor:

Suggested change
case class MitigationSparkConfig(key: String, value: String)
case class MitigationConfig(key: String, value: String)

Contributor Author:

Done

g.writeStringField("errorClass", errorClass)
if (format == STANDARD) {
g.writeStringField("messageTemplate", errorReader.getMessageTemplate(errorClass))
errorReader.getBreakingChangeInfo(errorClass).foreach{ breakingChangeInfo =>
Contributor:

Suggested change
errorReader.getBreakingChangeInfo(errorClass).foreach{ breakingChangeInfo =>
errorReader.getBreakingChangeInfo(errorClass).foreach { breakingChangeInfo =>

Contributor Author:

Done

g.writeStringField("migrationMessage",
breakingChangeInfo.migrationMessage.mkString("\n"))
breakingChangeInfo.mitigationSparkConfig.foreach{ mitigationSparkConfig =>
g.writeObjectFieldStart("mitigationSparkConfig")
Contributor:

Just for my education: can this write a JSON array? Can the JSON writer recognize duplicated object field names automatically?

Contributor Author:

This is not an array; it's just an `Option`.

def getGrpcStatusCode(self) -> grpc.StatusCode:
return self._grpc_status_code

def getBreakingChangeInfo(self) -> Optional[Dict[str, Any]]:
Contributor:

shall we add a `BreakingChangeInfo` class in Python as well?

Contributor Author:

We have a proto class defined in `sql/connect/common/src/main/protobuf/spark/connect/base.proto`, but I didn't want to introduce that as a dependency here.
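A minimal sketch of what consuming the plain-dict representation might look like on the Python side. The dict keys here are assumptions inferred from the fields discussed in this thread (`migrationMessage`, `mitigationSparkConfig`), not a confirmed schema:

```python
# Hypothetical consumer of the plain-dict breaking-change info returned by
# getBreakingChangeInfo(); the keys are assumptions, not a confirmed schema.
from typing import Any, Dict, Optional


def format_breaking_change(info: Optional[Dict[str, Any]]) -> str:
    """Render a breaking-change dict into a human-readable hint."""
    if info is None:
        # Existing errors carry no breaking-change info.
        return ""
    lines = list(info.get("migrationMessage", []))
    conf = info.get("mitigationSparkConfig")
    if conf:
        lines.append(f"Mitigation: set {conf['key']}={conf['value']}")
    return "\n".join(lines)
```

Returning a plain dict keeps client code like this decoupled from the generated proto classes, which matches the dependency concern raised above.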

message BreakingChangeInfo {
// A message explaining how the user can migrate their job to work
// with the breaking change.
repeated string migration_message = 1;
Contributor:

Shall we concatenate the string on the server side? From the client's point of view, it's a bit weird to see the message as a list of strings.

Contributor Author:

I think it's fine either way. The migration message is also returned as part of the error message, so I think that's the main way end users would interact with it. Rather than doing a conversion here, I thought I would just return the value in the same format it's defined in.

Contributor:

It's a string list in the JSON because we don't want super-long lines in the JSON file. It's not really about semantics, and I think we should not inherit this trick in the protobuf message.

@cloud-fan (Contributor) left a comment:

LGTM, cc @HyukjinKwon

@HyukjinKwon HyukjinKwon changed the title [SPARK-53507]Add breaking change info to errors [SPARK-53507][CONNECT] Add breaking change info to errors Sep 17, 2025
@imarkowitz (Contributor Author):

Test failures in `org.apache.spark.sql.kafka010.KafkaMicroBatchV1SourceWithConsumerSuite` and `org.apache.spark.sql.jdbc.v2.OracleIntegrationSuite` look unrelated.

@HyukjinKwon (Member):

Merged to master.

* If false, the spark job should be retried by setting the
* mitigationConfig.
*/
case class BreakingChangeInfo(
Contributor:

I just realized that we expose it as a public API via `SparkThrowable.getBreakingChangeInfo`. We shouldn't expose a case class as a public API, as it has a wide API surface, including the companion object.

We should follow `SparkThrowable` and define it in Java.

dongjoon-hyun added a commit to apache/spark-connect-swift that referenced this pull request Oct 1, 2025
…th `4.1.0-preview2`

### What changes were proposed in this pull request?

This PR aims to update Spark Connect-generated Swift source code with Apache Spark `4.1.0-preview2`.

### Why are the changes needed?

There are many changes from Apache Spark 4.1.0.

- apache/spark#52342
- apache/spark#52256
- apache/spark#52271
- apache/spark#52242
- apache/spark#51473
- apache/spark#51653
- apache/spark#52072
- apache/spark#51561
- apache/spark#51563
- apache/spark#51489
- apache/spark#51507
- apache/spark#51462
- apache/spark#51464
- apache/spark#51442

To use the latest bug fixes and new messages to develop for new features of `4.1.0-preview2`.

```
$ git clone -b v4.1.0-preview2 https://github.com/apache/spark.git
$ cd spark/sql/connect/common/src/main/protobuf/
$ protoc --swift_out=. spark/connect/*.proto
$ protoc --grpc-swift_out=. spark/connect/*.proto

// Remove empty GRPC files
$ cd spark/connect

$ grep 'This file contained no services' *
catalog.grpc.swift:// This file contained no services.
commands.grpc.swift:// This file contained no services.
common.grpc.swift:// This file contained no services.
example_plugins.grpc.swift:// This file contained no services.
expressions.grpc.swift:// This file contained no services.
ml_common.grpc.swift:// This file contained no services.
ml.grpc.swift:// This file contained no services.
pipelines.grpc.swift:// This file contained no services.
relations.grpc.swift:// This file contained no services.
types.grpc.swift:// This file contained no services.

$ rm catalog.grpc.swift commands.grpc.swift common.grpc.swift example_plugins.grpc.swift expressions.grpc.swift ml_common.grpc.swift ml.grpc.swift pipelines.grpc.swift relations.grpc.swift types.grpc.swift
```

### Does this PR introduce _any_ user-facing change?

Pass the CIs.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #250 from dongjoon-hyun/SPARK-53777.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
cloud-fan added a commit that referenced this pull request Oct 9, 2025
### What changes were proposed in this pull request?

Don't use a Scala case class for `BreakingChangeInfo` and `MitigationConfig`.

### Why are the changes needed?

Per comment: #52256 (comment)

> [cloud-fan](https://github.com/cloud-fan) [5 days ago](#52256 (comment))
> I just realize that we expose it as a public API via SparkThrowable.getBreakingChangeInfo. We shouldn't expose a case class as public API as it has a wide API surface, including the companion object.
> We should follow SparkThrowable and define it in Java.

### Does this PR introduce _any_ user-facing change?

No -- interface is almost the same

### How was this patch tested?

Updated unit tests

### Was this patch authored or co-authored using generative AI tooling?

Used GitHub Copilot, mainly for the `equals` and `hashCode` functions

Closes #52484 from imarkowitz/ian/breaking-changes-case-class.

Lead-authored-by: imarkowitz <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025

Closes apache#52256 from imarkowitz/ian/breaking-changes.

Authored-by: imarkowitz <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
huangxiaopingRD pushed a commit to huangxiaopingRD/spark that referenced this pull request Nov 25, 2025

Closes apache#52484 from imarkowitz/ian/breaking-changes-case-class.

Lead-authored-by: imarkowitz <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
