-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-40540][SQL] Migrate compilation errors onto error classes #37973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@cloud-fan @itholic @srielau @anchovYu @gatorsmile Could you review this PR, please. |
| "<msg>." | ||
| ] | ||
| }, | ||
| "_LEGACY_ERROR_TEMP_1000" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wondering why we choose use numbers than a descriptive way for the names?
E.g. _LEGACY_ERROR_TEMP_1000 => _LEGACY_ERROR_TEMP_LEGACY_STORE_ASSIGNMENT_POLICY_DISALLOWED?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it will cause too much work for the migration. If this does not bring much values please feel free to ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wondering why we choose use numbers ...
@amaliujia Please, see in PR's description:
Why are the changes needed?
The migration on temporary error classes allows to gather statistics about errors and detect most popular error classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amaliujia Also see #37916 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the context! I see it better now.
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala
Show resolved
Hide resolved
|
Merging to master. Thank you, @cloud-fan for review. |
…ffer for 'Struct Star Expansion' test ### What changes were proposed in this pull request? This PR is a minor followup of #37973 to match the pattern of either `ArrayBuffer` (Scala 2.12) or `List` (Scala 2.13) in "Struct Star Expansion" test case at `SQLQuerySuite.scala`. ### Why are the changes needed? Currently, the test fails with Scala 2.13 (https://github.com/apache/spark/actions/runs/3146198025/jobs/5114388079): ``` [info] - Struct Star Expansion *** FAILED *** (1 second, 801 milliseconds) [info] Map("attributes" -> "List(a)") did not equal Map("attributes" -> "ArrayBuffer(a)") (SparkFunSuite.scala:328) [info] Analysis: [info] JavaCollectionWrappers$JMapWrapper(attributes: List(a) -> ArrayBuffer(a)) [info] org.scalatest.exceptions.TestFailedException: [info] at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472) [info] at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471) [info] at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231) [info] at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295) [info] at org.apache.spark.SparkFunSuite.checkError(SparkFunSuite.scala:328) [info] at org.apache.spark.SparkFunSuite.checkError(SparkFunSuite.scala:369) ``` We should also consider Scala 2.13 case. ### Does this PR introduce _any_ user-facing change? No, test-only. ### How was this patch tested? Manually tested: ```scala scala> "List(a)".matches("(ArrayBuffer|List)\\(a\\)") res0: Boolean = true scala> "ArrayBuffer(a)".matches("(ArrayBuffer|List)\\(a\\)") res1: Boolean = true ``` Closes #38045 from HyukjinKwon/SPARK-40540-followup. Authored-by: Hyukjin Kwon <[email protected]> Signed-off-by: Hyukjin Kwon <[email protected]>
What changes were proposed in this pull request?
In the PR, I propose to migrate 100 compilation errors onto temporary error classes with the prefix
_LEGACY_ERROR_TEMP_10xx. The error message will not include the error classes, so, in this way we will preserve the existing behaviour.Why are the changes needed?
The migration on temporary error classes allows to gather statistics about errors and detect most popular error classes. After that we could prioritise the work on migration.
The new error class name prefix
_LEGACY_ERROR_TEMP_proposed here kind of marks the error as developer-facing, not user-facing. Developers can still get the error class programmatically via theSparkThrowableinterface, so that they can build error infra with it. End users won't see the error class in the message. This allows us to do the error migration very quickly, and we can refine the error classes and mark them as user-facing later (naming them properly, adding tests, etc.).Does this PR introduce any user-facing change?
No. The error messages should be almost the same by default.
How was this patch tested?
By running the modified test suites: