Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Sep 22, 2022

What changes were proposed in this pull request?

In the PR, I propose to migrate 100 compilation errors onto temporary error classes with the prefix _LEGACY_ERROR_TEMP_10xx. The error message will not include the error classes, so, in this way we will preserve the existing behaviour.

Why are the changes needed?

The migration on temporary error classes allows to gather statistics about errors and detect most popular error classes. After that we could prioritise the work on migration.

The new error class name prefix _LEGACY_ERROR_TEMP_ proposed here kind of marks the error as developer-facing, not user-facing. Developers can still get the error class programmatically via the SparkThrowable interface, so that they can build error infra with it. End users won't see the error class in the message. This allows us to do the error migration very quickly, and we can refine the error classes and mark them as user-facing later (naming them properly, adding tests, etc.).

Does this PR introduce any user-facing change?

No. The error messages should be almost the same by default.

How was this patch tested?

By running the modified test suites:

$ build/sbt "core/testOnly *SparkThrowableSuite"
$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
$ build/sbt "test:testOnly *AnalysisErrorSuite"
$ build/sbt "test:testOnly *.HiveUDAFSuite"
$ build/sbt "test:testOnly *.DataFrameSuite"
$ build/sbt "test:testOnly *SQLQuerySuite"

@MaxGekk MaxGekk changed the title [WIP][SPARK-40540][SQL] Migrate compilation errors onto error classes [SPARK-40540][SQL] Migrate compilation errors onto error classes Sep 23, 2022
@MaxGekk MaxGekk marked this pull request as ready for review September 23, 2022 13:55
@MaxGekk MaxGekk requested a review from cloud-fan September 23, 2022 13:55
@MaxGekk
Copy link
Member Author

MaxGekk commented Sep 23, 2022

@cloud-fan @itholic @srielau @anchovYu @gatorsmile Could you review this PR, please.

"<msg>."
]
},
"_LEGACY_ERROR_TEMP_1000" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering why we choose use numbers than a descriptive way for the names?

E.g. _LEGACY_ERROR_TEMP_1000 => _LEGACY_ERROR_TEMP_LEGACY_STORE_ASSIGNMENT_POLICY_DISALLOWED?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it will cause too much work for the migration. If this does not bring much values please feel free to ignore.

Copy link
Member Author

@MaxGekk MaxGekk Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering why we choose use numbers ...

@amaliujia Please, see in PR's description:

Why are the changes needed?

The migration on temporary error classes allows to gather statistics about errors and detect most popular error classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context! I see it better now.

@MaxGekk MaxGekk requested a review from gatorsmile September 24, 2022 16:06
@MaxGekk
Copy link
Member Author

MaxGekk commented Sep 26, 2022

Merging to master. Thank you, @cloud-fan for review.

@MaxGekk MaxGekk closed this in 39f6057 Sep 26, 2022
HyukjinKwon added a commit that referenced this pull request Sep 29, 2022
…ffer for 'Struct Star Expansion' test

### What changes were proposed in this pull request?

This PR is a minor followup of #37973 to match the pattern of either `ArrayBuffer` (Scala 2.12) or `List` (Scala 2.13) in "Struct Star Expansion" test case at `SQLQuerySuite.scala`.

### Why are the changes needed?

Currently, the test fails with Scala 2.13 (https://github.com/apache/spark/actions/runs/3146198025/jobs/5114388079):

```
[info] - Struct Star Expansion *** FAILED *** (1 second, 801 milliseconds)
[info]   Map("attributes" -> "List(a)") did not equal Map("attributes" -> "ArrayBuffer(a)") (SparkFunSuite.scala:328)
[info]   Analysis:
[info]   JavaCollectionWrappers$JMapWrapper(attributes: List(a) -> ArrayBuffer(a))
[info]   org.scalatest.exceptions.TestFailedException:
[info]   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
[info]   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
[info]   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
[info]   at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
[info]   at org.apache.spark.SparkFunSuite.checkError(SparkFunSuite.scala:328)
[info]   at org.apache.spark.SparkFunSuite.checkError(SparkFunSuite.scala:369)
```

We should also consider Scala 2.13 case.

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Manually tested:

```scala
scala> "List(a)".matches("(ArrayBuffer|List)\\(a\\)")
res0: Boolean = true

scala> "ArrayBuffer(a)".matches("(ArrayBuffer|List)\\(a\\)")
res1: Boolean = true
```

Closes #38045 from HyukjinKwon/SPARK-40540-followup.

Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants