[SPARK-46519][SQL] Clear unused error classes from error-classes.json file
#44503
|
How to find During the search process above, some unused error classes were also discovered, as follows:
These error classes did not appear in the |
| /** | ||
| * Object for grouping error messages from (most) exceptions thrown during query execution. | ||
| * This does not include exceptions thrown during the eager execution of commands, which are | ||
| * grouped into [[QueryCompilationErrors]]. |
In the sql/api module, QueryCompilationErrors has been renamed to CompilationErrors, so we will fix it here.
We have both, AFAIK
Given the module dependencies, the comment seems to refer to CompilationErrors in the sql/api module rather than QueryCompilationErrors in the sql/catalyst module.
Or we can write it as: [[CompilationErrors]] and [[QueryCompilationErrors]] ?
ok, let's leave the ref to CompilationErrors only.
|
cc @MaxGekk |
|
@MaxGekk
1. _LEGACY_ERROR_TEMP_3066: spark/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalogSuite.scala, line 572 in 5a5d3c7
2. _LEGACY_ERROR_TEMP_3078: spark/sql/core/src/test/scala/org/apache/spark/sql/execution/QueryExecutionSuite.scala, line 163 in 5a5d3c7
Line 2349 in 5a5d3c7
|
MaxGekk
left a comment
These error classes did not appear in the Spark code repo, but they are temporarily retained as they are internal errors.
FYI, we have a concept of categories for internal errors, see https://github.com/apache/spark/pull/40978/files#diff-41229d3f8af21d2eb979e6d8c5b52058b1c460508f1786a6efa8dd6dbbc8c218R79. We form the names dynamically based on the caller side. This should allow us to quickly identify which sub-system an error comes from and who is responsible for handling it.
If it is possible to delete them while preserving test logic, let's delete them. |
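As a rough illustration of what "forming the names dynamically" can look like, here is a minimal Python sketch. The helper name `internal_error_class` and the example category are hypothetical and only mirror the idea described above; this is not Spark's actual API. It also suggests why such names may never appear as string literals in the repo.

```python
def internal_error_class(category=None):
    """Build an internal error class name, optionally suffixed with a category.

    Hypothetical helper: the caller passes the sub-system it belongs to,
    so the full name is assembled at runtime instead of being hard-coded.
    """
    base = "INTERNAL_ERROR"
    if not category:
        return base
    return f"{base}_{category.upper()}"


if __name__ == "__main__":
    print(internal_error_class())           # no category supplied
    print(internal_error_class("storage"))  # category supplied by the caller
```

Because the suffix comes from the caller, a plain text search for the full name would not find it, which is one plausible reason the internal error classes were retained.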
| case _ => | ||
| throw new AnalysisException( | ||
| errorClass = "_LEGACY_ERROR_TEMP_3078", messageParameters = Map.empty) | ||
| case _ => assert(false, "Can not match ParquetTable in the query.") |
Refer to this:
spark/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
Line 73 in b106f80
| case _ => assert(false, "Can not match OrcTable in the query.") |
| throw new AnalysisException( | ||
| errorClass = "_LEGACY_ERROR_TEMP_3066", | ||
| messageParameters = Map("msg" -> ex.getMessage)) | ||
| errorClass = "_LEGACY_ERROR_TEMP_2193", |
Refer to this:
spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala
Lines 383 to 384 in b106f80
| case ex: MetaException => | |
| throw QueryExecutionErrors.getPartitionMetadataByFilterError(ex) |
spark/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
Lines 1644 to 1651 in b106f80
| def getPartitionMetadataByFilterError(e: Exception): SparkRuntimeException = { | |
| new SparkRuntimeException( | |
| errorClass = "_LEGACY_ERROR_TEMP_2193", | |
| messageParameters = Map( | |
| "hiveMetastorePartitionPruningFallbackOnException" -> | |
| SQLConf.HIVE_METASTORE_PARTITION_PRUNING_FALLBACK_ON_EXCEPTION.key), | |
| cause = e) | |
| } |
| // Throw an AnalysisException - this should be captured. | ||
| spark.experimental.extraStrategies = Seq[SparkStrategy]( | ||
| (_: LogicalPlan) => throw new AnalysisException("_LEGACY_ERROR_TEMP_3078", Map.empty)) | ||
| (_: LogicalPlan) => throw new AnalysisException( |
The unit test actually accepts any type of AnalysisException, so I replaced the legacy error class with one that is semantically closer (UNSUPPORTED_DATASOURCE_FOR_DIRECT_QUERY).
| }, | ||
| "sqlState" : "42K0B" | ||
| }, | ||
| "INCORRECT_END_OFFSET" : { |
For some reason, there is still a reference to the error class:
$ find . -type f -print0|xargs -0 grep INCORRECT_END_OFFSET
./docs/sql-error-conditions-sqlstates.md: <td><a href="arithmetic-overflow-error-class.md">ARITHMETIC_OVERFLOW</a>, <a href="sql-error-conditions.html#cast_overflow">CAST_OVERFLOW</a>, <a href="sql-error-conditions.html#cast_overflow_in_table_insert">CAST_OVERFLOW_IN_TABLE_INSERT</a>, <a href="sql-error-conditions.html#decimal_precision_exceeds_max_precision">DECIMAL_PRECISION_EXCEEDS_MAX_PRECISION</a>, <a href="sql-error-conditions.html#invalid_index_of_zero">INVALID_INDEX_OF_ZERO</a>, <a href="sql-error-conditions.html#incorrect_end_offset">INCORRECT_END_OFFSET</a>, <a href="sql-error-conditions.html#incorrect_ramp_up_rate">INCORRECT_RAMP_UP_RATE</a>, <a href="invalid-array-index-error-class.md">INVALID_ARRAY_INDEX</a>, <a href="invalid-array-index-in-element-at-error-class.md">INVALID_ARRAY_INDEX_IN_ELEMENT_AT</a>, <a href="sql-error-conditions.html#numeric_out_of_supported_range">NUMERIC_OUT_OF_SUPPORTED_RANGE</a>, <a href="sql-error-conditions.html#numeric_value_out_of_range">NUMERIC_VALUE_OUT_OF_RANGE</a>
Please remove the reference to the deleted error class.
Okay, let me delete it first, and then I'll try to investigate the root cause again to see if we can automate its discovery.
Done
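One way the discovery of such stale doc references might be automated is sketched below in Python. It assumes that the top-level keys of error-classes.json are the error class names and that docs link to them through `sql-error-conditions.html#<lowercased name>` anchors, as in the grep output above. The function name `find_stale_doc_references` is hypothetical.

```python
import json
import re
from pathlib import Path

# Anchor pattern taken from the grep output; assumed to be the lowercased class name.
ANCHOR = re.compile(r"sql-error-conditions\.html#([a-z0-9_]+)")


def find_stale_doc_references(json_path, docs_root):
    """Map each Markdown file name to the error classes it links to that are not in the JSON file."""
    known = set(json.loads(Path(json_path).read_text(encoding="utf-8")))
    stale = {}
    for path in sorted(Path(docs_root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        referenced = {anchor.upper() for anchor in ANCHOR.findall(text)}
        missing = sorted(referenced - known)
        if missing:
            stale[path.name] = missing
    return stale
```

A call such as `find_stale_doc_references("error-classes.json", "docs")` (paths are placeholders) would return an empty dict when every linked error class still exists. Links that use a different scheme, such as the per-class `*-error-class.md` pages, are not covered by this sketch.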
|
+1, LGTM. Merging to master. |
What changes were proposed in this pull request?
The PR aims to:
- Clear unused error classes from error-classes.json.
- dataSourceAlreadyExists in QueryCompilationErrors.scala
Why are the changes needed?
Make the code cleaner.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
No.