-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-36302][SQL] Refactor thirteenth set of 20 query execution errors to use error classes #33805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @karenfeng FYI |
|
Can one of the admins verify this patch? |
|
Please see #PR. I try to define the new exception to mix SparkThrowable into a base Exception type. It can help us reduce conflicts in parallel development |
karenfeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @dgd-contributor! I left some comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this error message is meant to be a single string without any newlines. Given how complex this is, however, we can turn this into an array of strings which will be joined with newlines for the user. In this case, it would look like:
"message" : [
"Caught Hive MetaException attempting to get partition metadata by filter from Hive.",
"You can set the Spark configuration setting %s to true to work around this problem, however this will result in degraded performance.",
"Please report a bug: https://issues.apache.org/jira/browse/SPARK" ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about 0D000; would 42000 be a better fit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is an error class, you can remove the _ERROR suffix to reduce redundancy and clutter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0A000 may fit better here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0A000 may be a better fit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also split this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be 0A000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is 42000; maybe we can leave this empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you meant to use the RENAME_PATH_AS_EXISTS_PATH_ERROR error class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't make much sense grammatically; maybe RENAME_OVERWRITES_EXISTING_PATH? We can also consolidate this with RENAME_PATH_AS_EXISTS_PATH_ERROR, and have everyone throw this error class.
c3de099 to
0a4b38e
Compare
…rs to use error classes
|
CC @karenfeng. Could you take another look when you have time ? Thanks! |
karenfeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @dgd-contributor! I left some more comments.
| { | ||
| "ALTER_TABLE_WITH_DROP_PARTITION_AND_PURGE_UNSUPPORTED" : { | ||
| "message" : [ "ALTER TABLE ... DROP PARTITION ... PURGE" ], | ||
| "sqlState" : "42000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0A000 may be a better fit for these unsupported-type errors
| "CANNOT_EVALUATE_EXPRESSION" : { | ||
| "message" : [ "Cannot evaluate expression: %s" ] | ||
| }, | ||
| "CANNOT_FETCH_TABLES_OF_DATABASE" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify this: CANNOT_FETCH_TABLES_OF_DATABASE -> CANNOT_FETCH DATABASE_TABLES
| }, | ||
| "CANNOT_FETCH_TABLES_OF_DATABASE" : { | ||
| "message" : [ "Unable to fetch tables of db %s" ], | ||
| "sqlState" : "42000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a catch-all type exception. I wouldn't group this as a syntax/semantic error (42000); maybe we can leave the SQLSTATE field empty.
| "CONCURRENT_QUERY" : { | ||
| "message" : [ "Another instance of this query was just started by a concurrent session." ] | ||
| }, | ||
| "CONVERT_HIVE_TABLE_TO_CATALOG_TABLE" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CONVERT_HIVE_TABLE_TO_CATALOG_TABLE does not capture the error here; CANNOT_CONVERT_HIVE_TABLE_TO_CATALOG_TABLE would better encompass it
| }, | ||
| "CONVERT_HIVE_TABLE_TO_CATALOG_TABLE" : { | ||
| "message" : [ "%s, db: %s, table: %s" ], | ||
| "sqlState" : "0A000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't an unsupported operation issue; I'd leave this empty
| }, | ||
| "FAILED_RENAME_TEMP_FILE" : { | ||
| "message" : [ "Failed to rename temp file %s to %s as rename returned false" ], | ||
| "sqlState" : "42000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't necessarily a syntax error. I don't have a good idea for which SQLSTATE this should be though. I'd leave this empty.
| "LEGACY_METADATA_PATH_EXISTS" : { | ||
| "message" : [ "Error: we detected a possible problem with the location of your \"_spark_metadata\" directory and you likely need to move it before restarting this query.", "Earlier version of Spark incorrectly escaped paths when writing out the \"_spark_metadata\" directory for structured streaming.", "While this was corrected in Spark 3.0, it appears that your query was started using an earlier version that incorrectly handled the \"_spark_metadata\" path.", "Correct \"_spark_metadata\" Directory: %s. Incorrect \"_spark_metadata\" Directory: %s", "Please move the data from the incorrect directory to the correct one, delete the incorrect directory, and then restart this query.", "If you believe you are receiving this message in error, you can disable it with the SQL conf %s." ] | ||
| }, | ||
| "LOAD_HIVE_CLIENT_CAUSES_NO_CLASS_DEF_FOUND" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DEF -> DEFINITION
| }, | ||
| "LOAD_HIVE_CLIENT_CAUSES_NO_CLASS_DEF_FOUND" : { | ||
| "message" : [ "%s when creating Hive client using classpath: %s", "Please make sure that jars for your version of hive and hadoop are included in the paths passed to %s." ], | ||
| "sqlState" : "42000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is syntax/semantic error either. I'd remove this.
| "message" : [ "PARTITION clause cannot contain a non-partition column name: %s" ], | ||
| "sqlState" : "42000" | ||
| }, | ||
| "PARTITION_COLUMN_NOT_FOUND_IN_SCHEMA" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename to match MISSING_COLUMN: MISSING_PARTITION_COLUMN
| "sqlState" : "0A000" | ||
| }, | ||
| "UNSUPPORTED_HIVE_METASTORE_VERSION" : { | ||
| "message" : [ "Unsupported Hive Metastore version (%s). Please set %s with a valid version." ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a 0A000 SQLSTATE.
|
Thanks for your suggestion, @karenfeng, I have updated this PR. Please take another look when you have time, thanks! |
karenfeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall! Thanks @dgd-contributor; I left one small nit to address.
| "FAILED_SET_ORIGINAL_PERMISSION_BACK" : { | ||
| "message" : [ "Failed to set original permission %s back to the created path: %s. Exception: %s" ] | ||
| }, | ||
| "GET_PARTITION_METADATA_BY_FILTER" : { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming of this isn't obviously an error. Can we name this something like CANNOT_GET_PARTITION_METADATA_BY_FILTER?
|
Thanks for all your suggestions in recent our PRs, @karenfeng. dgd-contributes is a team, so it may be inconvenient for you when you help a member to resolve this problem and other members don't know anything about it. You can see the same problem in another PR so we are sorry about it. Now, a member of dgd team will work on all issues involved in refactoring query execution errors to use error classes. We will close all recent PR and a member will make PRs for those issues, so when you have time, please help us again. |
What changes were proposed in this pull request?
Adds error classes to some of the exceptions in QueryExecutionErrors.Why are the changes needed?
Improves auditing for developers and adds useful fields for users (error class and SQLSTATE).Does this PR introduce any user-facing change?
Fills in missing error class and SQLSTATE fields.How was this patch tested?
Existing tests