Skip to content

Conversation

@dgd-contributor
Copy link

What changes were proposed in this pull request?

Adds error classes to some of the exceptions in QueryExecutionErrors.

Why are the changes needed?

Improves auditing for developers and adds useful fields for users (error class and SQLSTATE).

Does this PR introduce any user-facing change?

Fills in missing error class and SQLSTATE fields.

How was this patch tested?

Existing tests

@HyukjinKwon HyukjinKwon changed the title [SPARK-36302][SQL] :Refactor thirteenth set of 20 query execution errors to use error classes [SPARK-36302][SQL] Refactor thirteenth set of 20 query execution errors to use error classes Aug 23, 2021
@HyukjinKwon
Copy link
Member

cc @karenfeng FYI

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@Peng-Lei
Copy link
Contributor

Please see #PR. I try to define the new exception to mix SparkThrowable into a base Exception type. It can help us reduce conflicts in parallel development

Copy link
Contributor

@karenfeng karenfeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @dgd-contributor! I left some comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this error message is meant to be a single string without any newlines. Given how complex this is, however, we can turn this into an array of strings which will be joined with newlines for the user. In this case, it would look like:

"message" : [
  "Caught Hive MetaException attempting to get partition metadata by filter from Hive.",
  "You can set the Spark configuration setting %s to true to work around this problem, however this will result in degraded performance.",
  "Please report a bug: https://issues.apache.org/jira/browse/SPARK" ]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about 0D000; would 42000 be a better fit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is an error class, you can remove the _ERROR suffix to reduce redundancy and clutter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0A000 may fit better here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0A000 may be a better fit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also split this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be 0A000

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is 42000; maybe we can leave this empty?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you meant to use the RENAME_PATH_AS_EXISTS_PATH_ERROR error class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't make much sense grammatically; maybe RENAME_OVERWRITES_EXISTING_PATH? We can also consolidate this with RENAME_PATH_AS_EXISTS_PATH_ERROR, and have everyone throw this error class.

@dgd-contributor dgd-contributor force-pushed the SPARK-36302 branch 2 times, most recently from c3de099 to 0a4b38e Compare September 24, 2021 04:37
@dgd-contributor
Copy link
Author

CC @karenfeng. Could you take another look when you have time ? Thanks!

Copy link
Contributor

@karenfeng karenfeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @dgd-contributor! I left some more comments.

{
"ALTER_TABLE_WITH_DROP_PARTITION_AND_PURGE_UNSUPPORTED" : {
"message" : [ "ALTER TABLE ... DROP PARTITION ... PURGE" ],
"sqlState" : "42000"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0A000 may be a better fit for these unsupported-type errors

"CANNOT_EVALUATE_EXPRESSION" : {
"message" : [ "Cannot evaluate expression: %s" ]
},
"CANNOT_FETCH_TABLES_OF_DATABASE" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simplify this: CANNOT_FETCH_TABLES_OF_DATABASE -> CANNOT_FETCH DATABASE_TABLES

},
"CANNOT_FETCH_TABLES_OF_DATABASE" : {
"message" : [ "Unable to fetch tables of db %s" ],
"sqlState" : "42000"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be a catch-all type exception. I wouldn't group this as a syntax/semantic error (42000); maybe we can leave the SQLSTATE field empty.

"CONCURRENT_QUERY" : {
"message" : [ "Another instance of this query was just started by a concurrent session." ]
},
"CONVERT_HIVE_TABLE_TO_CATALOG_TABLE" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CONVERT_HIVE_TABLE_TO_CATALOG_TABLE does not capture the error here; CANNOT_CONVERT_HIVE_TABLE_TO_CATALOG_TABLE would better encompass it

},
"CONVERT_HIVE_TABLE_TO_CATALOG_TABLE" : {
"message" : [ "%s, db: %s, table: %s" ],
"sqlState" : "0A000"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't an unsupported operation issue; I'd leave this empty

},
"FAILED_RENAME_TEMP_FILE" : {
"message" : [ "Failed to rename temp file %s to %s as rename returned false" ],
"sqlState" : "42000"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't necessarily a syntax error. I don't have a good idea for which SQLSTATE this should be though. I'd leave this empty.

"LEGACY_METADATA_PATH_EXISTS" : {
"message" : [ "Error: we detected a possible problem with the location of your \"_spark_metadata\" directory and you likely need to move it before restarting this query.", "Earlier version of Spark incorrectly escaped paths when writing out the \"_spark_metadata\" directory for structured streaming.", "While this was corrected in Spark 3.0, it appears that your query was started using an earlier version that incorrectly handled the \"_spark_metadata\" path.", "Correct \"_spark_metadata\" Directory: %s. Incorrect \"_spark_metadata\" Directory: %s", "Please move the data from the incorrect directory to the correct one, delete the incorrect directory, and then restart this query.", "If you believe you are receiving this message in error, you can disable it with the SQL conf %s." ]
},
"LOAD_HIVE_CLIENT_CAUSES_NO_CLASS_DEF_FOUND" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DEF -> DEFINITION

},
"LOAD_HIVE_CLIENT_CAUSES_NO_CLASS_DEF_FOUND" : {
"message" : [ "%s when creating Hive client using classpath: %s", "Please make sure that jars for your version of hive and hadoop are included in the paths passed to %s." ],
"sqlState" : "42000"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is syntax/semantic error either. I'd remove this.

"message" : [ "PARTITION clause cannot contain a non-partition column name: %s" ],
"sqlState" : "42000"
},
"PARTITION_COLUMN_NOT_FOUND_IN_SCHEMA" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to match MISSING_COLUMN: MISSING_PARTITION_COLUMN

"sqlState" : "0A000"
},
"UNSUPPORTED_HIVE_METASTORE_VERSION" : {
"message" : [ "Unsupported Hive Metastore version (%s). Please set %s with a valid version." ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a 0A000 SQLSTATE.

@dgd-contributor
Copy link
Author

Thanks for your suggestion, @karenfeng, I have updated this PR. Please take another look when you have time, thanks!

Copy link
Contributor

@karenfeng karenfeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall! Thanks @dgd-contributor; I left one small nit to address.

"FAILED_SET_ORIGINAL_PERMISSION_BACK" : {
"message" : [ "Failed to set original permission %s back to the created path: %s. Exception: %s" ]
},
"GET_PARTITION_METADATA_BY_FILTER" : {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming of this isn't obviously an error. Can we name this something like CANNOT_GET_PARTITION_METADATA_BY_FILTER?

@dgd-contributor
Copy link
Author

Thanks for all your suggestions in recent our PRs, @karenfeng. dgd-contributes is a team, so it may be inconvenient for you when you help a member to resolve this problem and other members don't know anything about it. You can see the same problem in another PR so we are sorry about it. Now, a member of dgd team will work on all issues involved in refactoring query execution errors to use error classes. We will close all recent PR and a member will make PRs for those issues, so when you have time, please help us again.
If you get uncomfortable when helping us, we are sorry again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants