[SPARK-39783][SQL] Quote qualifiedName to fix backticks for column candidates in error messages #38256
I am not sure if this is the best place to fix this, but since

Note that raise the exception.

Note that

I was already working on this ticket and opened a PR for #38254. I would prefer to continue with that change if you don't mind.

@sadikovi I am happy to contribute the tests to your PR.

It is fine, let's work on your PR, it is more complete. I was in the process of adding the tests but I noticed you opened another PR.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala
"[`a`, `b`, `c`, `d`, `e`]"
:: Nil)

errorTest(
Could you invoke errorClassTest()? This makes the test independent of the error message text, so tech editors can edit error-classes.json without depending on Spark's tests.
done
val errorMsg = intercept[AnalysisException] {
  // Note: ds(colName) has different semantics than ds.select(colName)
  ds.select(colName)
}
assert(errorMsg.getMessage.contains(
Please, use checkError().
done
0b56eb4 to 99f0a06

I have also managed to add tests for error
I could not find a way to test this though. Suggestions welcome!
99f0a06 to b9c9907

Can one of the admins verify this patch?
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/namedExpressions.scala
srowen left a comment
I'm OK with it, if no further comment from Max
late LGTM
…ndidates in error messages
### What changes were proposed in this pull request?
`NamedExpression.qualifiedName` is the concatenation of the qualifiers and the name, joined by dots. If any of those parts contains a dot, the resulting `qualifiedName` is ambiguous. Quoting the parts that contain dots removes the ambiguity, and it also fixes the quoting of column candidates in the `UNRESOLVED_COLUMN` and `UNRESOLVED_MAP_KEY` error messages:
`UNRESOLVED_COLUMN`:
```
Seq((0)).toDF("the.id").select("the.id").show()
```
The error message should read
org.apache.spark.sql.AnalysisException: [UNRESOLVED_COLUMN] A column or function parameter
with name `the`.`id` cannot be resolved. Did you mean one of the following? [`the.id`];
while it was:
org.apache.spark.sql.AnalysisException: [UNRESOLVED_COLUMN] A column or function parameter
with name `the`.`id` cannot be resolved. Did you mean one of the following? [`the`.`id`];
`UNRESOLVED_MAP_KEY`:
```
Seq((0)).toDF("id")
.select(map(lit("key"), lit(1)).as("map"), lit(2).as("other.column"))
.select($"`map`"($"nonexisting")).show()
```
The error message should read
Cannot resolve column `nonexisting` as a map key. If the key is a string literal, please add single quotes around it.
Otherwise did you mean one of the following column(s)? [`map`, `other.column`];
while it was:
Cannot resolve column `nonexisting` as a map key. If the key is a string literal, please add single quotes around it.
Otherwise did you mean one of the following column(s)? [`map`, `other`.`column`];
### Why are the changes needed?
The current quoting is wrong, and `qualifiedName` is ambiguous if `name` or `qualifiers` contain dots.
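
The ambiguity can be illustrated with a small standalone sketch. It does not use Spark and is not the code changed by this PR: `QualifiedNameSketch`, `naive`, `quoted`, and `quoteIfNeeded` are hypothetical names, and doubling embedded backticks is an assumption about how escaping might be handled.

```scala
// Hypothetical illustration only; not Spark's implementation.
object QualifiedNameSketch {
  // Quote a name part with backticks only if it contains a dot.
  // Assumption: an embedded backtick is escaped by doubling it.
  def quoteIfNeeded(part: String): String =
    if (part.contains(".")) "`" + part.replace("`", "``") + "`" else part

  // Plain dot-joined concatenation: ambiguous when a part contains a dot.
  def naive(qualifier: Seq[String], name: String): String =
    (qualifier :+ name).mkString(".")

  // Dot-joined concatenation with parts quoted where needed.
  def quoted(qualifier: Seq[String], name: String): String =
    (qualifier :+ name).map(quoteIfNeeded).mkString(".")

  def main(args: Array[String]): Unit = {
    // Column `id` in relation `the` and a single column named `the.id`
    // are indistinguishable without quoting:
    println(naive(Seq("the"), "id"))   // the.id
    println(naive(Seq.empty, "the.id")) // the.id
    // With quoting, the two cases differ:
    println(quoted(Seq("the"), "id"))   // the.id
    println(quoted(Seq.empty, "the.id")) // `the.id`
  }
}
```

In this sketch, only parts that contain a dot are quoted, so names without dots render exactly as before.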
### Does this PR introduce _any_ user-facing change?
It corrects the error message.
### How was this patch tested?
This is tested in `AnalysisErrorSuite`, `DatasetSuite`, and `QueryCompilationErrorsSuite`.
Closes apache#38256 from EnricoMi/branch-correct-backticks-error-message.
Authored-by: Enrico Minack <[email protected]>
Signed-off-by: Max Gekk <[email protected]>