@chenhao-db (Contributor)

What changes were proposed in this pull request?

This is a cherry-pick of #47796.

The `xpath` expression incorrectly declares its return type as an array of non-null strings, but it can actually return an array containing nulls. This can cause an NPE in code generation, for example in the query `select coalesce(xpath(repeat('<a></a>', id), 'a')[0], '') from range(1, 2)`.
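
To illustrate why a wrong nullability declaration matters, here is a minimal toy model in plain Python. It is not Spark code: `ArrayOfStrings`, `contains_null`, and `compile_first_or_empty` are hypothetical names invented for this sketch, and the "code generator" only mimics the general idea that generated code may skip a null check when the declared type promises non-null elements.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ArrayOfStrings:
    """Hypothetical stand-in for an array type that records element nullability."""
    contains_null: bool


def compile_first_or_empty(
    declared: ArrayOfStrings,
) -> Callable[[List[Optional[str]]], bytes]:
    """Toy 'code generator' for coalesce(arr[0], '') that trusts the declared type."""
    if declared.contains_null:
        def run(arr: List[Optional[str]]) -> bytes:
            first = arr[0]
            # A null check is emitted because the type admits null elements.
            return b"" if first is None else first.encode("utf-8")
    else:
        def run(arr: List[Optional[str]]) -> bytes:
            # No null check: the declared type promised non-null elements.
            return arr[0].encode("utf-8")
    return run


# Declared type before the fix (assumed): elements can never be null.
wrong = compile_first_or_empty(ArrayOfStrings(contains_null=False))
# Declared type after the fix (assumed): elements may be null.
fixed = compile_first_or_empty(ArrayOfStrings(contains_null=True))

print(fixed([None]))  # the null element is handled by the emitted check
try:
    wrong([None])
except AttributeError as exc:
    # Python's analogue of the NPE described above.
    print("failed:", exc)
```

Both variants behave identically on non-null input; they only diverge when the data contradicts the declared type, which is the situation this PR describes.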

Why are the changes needed?

It avoids potential failures in queries that use the `xpath` expression.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

A new unit test. It would fail without the change in the PR.

Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#47796 from chenhao-db/fix_xpath_nullness.

Authored-by: Chenhao Li <[email protected]>
Signed-off-by: Max Gekk <[email protected]>

github-actions bot added the SQL label Sep 2, 2024

@chenhao-db (Contributor, Author)

@MaxGekk here is the PR for 3.5.

@MaxGekk (Member) left a comment

@chenhao-db Could you fix the test failure:

- function_base64 *** FAILED *** (6 milliseconds)
[info]   Expected and actual plans do not match:
[info]   
[info]   === Expected Plan ===
[info]   Project [staticinvoke(class org.apache.spark.sql.catalyst.expressions.Base64, StringType, encode, cast(g#0 as binary), true, BinaryType, BooleanType, true, true, true) AS base64(g)#0]
[info]   +- LocalRelation <empty>, [id#0L, a#0, b#0, d#0, e#0, f#0, g#0]
[info]   
[info]   
[info]   === Actual Plan ===
[info]   Project [staticinvoke(class org.apache.spark.sql.catalyst.expressions.Base64, StringType, encode, cast(g#0 as binary), true, BinaryType, BooleanType, true, false, true) AS base64(g)#0]

@chenhao-db (Contributor, Author)

@MaxGekk Done. There was actually nothing wrong with my code; branch-3.5 was broken and has now been fixed.

@MaxGekk (Member) commented Sep 4, 2024

+1, LGTM. Merging to 3.5.
Thank you, @chenhao-db.

MaxGekk pushed a commit that referenced this pull request Sep 4, 2024
Closes #47959 from chenhao-db/fix_xpath_nullness_3.5.

Authored-by: Chenhao Li <[email protected]>
Signed-off-by: Max Gekk <[email protected]>

@MaxGekk closed this Sep 4, 2024

turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…ion (apache#556)

Closes apache#47959 from chenhao-db/fix_xpath_nullness_3.5.

Authored-by: Chenhao Li <[email protected]>

Signed-off-by: Max Gekk <[email protected]>
Co-authored-by: Chenhao Li <[email protected]>