Skip to content

Conversation

@phanhuyn
Copy link
Contributor

@phanhuyn phanhuyn commented Dec 22, 2023

Hi, thanks for checking the PR. This is a small bug fix to make Scala Spark works with Clickhouse's array type. Let me know if this could cause problem on other DB types.

(Please help to trigger CI if possible. I failed to make the build pipeline run - any help is appreciated)

Why are the changes needed?

The PR is to fix issue describe at: ClickHouse/clickhouse-java#1505

When using spark to write an array of string to Clickhouse, the Clickhouse JDBC driver throws java.lang.IllegalArgumentException: Unknown data type: string exception.

The exception was due to Spark JDBC utils passing an invalid type value string (should be String). The original type values retrieved from Clickhouse JDBC is correct, but Spark JDBC utils attempts to convert to type string to lower case:

What changes were proposed in this pull request?

  • Remove the lowercase cast. The string value retrieved from JDBC driver implementation should be passed as-is, Spark shouldn't try to modify the value.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Dec 22, 2023
@phanhuyn phanhuyn marked this pull request as draft December 22, 2023 08:21
@phanhuyn phanhuyn marked this pull request as ready for review December 22, 2023 08:21
@HyukjinKwon
Copy link
Member

Thanks for the PR. Mind creating a JIRA please? (see also https://spark.apache.org/contributing.html).

@phanhuyn
Copy link
Contributor Author

phanhuyn commented Jan 7, 2024

Thanks for the PR. Mind creating a JIRA please? (see also https://spark.apache.org/contributing.html).

Thanks for the reply @HyukjinKwon. I've applied for a JIRA account at https://selfserve.apache.org/jira-account.html, will be creating the story when the account is approved.

Update: JIRA is created https://issues.apache.org/jira/browse/SPARK-46612

@phanhuyn phanhuyn changed the title Do not convert array type string retrieved from jdbc driver [SPARK-46612] Do not convert array type string retrieved from jdbc driver Jan 7, 2024
@phanhuyn phanhuyn changed the title [SPARK-46612] Do not convert array type string retrieved from jdbc driver [SPARK-46612][SQL] Do not convert array type string retrieved from jdbc driver Jan 7, 2024
@yaooqinn
Copy link
Member

Hi @phanhuyn, Could you enable the GitHub action?

@phanhuyn phanhuyn force-pushed the do-not-lower-case-jdbc-array-type branch from c46a8fc to c26ff83 Compare January 15, 2024 00:26
@phanhuyn
Copy link
Contributor Author

Hi @phanhuyn, Could you enable the GitHub action?

I enabled GitHub action and triggered a few builds. It's failing at "Run documentation build" step. I'm not sure why this change would cause the documentation build to fail, could you help to take a look? Thank you.

The action url: https://github.com/phanhuyn/spark/actions/runs/7522708428/job/20479020431

Screenshot 2024-01-16 at 4 34 24 PM

@yaooqinn
Copy link
Member

This build error shall be fixed, you can rebase the master branch and try again.

Copy link
Member

@yaooqinn yaooqinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks reasonable to me to respect the underlying database-specific type definition

@yaooqinn yaooqinn closed this in 931eb93 Jan 17, 2024
@yaooqinn
Copy link
Member

Thank you @phanhuyn. Merged to master

@phanhuyn
Copy link
Contributor Author

thanks @yaooqinn

@yaooqinn
Copy link
Member

Oops, I didn't pay much attention to the authorship resolution step in the merge script, the Lead-authored-by: shall be Nguyen Phan Huy

@phanhuyn
Copy link
Contributor Author

Oops, I didn't pay much attention to the authorship resolution step in the merge script, the Lead-authored-by: shall be Nguyen Phan Huy

All good, no problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants