@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes to upgrade the PyArrow version in the build that tests the 3.5 client against the master server.

Why are the changes needed?

The master server requires a higher PyArrow version.

Does this PR introduce any user-facing change?

No, test-only.

How was this patch tested?

Will monitor the CI.

Was this patch authored or co-authored using generative AI tooling?

No.

github-actions bot added the INFRA label on Jul 28, 2025
@HyukjinKwon
Member Author

Merged to master.

@dongjoon-hyun
Member

dongjoon-hyun commented Nov 23, 2025

Hi, @HyukjinKwon and @zhengruifeng .

Unfortunately, according to the CI monitoring results, this job has not succeeded a single time with this PR's change (`pyarrow==12.0.1` => `pyarrow>=18.0.0`).

[Screenshot: 2025-11-22 at 19:45:29]

This is because the PySpark 3.5.x client requires 4.0.0 <= PyArrow < 13.0.0.

`pyarrow` | >=4.0.0,<13.0.0 | Required for pandas API on Spark and Spark Connect; Optional for Spark SQL

The root cause is a breaking change introduced in PyArrow 13.0.0.
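
To make the conflict concrete, here is a minimal sketch (not Spark's CI code) that checks a few PyArrow versions against the two ranges mentioned in this thread. The helper names `parse`, `client_ok`, and `pr_ok` are hypothetical, and the comparison is a simplified numeric one that only handles plain `X.Y.Z` version strings.

```python
from typing import Tuple

# Illustrative only: both ranges are copied from this thread; the helper
# names are hypothetical and are not part of Spark's build scripts.


def parse(version: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def client_ok(version: str) -> bool:
    """Range documented for the PySpark 3.5.x client: >=4.0.0,<13.0.0."""
    return parse("4.0.0") <= parse(version) < parse("13.0.0")


def pr_ok(version: str) -> bool:
    """Range installed by this PR: >=18.0.0."""
    return parse(version) >= parse("18.0.0")


candidates = ["12.0.1", "13.0.0", "18.0.0"]
for v in candidates:
    print(v, "client range:", client_ok(v), "PR range:", pr_ok(v))

# The client's exclusive upper bound (13.0.0) is below the PR's lower
# bound (18.0.0), so no single version can satisfy both ranges.
ranges_overlap = parse("18.0.0") < parse("13.0.0")
print("ranges overlap:", ranges_overlap)
```

None of the sampled versions passes both checks, which matches the CI result: whichever PyArrow version is installed, it falls outside one of the two ranges.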

@dongjoon-hyun
Member

Because of the long gap between Apache Spark 3.5.0 (2023-09-13) and 4.1.x (now), it seems that we cannot find a common dependency version that works for both in a single Python installation. We need an alternative CI setup to recover this job.
