@HeartSaVioR HeartSaVioR commented Sep 15, 2022

What changes were proposed in this pull request?

This PR introduces GroupStateImpl and GroupStateTimeout in PySpark, and updates the Scala codebase to support convenient conversion between the PySpark and Scala implementations.

Co-authored with @HyukjinKwon.

This PR is split out from #37863.
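For readers who want a rough picture of what a per-group state object with a timeout mode looks like, here is a minimal sketch. It is not the code in this PR: `SketchGroupState`, `TimeoutKind`, and `to_json` are hypothetical names, and using JSON as the interchange format between the Python and Scala sides is only an assumption made for illustration. The three timeout mode strings mirror the existing Scala `GroupStateTimeout` modes.

```python
import json
from typing import Any, Optional


class TimeoutKind:
    """Hypothetical constants named after the Scala GroupStateTimeout modes."""

    NO_TIMEOUT = "NoTimeout"
    PROCESSING_TIME = "ProcessingTimeTimeout"
    EVENT_TIME = "EventTimeTimeout"


class SketchGroupState:
    """Illustrative per-group state holder; NOT the class added by this PR."""

    def __init__(self, value: Optional[Any] = None,
                 timeout: str = TimeoutKind.NO_TIMEOUT) -> None:
        self._value = value
        self._defined = value is not None
        self._updated = False
        self._removed = False
        self._timeout = timeout

    @property
    def exists(self) -> bool:
        # True while the group has a value that has not been removed.
        return self._defined

    def get(self) -> Any:
        if not self._defined:
            raise ValueError("State is not defined or has been removed")
        return self._value

    def update(self, new_value: Any) -> None:
        if new_value is None:
            raise ValueError("new_value must not be None; call remove() instead")
        self._value = new_value
        self._defined = True
        self._updated = True
        self._removed = False

    def remove(self) -> None:
        self._value = None
        self._defined = False
        self._updated = False
        self._removed = True

    def to_json(self) -> str:
        # Assumption: only bookkeeping flags cross the language boundary here;
        # the state value itself would travel separately.
        return json.dumps(
            {
                "timeout": self._timeout,
                "defined": self._defined,
                "updated": self._updated,
                "removed": self._removed,
            },
            sort_keys=True,
        )


state = SketchGroupState(timeout=TimeoutKind.PROCESSING_TIME)
state.update({"count": 1})
snapshot = json.loads(state.to_json())
```

In this sketch the other side would read `snapshot` to learn whether the state was updated or removed during the call, which is the kind of information a conversion layer has to carry in some form.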

Why are the changes needed?

This change will be leveraged in SPARK-40434.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

N/A. End-to-end test suites will be added under SPARK-40431.

@HeartSaVioR
Contributor Author

@HyukjinKwon
I had missed adding a .py file and have just added it. Could you please take a look again? Thanks!

HeartSaVioR commented Sep 15, 2022

https://github.com/apache/spark/pull/37889/checks?check_run_id=8369923150
https://github.com/HeartSaVioR/spark/actions/runs/3059610508/jobs/4938019684

This only fails at the Python linter, and I'm going to fix it.

Run PYTHON_EXECUTABLE=python3.9 ./dev/lint-python
  PYTHON_EXECUTABLE=python3.9 ./dev/lint-python
  shell: sh -e {0}
  env:
    LC_ALL: C.UTF-8
    LANG: C.UTF-8
    PYSPARK_DRIVER_PYTHON: python3.9
    PYSPARK_PYTHON: python3.9
    JAVA_HOME_8.0.345_x64: /__t/jdk/8.0.345/x64
    JAVA_HOME: /__t/jdk/8.0.345/x64
    JAVA_HOME_8_0_345_X64: /__t/jdk/8.0.345/x64
starting python compilation test...
python compilation succeeded.

starting black test...
black checks failed:
would reformat python/pyspark/sql/streaming/state.py

Oh no! 💥 💔 💥
1 file would be reformatted, 353 files would be left unchanged.
Please run 'dev/reformat-python' script.
1

After fixing the lint error, I'll run the linter locally and merge the PR once I confirm it no longer complains.

@HeartSaVioR
Contributor Author

I'll fix it, push the change, and check the build result again. Further checks were skipped because of the Python linter failure.

@HeartSaVioR
Contributor Author

https://github.com/HeartSaVioR/spark/runs/8376291859

The build failed only in the YARN module (org.apache.spark.deploy.yarn.YarnClusterSuite), which is unrelated to this change. I'm going to merge this.

LuciferYang pushed a commit to LuciferYang/spark that referenced this pull request Sep 20, 2022
…out in PySpark

Closes apache#37889 from HeartSaVioR/SPARK-40432.

Lead-authored-by: Jungtaek Lim <[email protected]>
Co-authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>