@zhengruifeng
Contributor

@zhengruifeng zhengruifeng commented Jul 25, 2023

What changes were proposed in this pull request?

It seems that run_python_packaging_tests requires some disk space and causes some pyspark modules to fail. This PR enables run_python_packaging_tests only within pyspark-errors (which is the smallest pyspark test module).

image

Why are the changes needed?

1. It seems that run_python_packaging_tests is what causes the No space left error.
2. run_python_packaging_tests runs in all pyspark-* test modules and should be deduplicated.

Does this PR introduce any user-facing change?

no, infra-only

How was this patch tested?

updated CI

@github-actions github-actions bot added the BUILD label Jul 25, 2023
dev/run-tests.py Outdated
Member

Yeah, we should probably deduplicate, because it runs for every Python module test. Another way is to add an environment variable and enable it in only one split.
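
A minimal sketch of the environment-variable approach mentioned in this comment. The variable name `RUN_PACKAGING_TESTS` and the helper `should_run_packaging_tests` are hypothetical and chosen only for illustration; they are not taken from `dev/run-tests.py`.

```python
import os

# Hypothetical variable name, used only for this illustration.
PACKAGING_TESTS_ENV = "RUN_PACKAGING_TESTS"


def should_run_packaging_tests(environ=None):
    """Return True only when the current CI split has opted in."""
    env = os.environ if environ is None else environ
    value = env.get(PACKAGING_TESTS_ENV, "false")
    return value.strip().lower() in ("1", "true", "yes")


if __name__ == "__main__":
    if should_run_packaging_tests():
        print("running packaging tests in this split")
    else:
        print("skipping packaging tests in this split")
```

With a check like this, only the one split whose job sets the variable would run the packaging tests, and every other split would skip them by default.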

Contributor Author

I'm just checking whether disabling this test works.
I will do so if this turns out to be the root cause.

Contributor

image

From this workflow: https://github.com/panbingkun/spark/actions/runs/5653268655/job/15314215187 it seems to be a problem with run_python_packaging_tests.

All the tests in pyspark-pandas-connect-part-3 had already finished running, so it was run_python_packaging_tests that caused the error.

Contributor Author

I am not 100% sure about this, since in https://github.com/zhengruifeng/spark/actions/runs/5652363908/job/15311889590 I also disabled all the packaging tests (you can check that there are no packaging tests in pyspark-core),
but pyspark-sql and pyspark-pandas-slow-connect still failed ...

I just rebased the PR to re-test whether disabling works.

Contributor Author

Also, I see that the packaging test in pyspark-core succeeded in https://github.com/apache/spark/actions/runs/5652170312/job/15311427056, so maybe the packaging test itself is fine?

TBH, I haven't figured out what is happening.

Contributor

From the test results, after disabling run_python_packaging_tests and splitting, the run was successful.
https://github.com/panbingkun/spark/actions/runs/5654907063/job/15318981268

@github-actions github-actions bot added the INFRA label Jul 25, 2023
@zhengruifeng zhengruifeng changed the title [WIP][DO_NOT_MERGE][INFRA] Disable python packaging tests [WIP][DO_NOT_MERGE][INFRA] Move python packaging tests to a separate test module Jul 25, 2023
@zhengruifeng zhengruifeng changed the title [WIP][DO_NOT_MERGE][INFRA] Move python packaging tests to a separate test module [WIP][DO_NOT_MERGE][INFRA] Move python packaging tests to a separate module Jul 25, 2023
@zhengruifeng zhengruifeng force-pushed the infra_skip_py_packing_tests branch 2 times, most recently from caf49cd to 7bd48e9 Compare July 25, 2023 07:42
@github-actions github-actions bot removed the INFRA label Jul 25, 2023
@zhengruifeng zhengruifeng force-pushed the infra_skip_py_packing_tests branch from 7bd48e9 to 35ccb99 Compare July 25, 2023 07:42
@github-actions github-actions bot added the INFRA label Jul 25, 2023
@zhengruifeng
Contributor Author

zhengruifeng commented Jul 25, 2023

@zhengruifeng zhengruifeng force-pushed the infra_skip_py_packing_tests branch from a1f5da4 to ba071c4 Compare July 25, 2023 12:47
@zhengruifeng zhengruifeng changed the title [WIP][DO_NOT_MERGE][INFRA] Move python packaging tests to a separate module [SPARK-44544][INFRA] Move python packaging tests to a separate module Jul 25, 2023
@zhengruifeng
Contributor Author

also cc @Yikun

@HyukjinKwon
Member

Let me take a quick look at #42159 too. I think we can just remove some directories before running the pip test.
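
A rough sketch of that idea: record the free disk space, delete a caller-supplied list of directories, and record the free space again. The function name `free_disk_space` is hypothetical, and which directories are actually safe to delete on the CI runner is not stated in this thread, so the caller has to supply them.

```python
import shutil
from pathlib import Path


def free_disk_space(directories, root="/"):
    """Delete the given directories; return (free_before, free_after) in bytes."""
    free_before = shutil.disk_usage(root).free
    for directory in directories:
        path = Path(directory)
        if path.is_dir():
            # ignore_errors=True so one undeletable file does not abort the cleanup.
            shutil.rmtree(path, ignore_errors=True)
    free_after = shutil.disk_usage(root).free
    return free_before, free_after
```

Logging the two returned values before the pip test would show whether the cleanup reclaimed enough space to avoid the `No space left` error.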

@panbingkun
Contributor

panbingkun commented Jul 26, 2023

@zhengruifeng @HyukjinKwon
Perhaps we can disable 'run_python_packaging_tests' first so that GA recovers quickly, and then handle this matter later on.

@HyukjinKwon
Member

yeah. let me take a quick look at #42159

@zhengruifeng zhengruifeng force-pushed the infra_skip_py_packing_tests branch from b204c5c to 134a4ff Compare July 26, 2023 02:49
@zhengruifeng zhengruifeng changed the title [SPARK-44544][INFRA] Move python packaging tests to a separate module [SPARK-44544][INFRA] Deduplicate run_python_packaging_tests Jul 26, 2023
@zhengruifeng zhengruifeng force-pushed the infra_skip_py_packing_tests branch from 604131c to c0a46fd Compare July 26, 2023 04:49
fix bash again
@zhengruifeng zhengruifeng force-pushed the infra_skip_py_packing_tests branch from c0a46fd to 04c271e Compare July 26, 2023 04:50
@zhengruifeng
Contributor Author

zhengruifeng commented Jul 26, 2023

@HyukjinKwon I think this PR is orthogonal to #42159:

shall we go with this PR first (after all tests pass)?

@zhengruifeng
Contributor Author

also cc @LuciferYang @dongjoon-hyun

zhengruifeng added a commit that referenced this pull request Jul 26, 2023
### What changes were proposed in this pull request?
It seems that `run_python_packaging_tests` requires some disk space and causes some pyspark modules to fail. This PR enables `run_python_packaging_tests` only within `pyspark-errors` (which is the smallest pyspark test module).

![image](https://github.com/apache/spark/assets/7322292/2d37c141-15b8-4d9f-bfbd-4dd7782ab62e)

### Why are the changes needed?

1. It seems that `run_python_packaging_tests` is what causes the `No space left` error.
2. `run_python_packaging_tests` runs in all `pyspark-*` test modules and should be deduplicated.

### Does this PR introduce _any_ user-facing change?
no, infra-only

### How was this patch tested?
updated CI

Closes #42146 from zhengruifeng/infra_skip_py_packing_tests.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
(cherry picked from commit 748eaff)
Signed-off-by: Ruifeng Zheng <[email protected]>
@zhengruifeng
Contributor Author

all python tests passed, merged to master/branch-3.5

@zhengruifeng zhengruifeng deleted the infra_skip_py_packing_tests branch July 26, 2023 07:53
Contributor

@LuciferYang LuciferYang left a comment

late LGTM

@panbingkun
Contributor

all python tests passed, merged to master/branch-3.5

@zhengruifeng Maybe branch-3.4 also needs it?
https://github.com/panbingkun/spark/actions/runs/5665345967/job/15350140605

@zhengruifeng
Contributor Author

@panbingkun good catch! Let me backport it to 3.4

HyukjinKwon pushed a commit that referenced this pull request Jul 27, 2023
### What changes were proposed in this pull request?
cherry-pick #42146 to 3.4

### Why are the changes needed?
it cannot be cherry-picked cleanly, so this PR was made

### Does this PR introduce _any_ user-facing change?
no, infra-only

### How was this patch tested?
updated CI

Closes #42172 from zhengruifeng/cp_fix.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
zhengruifeng added a commit that referenced this pull request Jul 27, 2023
### What changes were proposed in this pull request?
run `run_python_packaging_tests` when there are any changes in PySpark

### Why are the changes needed?
#42146 makes CI run `run_python_packaging_tests` only within `pyspark-errors` (see https://github.com/apache/spark/actions/runs/5666118302/job/15359190468 and https://github.com/apache/spark/actions/runs/5668071930/job/15358091003)

![image](https://github.com/apache/spark/assets/7322292/aef5cd4c-87ee-4b52-add3-e19ca131cdf1)

but I overlooked that `pyspark-errors` may be skipped (because there are no related source changes), so `run_python_packaging_tests` may also be skipped unexpectedly (see https://github.com/apache/spark/actions/runs/5666523657/job/15353485731)

![image](https://github.com/apache/spark/assets/7322292/c2517d39-efcf-4a95-8562-1507dad35794)

this PR is to run `run_python_packaging_tests` even if `pyspark-errors` is skipped

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
updated CI

Closes #42173 from zhengruifeng/infra_followup.

Lead-authored-by: Ruifeng Zheng <[email protected]>
Co-authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
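
The difference between the two behaviours described in the commit message above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the idea of passing the selected module names as a list are hypothetical, and it is assumed that "any changes in PySpark" means at least one selected module whose name starts with `pyspark-`. The actual change in #42173 may be implemented differently.

```python
def packaging_tests_tied_to_errors_module(modules_to_test):
    """Behaviour after #42146: run only when pyspark-errors is selected."""
    return "pyspark-errors" in modules_to_test


def packaging_tests_for_any_pyspark_change(modules_to_test):
    """Follow-up behaviour: run when any PySpark module is selected."""
    return any(name.startswith("pyspark-") for name in modules_to_test)


if __name__ == "__main__":
    # pyspark-errors is skipped here because no related sources changed.
    selected = ["pyspark-sql", "pyspark-core"]
    print(packaging_tests_tied_to_errors_module(selected))
    print(packaging_tests_for_any_pyspark_change(selected))
```

In the example selection, the first check skips the packaging tests because `pyspark-errors` is absent, while the second check still runs them.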
zhengruifeng added a commit that referenced this pull request Jul 27, 2023
(cherry picked from commit f794734)
Signed-off-by: Ruifeng Zheng <[email protected]>
zhengruifeng pushed a commit that referenced this pull request Jul 31, 2023
…das-slow-connect GA testing time

### What changes were proposed in this pull request?
This PR aims to balance the GA testing time of `pyspark-pandas-connect` and `pyspark-pandas-slow-connect`.

### Why are the changes needed?
After PR #42146, the difference in testing time between `pyspark-pandas-connect` and `pyspark-pandas-slow-connect` is fairly significant, which affects the overall running time. Balancing the two makes GA runs more efficient and stable.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Pass GA.
- Manually monitor GA.

Closes #42115 from panbingkun/free_disk_space.

Lead-authored-by: panbingkun <[email protected]>
Co-authored-by: panbingkun <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
viirya pushed a commit to viirya/spark-1 that referenced this pull request Oct 19, 2023
viirya pushed a commit to viirya/spark-1 that referenced this pull request Oct 19, 2023