@HyukjinKwon
Member

What changes were proposed in this pull request?

This PR proposes to deflake the interrupt tag test in SparkSessionE2ESuite by:

  • Increasing the timeout for the interrupt tag test.
  • Reducing the number of tasks that run in parallel.

Why are the changes needed?

To fix the flakiness:

- interrupt tag *** FAILED ***
  The code passed to eventually never returned normally. Attempted 30 times over 20.037421464999998 seconds. Last failure message: ListBuffer("2beba4ac-a994-45f5-bd46-fca3e43fb5ef") had length 1 instead of expected length 2 Interrupted operations: ListBuffer(2beba4ac-a994-45f5-bd46-fca3e43fb5ef).. (SparkSessionE2ESuite.scala:216)

https://github.com/apache/spark/actions/runs/7959951623/job/21727929211

The test failed because the interruption took longer than the 20-second timeout, and the test launches too many tasks at once.
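
The failure message comes from a retry-until-timeout check. As a rough illustration of why a longer timeout helps, here is a minimal plain-Scala sketch of that polling pattern. It is a hand-rolled stand-in, not ScalaTest's `eventually` and not the suite's actual code: `retryUntil`, `interruptedOps`, `run`, the operation IDs, and the scaled-down durations are all hypothetical.

```scala
import scala.concurrent.duration._
import scala.util.Try

object EventuallySketch {
  // Hypothetical helper: re-run `check` every `interval` until it stops
  // throwing or `timeout` has elapsed, and return the last attempt's result.
  def retryUntil[T](timeout: FiniteDuration, interval: FiniteDuration)(check: => T): Try[T] = {
    val deadline = timeout.fromNow
    var result = Try(check)
    while (result.isFailure && deadline.hasTimeLeft()) {
      Thread.sleep(interval.toMillis)
      result = Try(check)
    }
    result
  }

  // Simulated stand-in for the list of interrupted operations:
  // the second ID only appears once `delay` has passed since `startNanos`.
  def interruptedOps(startNanos: Long, delay: FiniteDuration): Seq[String] =
    if (System.nanoTime() - startNanos >= delay.toNanos) Seq("op-1", "op-2")
    else Seq("op-1")

  // Returns true if both operations were observed before the timeout.
  def run(timeout: FiniteDuration, delay: FiniteDuration): Boolean = {
    val start = System.nanoTime()
    retryUntil(timeout, interval = 20.millis) {
      val ops = interruptedOps(start, delay)
      require(ops.length == 2, s"had length ${ops.length} instead of expected length 2")
      ops
    }.isSuccess
  }

  def main(args: Array[String]): Unit = {
    // Timeout shorter than the simulated interruption delay: the check gives up.
    println(run(timeout = 50.millis, delay = 500.millis))
    // Timeout comfortably longer than the delay: the check passes once the ID appears.
    println(run(timeout = 5.seconds, delay = 500.millis))
  }
}
```

In this sketch the longer timeout costs nothing on the happy path, because polling stops as soon as the condition holds; it only extends how long a genuinely failing run waits before reporting.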

Does this PR introduce any user-facing change?

No, test-only.

How was this patch tested?

CI in this PR should validate the changes. The flakiness cannot easily be reproduced in my local environment.

Was this patch authored or co-authored using generative AI tooling?

No.

Member

@dongjoon-hyun left a comment

+1, LGTM. Thank you for starting the investigation.

@HyukjinKwon
Member Author

Merged to master.

dongjoon-hyun added a commit that referenced this pull request May 3, 2024
…` for `interrupt tag` test

### What changes were proposed in this pull request?

This is a follow-up to increase `timeout` from `30s` to `1 minute` like the other timeouts of the same test case.
- #45173

### Why are the changes needed?

To further reduce the flakiness. The following are recent failures on the `master` branch.
- https://github.com/apache/spark/actions/runs/8944948827/job/24572965877
- https://github.com/apache/spark/actions/runs/8945375279/job/24574263993

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46374 from dongjoon-hyun/SPARK-47097.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
LuciferYang pushed a commit that referenced this pull request May 6, 2024
…nterrupt tag` test

### What changes were proposed in this pull request?

This PR aims to temporarily disable a flaky test, `SparkSessionE2ESuite.interrupt tag`.

SPARK-48139 has been created as a blocker issue for 4.0.0 to track re-enabling it.

### Why are the changes needed?

This test case was added in `Apache Spark 3.5.0` but has unfortunately been unstable ever since.
- #42009

We tried to stabilize this test case before `Apache Spark 4.0.0-preview`.
- #45173
- #46374

However, it's still flaky.

- https://github.com/apache/spark/actions/runs/8962353911/job/24611130573 (Master, 2024-05-05)
- https://github.com/apache/spark/actions/runs/8948176536/job/24581022674 (Master, 2024-05-04)

This PR aims to stabilize CI first and to address the flakiness as a blocker-level issue in SPARK-48139, before `Spark Connect GA` and before Apache Spark 4.0.0.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46396 from dongjoon-hyun/SPARK-48138.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
dongjoon-hyun added a commit that referenced this pull request May 8, 2024
…nterrupt tag` test

(cherry picked from commit 8294c59)
Signed-off-by: Dongjoon Hyun <[email protected]>