[DO-NOT-MERGE][SPARK-XXXXX][CORE][3.0] Port back SPARK-32557 and SPARK-33146 to branch-3.0 #30051
… History server

### What changes were proposed in this pull request?
This PR adds a try catch wrapping the History server scan logic to log and swallow the exception per entry.

### Why are the changes needed?
As discussed in apache#29350, one entry failure shouldn't affect others.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Manually tested.

Closes apache#29374 from yanxiaole/SPARK-32557.

Authored-by: Yan Xiaole <xiaole.yan@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…oading new applications in SHS"" This reverts commit e40c147.
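The per-entry handling described in the SPARK-32557 commit message (wrap the scan of each entry so that one failure is logged and swallowed rather than aborting the rest) can be sketched as follows. This is only an illustration in Python, not Spark's Scala code: `scan_entries`, `fake_load`, and the entry names are hypothetical and do not come from the History Server source.

```python
import logging
from typing import Callable, Iterable, List

logger = logging.getLogger("history_scan_sketch")


def scan_entries(entries: Iterable[str],
                 load_entry: Callable[[str], dict]) -> List[dict]:
    """Load every entry; log and skip the ones that fail (hypothetical helper)."""
    loaded = []
    for entry in entries:
        try:
            loaded.append(load_entry(entry))
        except Exception as exc:
            # Catching Exception (not BaseException) lets KeyboardInterrupt
            # and SystemExit propagate, which loosely mirrors the idea of
            # handling only non-fatal errors.
            logger.warning("Skipping entry %s: %s", entry, exc)
    return loaded


def fake_load(entry: str) -> dict:
    # Hypothetical loader that fails for one corrupt entry.
    if entry == "app-2":
        raise ValueError("corrupt event log")
    return {"id": entry}


result = scan_entries(["app-1", "app-2", "app-3"], fake_load)
print([r["id"] for r in result])  # prints ['app-1', 'app-3']
```

The try block sits inside the loop, so the failure of `app-2` is logged and the scan continues with `app-3`. A single try around the whole loop would stop at the first bad entry.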
This is a proposal on handling backport of #30037 to branch-3.0.
Thank you for investigating, @HeartSaVioR! Let's see.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #129813 has finished for PR 30051 at commit
@dongjoon-hyun The tests passed - are you OK with the approach I proposed?
retest this, please
Test build #129975 has finished for PR 30051 at commit
retest this, please
Kubernetes integration test starting
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test status failure
Test build #129977 has finished for PR 30051 at commit
I'll go with this approach, as it is the simplest way to deal with this: cherry-picking the two commits to branch-3.0.
Hey, @HeartSaVioR. I don't think the revert of the revert is the correct way here. At least, if you really want to 02f80cf Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading new applications in SHS"" The way I see this is the following.
In general,
There are two points here, if I understand correctly, so I'll comment on each.

I never used

I'm sorry, but I disagree, because we view the root cause of the issue differently. The original commit was wrong because SPARK-32557 didn't land in branch-3.0, even though it was a follow-up to SPARK-32529, which did land in branch-3.0. SPARK-32557 is marked as an "improvement" (I guess that is why it didn't land in branch-3.0), but I could simply change it to "bug", since it actually fixes a real-world bug. These are two sides of the same coin: unless an improvement leaves the functionality unchanged, it may well fix a bug. A strict rule about when to backport (backport only "bug"-type issues) is not pragmatic and should be enforced only when we don't trust others (though it can still serve as a guideline). Is that the case?

In my view, I did two things: 1) SPARK-32557 was missing from branch-3.0, so I "corrected" that. 2) After that, I cherry-picked SPARK-33146 to branch-3.0 again, following the "usual practice" of the merging phase. To avoid breaking the branch again, I opened a WIP PR to confirm that these changes don't break anything. I'm not sure that I failed to respect any policy here.

By the way, it was my mistake to push right after cherry-picking without any verification. Even so, I think how much verification can be done during cherry-picking is an arguable (and probably worthwhile) topic. IMHO, requiring a build "without tests" on the cherry-picked branch is fairly pragmatic: it takes around 15+ minutes, which may be acceptable. That still breaks the flow, but we can put that down to "responsibility". Building with tests is a completely different story: it would tie up our development environment for 3 to 4 hours, which doesn't seem like something we want to impose on mergers. If that were enforced, I expect the negative effect would be that people avoid porting back even when porting back is the ideal thing to do. (Again, this wouldn't have happened if we had ported back SPARK-32557.)

Always asking contributors to open backport PRs and requiring them to hang around for more than 4 hours isn't a solution either; I think it's too hard on volunteers (both mergers and contributors). Taking all the questions I raised into account, I think it costs less to tolerate the possibility of test failures and fix them later.
Okay, it seems that we agree to disagree. Thank you for sharing your opinion.
NOTE: Do not merge. If the test passes and we are OK with the direction, I'll cherry-pick these commits one by one manually.
What changes were proposed in this pull request?
Why are the changes needed?
Does this PR introduce any user-facing change?
How was this patch tested?