Revert Flink 1 task slot per TM and bump parallelism #565
Walkthrough
This pull request updates resource configuration settings for Flink job submissions. In the GCP integration, task manager memory settings and task slots in ...
## Summary

Our existing Flink jobs (beacon listing actions) are just a touch overscaled. That headroom is enough to absorb event spikes, but it can be a problem when the job has been down for some time and needs to catch up. This PR bumps our parallelism and reverts the setting that limited us to 1 task slot per TM. We no longer need that setting because we've patched our catalyst code to handle generate exec nodes in the plan, so we can go back to running with multiple task slots per TM. We'll need the same resources as before this PR but get 2x the parallelism, which lets us catch up quicker.

## Checklist

- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **Chores**
  - Enhanced resource management and processing parallelism to improve performance under load.
  - Adjusted data scaling for more efficient and responsive streaming operations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
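To make the "same resources, 2x the parallelism" claim in the Summary concrete, here is a minimal sketch of the slot arithmetic. It assumes Flink's default slot sharing, where a job needs roughly as many slots as its highest operator parallelism, so the TaskManager count is the parallelism divided by slots per TM, rounded up. The parallelism and slot values below are hypothetical, not the values in this PR, and the helper name `task_managers_needed` is only for illustration. The sketch also ignores per-TM memory and CPU sizing, which this PR may adjust separately.

```python
def task_managers_needed(parallelism: int, slots_per_tm: int) -> int:
    """Return how many TaskManagers are needed to provide `parallelism` slots.

    Assumes default slot sharing: one slot per parallel pipeline instance.
    """
    if parallelism < 1 or slots_per_tm < 1:
        raise ValueError("parallelism and slots_per_tm must be positive")
    # Integer ceiling division, avoids float rounding.
    return (parallelism + slots_per_tm - 1) // slots_per_tm


# Hypothetical numbers, chosen only to illustrate the trade-off.
before = task_managers_needed(parallelism=8, slots_per_tm=1)
after = task_managers_needed(parallelism=16, slots_per_tm=2)

print(f"before: {before} TMs, after: {after} TMs")
```

With these illustrative values, the TM count is unchanged while the parallelism doubles. In Flink configuration terms, the two knobs correspond to `taskmanager.numberOfTaskSlots` and the job parallelism (for example `parallelism.default`).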