Adding verbs for modular join flow to driver #575
Conversation
Walkthrough
This update expands the job execution functionality. In the Python components, three new job modes are added in the constants definitions and a corresponding command-line option is integrated into the main function. In the Scala driver, three new job run objects are defined to handle source, join-part, and merge job types, each equipped with their specific arguments and execution flow.
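To make the Python side of the walkthrough concrete, here is a minimal sketch of how three job modes and a join-part option might be wired together. It is not the code from this PR: it uses the standard-library `argparse`, and the constant names, the `MODE_TO_SUBCOMMAND` mapping, the `--join-part-name` flag spelling, and the rule that the flag is required for `join-part-job` are all assumptions made for illustration. Only the three mode names come from the release notes.

```python
import argparse

# Mode names as listed in the release notes; the constant names are hypothetical.
SOURCE_JOB = "source-job"
JOIN_PART_JOB = "join-part-job"
MERGE_JOB = "merge-job"

# Hypothetical routing table from a mode to the driver subcommand it invokes.
MODE_TO_SUBCOMMAND = {
    SOURCE_JOB: "source-job",
    JOIN_PART_JOB: "join-part-job",
    MERGE_JOB: "merge-job",
}


def build_parser():
    """Build a parser with a mode selector and an optional join-part name."""
    parser = argparse.ArgumentParser(description="Sketch of a mode-driven runner")
    parser.add_argument("--mode", choices=sorted(MODE_TO_SUBCOMMAND), required=True)
    # Assumed flag spelling; the real option name may differ.
    parser.add_argument("--join-part-name", default=None)
    return parser


def build_driver_args(argv):
    """Translate CLI arguments into the argument list passed to the driver."""
    args = build_parser().parse_args(argv)
    # Assumption: a join-part job cannot run without knowing which part to compute.
    if args.mode == JOIN_PART_JOB and not args.join_part_name:
        raise ValueError("--join-part-name is required for join-part-job")
    driver_args = [MODE_TO_SUBCOMMAND[args.mode]]
    if args.join_part_name:
        driver_args.append(f"--join-part-name={args.join_part_name}")
    return driver_args
```

Keeping the mode-to-subcommand mapping in one table means a new job type needs only a new constant and a new entry, which is roughly the shape the review comments on `constants.py` describe.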
Warning: Review ran into problems
🔥 Problems
GitHub Actions and Pipeline Checks: Resource not accessible by integration - https://docs.github.com/rest/actions/workflow-runs#list-workflow-runs-for-a-repository. Please grant the required permissions to the CodeRabbit GitHub App under the organization or repository settings.
📜 Recent review details
Configuration used: CodeRabbit UI
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (15)
Actionable comments posted: 0
🧹 Nitpick comments (2)
spark/src/main/scala/ai/chronon/spark/Driver.scala (2)
883-944: Join part logic is solid. Consider more tests.
946-989: Merge flow is clear. Address TODO for bootstrapping?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (3)
api/python/ai/chronon/repo/constants.py (3 hunks)
api/python/ai/chronon/repo/run.py (2 hunks)
spark/src/main/scala/ai/chronon/spark/Driver.scala (4 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
spark/src/main/scala/ai/chronon/spark/Driver.scala (4)
api/src/main/scala/ai/chronon/api/Constants.scala (1)
Constants (23-100)
api/src/main/scala/ai/chronon/api/ThriftJsonCodec.scala (1)
ThriftJsonCodec (41-137)
spark/src/main/scala/ai/chronon/spark/Args.scala (3)
Args (25-41)
parseConf (31-34)
toString (34-40)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (2)
JoinUtils (37-620)
computeLeftSourceTableName (606-619)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: join_tests
- GitHub Check: join_tests
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (7)
api/python/ai/chronon/repo/run.py (2)
167-167: Option aligns with new job flow.
200-201: Signature updated correctly.
api/python/ai/chronon/repo/constants.py (3)
29-31: New modes fit well.
58-60: Mode bindings match arguments.
92-94: Routes align with new job modes.
spark/src/main/scala/ai/chronon/spark/Driver.scala (2)
20-20: Imports are relevant.
Also applies to: 22-22, 29-29
824-881: ❓ Verification inconclusive
Implementation looks fine. Check startDate vs endDate to avoid empty output.
🏁 Script executed:
#!/bin/bash
# Test SourceJob with identical start/end date
echo "Please run: spark-submit ... --start-ds=2024-01-01 --end-ds=2024-01-01 --mode=source-job ..."
echo "Verify the output is non-empty."

Length of output: 267
Action: Confirm Date Range Behavior
The implementation appears sound. However, please verify manually that providing identical start and end dates (yielding a single-day range) does not result in an empty output when running the SourceJob.
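The concern above comes down to whether the date range is treated as inclusive at both ends. The sketch below illustrates that check in isolation; `partition_range` is a hypothetical helper, not a function from the driver, and inclusive end-date semantics are an assumption that still needs to be confirmed against the actual SourceJob behavior.

```python
from datetime import date, timedelta


def partition_range(start_ds, end_ds):
    """Return every daily partition from start_ds to end_ds, both inclusive.

    Hypothetical helper: it assumes ISO dates (YYYY-MM-DD) and an inclusive
    end date, which is the semantics the review asks to be verified.
    """
    start = date.fromisoformat(start_ds)
    end = date.fromisoformat(end_ds)
    if start > end:
        raise ValueError(f"start {start_ds} is after end {end_ds}")
    days = (end - start).days
    # range(days + 1) keeps the end date, so start == end yields one partition.
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]
```

Under this assumption, `partition_range("2024-01-01", "2024-01-01")` yields a single partition rather than an empty list. An exclusive end date (`range(days)`) would produce the empty output the comment warns about.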
Co-authored-by: Thomas Chow <[email protected]>
## Summary

Adding verbs for modular join flow to driver

## Checklist

- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
- [x] Not tested

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

- **New Features**
  - Introduced three new job types: source-job, join-part-job, and merge-job.
  - Added a command-line option to specify the join part name for join-part jobs.
  - Enhanced command-line interface to execute new job types through updated subcommands.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: tchow-zlai <[email protected]>
Co-authored-by: Thomas Chow <[email protected]>
Summary
Adding verbs for modular join flow to driver
Checklist
Summary by CodeRabbit