@varant-zlai varant-zlai commented May 15, 2025

Summary

Replace PartitionSpecWithColumn with PartitionSpec.

Checklist

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested
  • Documentation update

Summary by CodeRabbit

  • Refactor
    • Simplified partition specification handling across planners and utilities by removing the custom partition spec wrapper and standardizing on a single partition spec type.
    • Updated related methods and class constructors to use the new partition spec approach, streamlining partition metadata access.
    • Removed unused fields and imports related to the old partition spec wrapper.

coderabbitai bot commented May 15, 2025

Walkthrough

The changes remove the PartitionSpecWithColumn class and update all related code to use the simpler PartitionSpec type instead. Constructors, methods, and fields that previously required PartitionSpecWithColumn now use PartitionSpec, and references to partition columns are adjusted accordingly.

Changes

| File(s) | Change Summary |
|---|---|
| .../planner/PartitionSpecWithColumn.scala | Deleted the PartitionSpecWithColumn case class. |
| .../planner/GroupByOfflinePlanner.scala, .../planner/JoinOfflinePlanner.scala, .../planner/Planner.scala | Changed constructors to use PartitionSpec instead of PartitionSpecWithColumn as implicit parameters. |
| .../planner/MetaDataUtils.scala, .../planner/TableDependencies.scala | Updated method signatures and internal logic to use PartitionSpec directly; adjusted partition field access. |
| .../planner/NodeRunner.scala | Changed trait method from partitionSpecWithColumn to partitionSpec. |
| .../catalog/TableUtils.scala | Removed the outputPartitionSpec field and related import. |
| .../JoinUtils.scala | Updated internal usage to pass partitionSpec instead of outputPartitionSpec. |
| .../batch/MergeJob.scala | Removed import of PartitionSpecWithColumn. |
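
The shape of the change summarized above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Chronon code: the field names `format`, `spanMillis`, and `partitionColumn` come from the review snippets in this PR, while `PlannerBefore`, `PlannerAfter`, `dailySpec`, and the enclosing object are hypothetical names invented for the example.

```scala
object PartitionSpecRefactorSketch {
  // Stand-in for the retained type; the real PartitionSpec may carry more members.
  final case class PartitionSpec(format: String, spanMillis: Long)

  // Stand-in for the deleted wrapper: a column name plus the wrapped spec.
  final case class PartitionSpecWithColumn(partitionColumn: String, partitionSpec: PartitionSpec)

  // Before: planners took the wrapper implicitly and unwrapped it on every access.
  class PlannerBefore(name: String)(implicit spec: PartitionSpecWithColumn) {
    def describe: String =
      s"$name: ${spec.partitionSpec.format} / ${spec.partitionSpec.spanMillis} ms"
  }

  // After: planners take PartitionSpec directly, so field access loses one hop.
  class PlannerAfter(name: String)(implicit spec: PartitionSpec) {
    def describe: String = s"$name: ${spec.format} / ${spec.spanMillis} ms"
  }

  // Hypothetical daily spec used by both variants below.
  val dailySpec: PartitionSpec = PartitionSpec("yyyy-MM-dd", 24L * 60 * 60 * 1000)

  def before: String = {
    implicit val wrapped: PartitionSpecWithColumn = PartitionSpecWithColumn("ds", dailySpec)
    new PlannerBefore("join").describe
  }

  def after: String = {
    implicit val spec: PartitionSpec = dailySpec
    new PlannerAfter("join").describe
  }
}
```

Both variants describe the same format and span; the only difference in this sketch is that the "after" planner no longer has access to a column name through its implicit parameter.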

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Planner
    participant PartitionSpec

    Caller->>Planner: new Planner(...)(implicit partitionSpec: PartitionSpec)
    Planner->>PartitionSpec: Access partition fields directly

Suggested reviewers

  • nikhil-zlai
  • piyush-zlai

Poem

Out with the wrapper, in with the spec,
Code grows simpler—what the heck!
Partition columns, now direct,
No more wrappers to inspect.
Cheers to code that's clean and neat,
Refactor done—this change is sweet! 🚀

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 4683d8c and 9bc6349.

📒 Files selected for processing (10)
  • api/src/main/scala/ai/chronon/api/planner/GroupByOfflinePlanner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/JoinOfflinePlanner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/MetaDataUtils.scala (3 hunks)
  • api/src/main/scala/ai/chronon/api/planner/NodeRunner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/PartitionSpecWithColumn.scala (0 hunks)
  • api/src/main/scala/ai/chronon/api/planner/Planner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/TableDependencies.scala (2 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/batch/MergeJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/catalog/TableUtils.scala (0 hunks)
💤 Files with no reviewable changes (2)
  • api/src/main/scala/ai/chronon/api/planner/PartitionSpecWithColumn.scala
  • spark/src/main/scala/ai/chronon/spark/catalog/TableUtils.scala
✅ Files skipped from review due to trivial changes (1)
  • spark/src/main/scala/ai/chronon/spark/batch/MergeJob.scala
🚧 Files skipped from review as they are similar to previous changes (7)
  • api/src/main/scala/ai/chronon/api/planner/NodeRunner.scala
  • api/src/main/scala/ai/chronon/api/planner/Planner.scala
  • api/src/main/scala/ai/chronon/api/planner/GroupByOfflinePlanner.scala
  • api/src/main/scala/ai/chronon/api/planner/JoinOfflinePlanner.scala
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala
  • api/src/main/scala/ai/chronon/api/planner/TableDependencies.scala
  • api/src/main/scala/ai/chronon/api/planner/MetaDataUtils.scala
⏰ Context from checks skipped due to timeout of 90000ms (31)
  • GitHub Check: service_tests
  • GitHub Check: online_tests
  • GitHub Check: cloud_aws_tests
  • GitHub Check: flink_tests
  • GitHub Check: service_tests
  • GitHub Check: cloud_gcp_tests
  • GitHub Check: service_commons_tests
  • GitHub Check: streaming_tests
  • GitHub Check: cloud_aws_tests
  • GitHub Check: cloud_gcp_tests
  • GitHub Check: analyzer_tests
  • GitHub Check: api_tests
  • GitHub Check: api_tests
  • GitHub Check: streaming_tests
  • GitHub Check: join_tests
  • GitHub Check: aggregator_tests
  • GitHub Check: online_tests
  • GitHub Check: spark_tests
  • GitHub Check: groupby_tests
  • GitHub Check: aggregator_tests
  • GitHub Check: fetcher_tests
  • GitHub Check: fetcher_tests
  • GitHub Check: flink_tests
  • GitHub Check: join_tests
  • GitHub Check: batch_tests
  • GitHub Check: groupby_tests
  • GitHub Check: spark_tests
  • GitHub Check: batch_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: analyzer_tests
  • GitHub Check: enforce_triggered_workflows


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
api/src/main/scala/ai/chronon/api/planner/PlanNode.scala (1)

46-46: Improved parameter naming

Parameter renamed from cons to confs for better clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 30f5356 and 4683d8c.

📒 Files selected for processing (13)
  • api/src/main/scala/ai/chronon/api/planner/GroupByOfflinePlanner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/JoinOfflinePlanner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/MetaDataUtils.scala (3 hunks)
  • api/src/main/scala/ai/chronon/api/planner/NodeRunner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/PartitionSpecWithColumn.scala (0 hunks)
  • api/src/main/scala/ai/chronon/api/planner/PlanNode.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/Planner.scala (1 hunks)
  • api/src/main/scala/ai/chronon/api/planner/TableDependencies.scala (2 hunks)
  • api/thrift/orchestration.thrift (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Driver.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/batch/MergeJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/catalog/TableUtils.scala (0 hunks)
💤 Files with no reviewable changes (2)
  • api/src/main/scala/ai/chronon/api/planner/PartitionSpecWithColumn.scala
  • spark/src/main/scala/ai/chronon/spark/catalog/TableUtils.scala
🧰 Additional context used
🧬 Code Graph Analysis (1)
spark/src/main/scala/ai/chronon/spark/Driver.scala (1)
api/src/main/scala/ai/chronon/api/planner/RelevantLeftForJoinPart.scala (2)
  • RelevantLeftForJoinPart (16-24)
  • RelevantLeftForJoinPart (26-92)
⏰ Context from checks skipped due to timeout of 90000ms (30)
  • GitHub Check: service_commons_tests
  • GitHub Check: service_tests
  • GitHub Check: streaming_tests
  • GitHub Check: service_tests
  • GitHub Check: streaming_tests
  • GitHub Check: cloud_gcp_tests
  • GitHub Check: join_tests
  • GitHub Check: cloud_aws_tests
  • GitHub Check: analyzer_tests
  • GitHub Check: cloud_aws_tests
  • GitHub Check: groupby_tests
  • GitHub Check: online_tests
  • GitHub Check: spark_tests
  • GitHub Check: online_tests
  • GitHub Check: fetcher_tests
  • GitHub Check: groupby_tests
  • GitHub Check: flink_tests
  • GitHub Check: flink_tests
  • GitHub Check: analyzer_tests
  • GitHub Check: cloud_gcp_tests
  • GitHub Check: batch_tests
  • GitHub Check: api_tests
  • GitHub Check: spark_tests
  • GitHub Check: api_tests
  • GitHub Check: join_tests
  • GitHub Check: aggregator_tests
  • GitHub Check: batch_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: aggregator_tests
  • GitHub Check: fetcher_tests
🔇 Additional comments (18)
spark/src/main/scala/ai/chronon/spark/Driver.scala (1)

22-22: Simplified import statement

Import statement updated to remove PartitionSpecWithColumn, aligning with broader refactoring to replace it with PartitionSpec throughout the codebase.

spark/src/main/scala/ai/chronon/spark/batch/MergeJob.scala (1)

5-5: Simplified import statement

Import statement updated to remove PartitionSpecWithColumn reference, consistent with codebase-wide refactoring.

api/thrift/orchestration.thrift (1)

265-265: Added LogicalType field to Conf struct

New optional field logicalType added to classify configuration types.

api/src/main/scala/ai/chronon/api/planner/Planner.scala (2)

3-3: Import updated to use PartitionSpec instead of PartitionSpecWithColumn.

Correctly replaces the import with the new simplified type.


7-7: Parameter type updated from PartitionSpecWithColumn to PartitionSpec.

Parameter type in the abstract Planner class correctly updated to match the codebase-wide refactoring.

api/src/main/scala/ai/chronon/api/planner/GroupByOfflinePlanner.scala (2)

3-3: Updated imports to use PartitionSpec.

Import statement correctly updated to include PartitionSpec.


8-8: Updated constructor parameter type to PartitionSpec.

Parameter type correctly changed to PartitionSpec to align with the refactoring.

api/src/main/scala/ai/chronon/api/planner/NodeRunner.scala (2)

3-3: Updated imports to include PartitionSpec.

Import statement correctly updated.


7-7: Renamed method and changed return type to PartitionSpec.

Both the method name and return type properly updated as part of the refactoring.

api/src/main/scala/ai/chronon/api/planner/JoinOfflinePlanner.scala (1)

14-14: Updated parameter type to PartitionSpec.

Constructor parameter type correctly changed to match the codebase-wide refactoring.

spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (2)

535-535: Updated to use partitionSpec instead of outputPartitionSpec.

This change aligns with removing PartitionSpecWithColumn throughout the codebase.


539-539: Updated to use partitionSpec instead of outputPartitionSpec.

Consistent with the change in line 535 and aligns with the broader refactoring effort.

api/src/main/scala/ai/chronon/api/planner/TableDependencies.scala (3)

5-5: Updated imports to include PartitionSpec.

Necessary change to support the new type structure.


9-9: Changed implicit parameter type from PartitionSpecWithColumn to PartitionSpec.

Part of the refactoring to simplify partition specification handling.


63-65: Replaced dynamic partition column with hardcoded "dt" and simplified access to format and interval.

The partition column is now hardcoded as "dt" instead of being derived from a wrapper type, consistent with refactoring approach.

api/src/main/scala/ai/chronon/api/planner/MetaDataUtils.scala (3)

4-13: Updated imports for cleaner organization.

Expanded imports to include specific entities needed after refactoring.


34-34: Changed implicit parameter type from PartitionSpecWithColumn to PartitionSpec.

Consistent with other changes in the codebase to simplify partition specification.


59-61: Simplified partition specification handling.

Hardcoded partition column to "dt" and directly accessing fields from partitionSpec, consistent with the overall refactoring approach.
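
To make the trade-off behind the hardcoded "dt" concrete, here is a small sketch contrasting it with a column read from the spec. Everything in it is an assumption for illustration: `SpecWithColumnField`, `PartitionMeta`, and `toHours` are hypothetical stand-ins (the PartitionSpec in this PR is only known to expose `format` and `spanMillis`, and `toHours` merely approximates what `WindowUtils.hours` is used for in the reviewed snippet).

```scala
object PartitionColumnSketch {
  // Hypothetical spec that also carries its column name.
  final case class SpecWithColumnField(column: String, format: String, spanMillis: Long)

  // Simplified stand-in for the metadata populated via setPartitionColumn,
  // setPartitionFormat, and setPartitionInterval.
  final case class PartitionMeta(column: String, format: String, intervalHours: Long)

  // Converts a span in milliseconds to whole hours.
  private def toHours(spanMillis: Long): Long = spanMillis / (60L * 60 * 1000)

  // Mirrors the reviewed diff: the column is fixed to "dt" regardless of the spec.
  def hardcoded(spec: SpecWithColumnField): PartitionMeta =
    PartitionMeta("dt", spec.format, toHours(spec.spanMillis))

  // Alternative: take the column from the spec itself.
  def derived(spec: SpecWithColumnField): PartitionMeta =
    PartitionMeta(spec.column, spec.format, toHours(spec.spanMillis))
}
```

For a table partitioned on a column other than "dt" (say "ds"), `hardcoded` would report "dt" while `derived` would report "ds"; for tables that already use "dt" the two agree.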

@nikhil-zlai nikhil-zlai left a comment


why not use the partition column instead of hardcoded dt everywhere?

- .setPartitionColumn(partitionSpecWithColumn.partitionColumn)
- .setPartitionFormat(partitionSpecWithColumn.partitionSpec.format)
- .setPartitionInterval(WindowUtils.hours(partitionSpecWithColumn.partitionSpec.spanMillis))
+ .setPartitionColumn("dt")

fix this?

@varant-zlai varant-zlai force-pushed the vz/replace_partition_spec_with_column branch from de54d58 to f0308e9 Compare May 15, 2025 20:54
- .setPartitionColumn(specWithColumn.partitionColumn)
- .setPartitionFormat(specWithColumn.partitionSpec.format)
- .setPartitionInterval(WindowUtils.hours(specWithColumn.partitionSpec.spanMillis))
+ .setPartitionColumn("dt")

this one?

@varant-zlai varant-zlai merged commit 7413cde into main May 15, 2025
34 checks passed
@varant-zlai varant-zlai deleted the vz/replace_partition_spec_with_column branch May 15, 2025 21:58
chewy-zlai pushed a commit that referenced this pull request May 16, 2025
## Summary

Replace PartitionSpecWithColumn with PartitionSpec.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Refactor**
- Simplified partition specification handling across planners and
utilities by removing the custom partition spec wrapper and
standardizing on a single partition spec type.
- Updated related methods and class constructors to use the new
partition spec approach, streamlining partition metadata access.
- Removed unused fields and imports related to the old partition spec
wrapper.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
@coderabbitai coderabbitai bot mentioned this pull request May 16, 2025
4 tasks
chewy-zlai pushed a commit that referenced this pull request May 16, 2025