chore: slim down TableUtils #719
Walkthrough
This set of changes removes several partition-related utility methods (
Changes
Sequence Diagram(s)
sequenceDiagram
participant Driver
participant TableUtils
Driver->>TableUtils: tableReachable(tbl)
alt Table is reachable
Driver->>TableUtils: partitions(tbl, spec.tail.toMap, partitionColumnName = spec.head._1)
TableUtils-->>Driver: List of partitions
Driver->>Driver: Check if partitions list is non-empty
else Table is not reachable
Driver->>Driver: Log and treat partition as absent
end
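The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code in `Driver.scala`: the `PartitionCatalog` trait, the `PartitionGuard` object, the parameter name `subPartitionsFilter`, and the `List[String]` return type are assumptions made for the example. Only the call shapes `tableReachable(tbl)` and `partitions(tbl, spec.tail.toMap, partitionColumnName = spec.head._1)` come from the diagram.

```scala
// Hypothetical stand-in for the two TableUtils calls shown in the diagram.
// The real signatures in the repository may differ.
trait PartitionCatalog {
  def tableReachable(table: String): Boolean
  def partitions(table: String,
                 subPartitionsFilter: Map[String, String],
                 partitionColumnName: String): List[String]
}

object PartitionGuard {
  // spec is an ordered list of (column, value) pairs. The head is treated as
  // the main partition column and the tail as sub-partition filters.
  def containsSpec(catalog: PartitionCatalog,
                   tbl: String,
                   spec: List[(String, String)]): Boolean =
    spec match {
      case (partitionColumn, _) :: subPartitions if catalog.tableReachable(tbl) =>
        // Table is reachable: the partition counts as present when the
        // filtered partition list is non-empty.
        catalog
          .partitions(tbl, subPartitions.toMap, partitionColumnName = partitionColumn)
          .nonEmpty
      case _ =>
        // Table is not reachable (or the spec is empty): log and treat as absent.
        println(s"Table $tbl is unreachable or spec is empty; treating partition as absent")
        false
    }
}
```

Guarding on reachability first means a missing table yields `false` rather than whatever error the partition lookup would raise.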
8269e31 to 5873089
Actionable comments posted: 0
🧹 Nitpick comments (1)
spark/src/main/scala/ai/chronon/spark/Driver.scala (1)
1016-1022: Improved partition checking with reachability guard
Enhanced implementation that first verifies if the table exists before attempting to check partitions.
Consider adding a comment explaining the reachability check's purpose to improve code maintainability.
+ // Check if table exists before attempting to retrieve partitions
  val containsSpec = if (tableUtils.tableReachable(tbl)) {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (8)
- cloud_gcp/src/test/scala/ai/chronon/integrations/cloud_gcp/BigQueryCatalogTest.scala (3 hunks)
- spark/src/main/scala/ai/chronon/spark/Driver.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (0 hunks)
- spark/src/main/scala/ai/chronon/spark/LabelJoin.scala (0 hunks)
- spark/src/main/scala/ai/chronon/spark/catalog/Format.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/catalog/Hive.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/catalog/TableUtils.scala (0 hunks)
- spark/src/test/scala/ai/chronon/spark/test/TableUtilsTest.scala (0 hunks)
💤 Files with no reviewable changes (4)
- spark/src/main/scala/ai/chronon/spark/LabelJoin.scala
- spark/src/test/scala/ai/chronon/spark/test/TableUtilsTest.scala
- spark/src/main/scala/ai/chronon/spark/JoinUtils.scala
- spark/src/main/scala/ai/chronon/spark/catalog/TableUtils.scala
⏰ Context from checks skipped due to timeout of 90000ms (24)
- GitHub Check: streaming_tests
- GitHub Check: groupby_tests
- GitHub Check: fetcher_tests
- GitHub Check: service_commons_tests
- GitHub Check: analyzer_tests
- GitHub Check: cloud_gcp_tests
- GitHub Check: batch_tests
- GitHub Check: cloud_aws_tests
- GitHub Check: spark_tests
- GitHub Check: service_tests
- GitHub Check: streaming_tests
- GitHub Check: online_tests
- GitHub Check: cloud_aws_tests
- GitHub Check: api_tests
- GitHub Check: service_tests
- GitHub Check: aggregator_tests
- GitHub Check: api_tests
- GitHub Check: flink_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: cloud_gcp_tests
- GitHub Check: online_tests
- GitHub Check: aggregator_tests
- GitHub Check: flink_tests
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (6)
cloud_gcp/src/test/scala/ai/chronon/integrations/cloud_gcp/BigQueryCatalogTest.scala (3)
7-7: Consolidated import statements - good cleanup
Combining related Google Cloud Hadoop imports improves readability.
104-104: Updated to use `partitions` method instead of removed `allPartitions`
Correctly adapted test to use the remaining partition retrieval method.
114-114: Updated to use `partitions` method instead of removed `allPartitions`
Correctly adapted test to use the remaining partition retrieval method.
spark/src/main/scala/ai/chronon/spark/catalog/Format.scala (2)
90-90: Changed return type to preserve order and allow duplicates
Method now returns `List[(String, String)]` instead of `Map[String, String]` to maintain the original order of partition components.
97-97: Updated to return list instead of map
Changed `.toMap` to `.toList` to align with the new return type.
spark/src/main/scala/ai/chronon/spark/catalog/Hive.scala (1)
23-23: Added explicit map conversion to handle the changed return type
Added `.toMap` to convert the list of tuples returned by `Format.parseHiveStylePartition` to a map, maintaining compatibility with existing code.
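To illustrate why the return type matters, here is a small sketch of parsing a Hive-style partition string into ordered pairs. It is not the body of `Format.parseHiveStylePartition`; the object name `HivePartitionSketch`, the `parse` helper, and the choice to skip malformed components are assumptions for the example.

```scala
object HivePartitionSketch {
  // Parses a string such as "ds=2024-01-01/hr=00" into ordered (key, value) pairs.
  // Components without an '=' are skipped in this sketch.
  def parse(partition: String): List[(String, String)] =
    partition
      .split("/")
      .toList
      .flatMap { component =>
        component.split("=", 2) match {
          case Array(key, value) => Some(key -> value)
          case _                 => None
        }
      }
}

object HivePartitionSketchDemo extends App {
  val parsed = HivePartitionSketch.parse("ds=2024-01-01/hr=00")
  // The list keeps components in their original order.
  println(parsed)
  // Callers that still expect a map can convert explicitly, as Hive.scala does with .toMap.
  println(parsed.toMap)
  // With a repeated key the list keeps both entries, while .toMap keeps only the last one.
  println(HivePartitionSketch.parse("ds=a/ds=b"))
  println(HivePartitionSketch.parse("ds=a/ds=b").toMap)
}
```

A caller that needs the first component as the main partition column can take `parsed.head`, which a `Map` would not guarantee.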
cc @david-zlai PTAL, I did run the integration tests to make sure the
## Summary
- Remove latest label view since it depends on some partition methods
that are lightly used. We don't use this Label Join anymore anyway, so
it's fine to deprecate.
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Removed Features**
  - Removed support for creating and managing "latest label" views and
    their associated mapping logic.
  - Eliminated utility methods for checking and retrieving all table
    partitions.
- **Bug Fixes**
  - Improved partition presence checks to include table reachability and
    more explicit partition retrieval.
- **Breaking Changes**
  - Updated the return type of partition parsing to preserve order and
    allow duplicate keys.
- **Tests**
  - Removed tests related to partition utilities and latest label mapping.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: thomaschow <[email protected]>