# WIP -- Not ready to merge -- Spark refactor #311
… elastic search. (#65)

## Summary
Adds a temporal service with temporal admin tools, a temporal ui, and elastic search to the Docker PoC setup.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [x] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Introduced new services: MySQL, Temporal, Temporal Admin Tools, and Temporal UI.
  - Added a new network, Temporal Network, to enhance service communication.
- **Changes**
  - Adjusted port mapping for the Spark service to improve accessibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
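For orientation, the services listed above might be wired together roughly like the following compose fragment. This is a sketch with assumed image tags, ports, and environment values, not the PR's actual file:

```yaml
services:
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: root   # placeholder credential
    networks: [temporal-network]

  elasticsearch:
    image: elasticsearch:7.17.10
    environment:
      - discovery.type=single-node
    networks: [temporal-network]

  temporal:
    image: temporalio/auto-setup:latest
    depends_on: [mysql, elasticsearch]
    ports: ["7233:7233"]          # gRPC frontend
    networks: [temporal-network]

  temporal-admin-tools:
    image: temporalio/admin-tools:latest
    depends_on: [temporal]
    networks: [temporal-network]

  temporal-ui:
    image: temporalio/ui:latest
    depends_on: [temporal]
    ports: ["8080:8080"]
    networks: [temporal-network]

networks:
  temporal-network:
    driver: bridge
```

The dedicated `temporal-network` keeps the Temporal services addressable by name without exposing them on the default network.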
## Summary
- https://app.asana.com/0/1208785567265389/1208812512114700
- This PR addresses some flaky unit test behavior that we've been observing in the zipline fork. See: https://zipline-2kh4520.slack.com/archives/C072LUA50KA/p1732043073171339?thread_ts=1732042778.209419&cid=C072LUA50KA
- A previous [CI run](https://github.com/zipline-ai/chronon/actions/runs/11946764068/job/33301642119?pr=72) shows that `other_spark_tests` fails intermittently for a couple of reasons. This PR addresses the flakiness of [FeatureWithLabelJoinTest.testFinalViewsWithAggLabel](https://github.com/zipline-ai/chronon/blob/6cb6273551e024d6eecb068f754b510ae0aac464/spark/src/test/scala/ai/chronon/spark/test/FeatureWithLabelJoinTest.scala#L118), where the test assertion sometimes fails with an unexpected result value.

### Synopsis
It looks like a rewrite/refactoring of this code did not preserve its behavior. The divergence appears when computing label joins per partition range, specifically when we materialize the label join and [scan it back](https://github.com/zipline-ai/chronon/blob/b64f44d57c90367ccfcb5d5c96327a1ef820e2b3/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L200). In the OSS version, the [scan](https://github.com/airbnb/chronon/blob/6968c5c29b6e48867f8c08f2b9b8281f09d47c16/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L192-L193) applies a [partition filter](https://github.com/airbnb/chronon/blob/6968c5c29b6e48867f8c08f2b9b8281f09d47c16/spark/src/main/scala/ai/chronon/spark/DataRange.scala#L102-L104). We dropped these partition filters during the [refactoring](c6a377c#diff-57b1d6132977475fa0e87a71f017e66f4a7c94f466f911b33e9178598c6c058dL97-R102) on the Zipline side.

As such, the physical plans produced by these two scans differ:

```
// Zipline
== Physical Plan ==
*(1) ColumnarToRow
+- FileScan parquet spark_catalog.final_join.label_agg_table_listing_labels_agg[listing#53934L,is_active_max_5d#53935,label_ds#53936] Batched: true, DataFilters: [], Format: Parquet, Location: CatalogFileIndex(1 paths)[file:/tmp/chronon/spark-warehouse_6fcd3d/data/final_join.db/label_agg_t..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<listing:bigint,is_active_max_5d:int>
```

```
// OSS
== Physical Plan ==
Coalesce 1000
+- *(1) ColumnarToRow
   +- FileScan parquet final_join_xggqlu.label_agg_table_listing_labels_agg[listing#50981L,is_active_max_5d#50982,label_ds#50983] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/tmp/chronon/spark-warehouse_69002f/data/final_join_xggqlu.db/label_agg_ta..., PartitionFilters: [isnotnull(label_ds#50983), (label_ds#50983 >= 2022-10-07), (label_ds#50983 <= 2022-10-07)], PushedFilters: [], ReadSchema: struct<listing:bigint,is_active_max_5d:int>
```

Note that OSS has a non-empty partition filter, `PartitionFilters: [isnotnull(label_ds#50983), (label_ds#50983 >= 2022-10-07), (label_ds#50983 <= 2022-10-07)]`, where Zipline does not. The fix is to add these partition filters back, as done in this PR.

~### Abandoned Investigation~

~It looks like there is some non-determinism computing one of the intermediate dataframes when computing label joins. [`dropDuplicates`](https://github.com/zipline-ai/chronon/blob/6cb6273551e024d6eecb068f754b510ae0aac464/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L215) seems to be operating on a row compound key `rowIdentifier`, which doesn't produce deterministic results. As such, we sometimes lose the expected values. This [change](https://github.com/airbnb/chronon/pull/380/files#diff-2c74cac973e1af38b615f654fee5b0261594a2b0005ecfd5a8f0941b8e348eedR156) was introduced in OSS upstream almost 2 years ago. This [test](airbnb/chronon#435) was contributed a couple of months later.~

~See the debugger local values comparison. The left side is a test failure, and the right side is a test success.~

~<img width="1074" alt="Screenshot 2024-11-21 at 9 26 04 AM" src="https://github.com/user-attachments/assets/0eba555c-43ab-48a6-bf61-bbb7b4fa2445">~

~Removing the `dropDuplicates` call will allow the tests to pass. However, it is unclear whether this produces the semantically correct behavior, as the tests themselves seem~

## Checklist
- [x] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Reintroduced a testing method to validate label joins, ensuring accuracy in data processing.
- **Improvements**
  - Enhanced data retrieval logic for label joins, emphasizing unique entries and clearer range specifications.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
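To make the dropped predicate concrete, here is a minimal Python sketch (not the project's Scala code) of the filter the OSS plan applies, `isnotnull(label_ds) AND label_ds >= start AND label_ds <= end`, for a single-day partition range:

```python
def partition_filter(label_ds, start="2022-10-07", end="2022-10-07"):
    """Mirror of the OSS PartitionFilters: isnotnull AND >= start AND <= end.

    ISO-8601 date strings compare correctly as plain strings, which is why
    the day-range filter can be expressed with lexicographic comparisons.
    """
    return label_ds is not None and start <= label_ds <= end

# Partitions outside the requested range (and null partitions) are pruned.
partitions = ["2022-10-06", "2022-10-07", "2022-10-08", None]
kept = [p for p in partitions if partition_filter(p)]  # ["2022-10-07"]
```

Without this predicate, the scan reads every partition of the materialized table, which is how stale rows from other ranges leak into the test's result set.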
## Summary
- Load local resources irrespective of where the tests are currently being run from. This allows us to run them from IntelliJ.

## Checklist
- [x] Added Unit Tests
- [x] Covered by existing CI
- [x] Integration tested
- [x] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **Bug Fixes**
  - Improved test robustness by replacing hardcoded file paths with dynamic resource URI retrieval for loading test data.
- **Tests**
  - Enhanced flexibility in test cases for locating resources, ensuring consistent access regardless of the working directory.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
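The idea can be illustrated with a short Python sketch (the PR itself does the analogous thing in Scala by asking the classloader for the resource URI): resolve test resources relative to the test file rather than the process working directory, so the lookup works no matter where the runner was launched:

```python
from pathlib import Path

def resource_path(name: str) -> Path:
    """Resolve a test resource next to this file instead of relying on the CWD.

    The "resources" directory name is a hypothetical layout for illustration.
    """
    return Path(__file__).resolve().parent / "resources" / name

# The result is absolute, so it is stable whether tests run from sbt,
# IntelliJ, or any other working directory.
p = resource_path("test_data.csv")
```

A hardcoded relative path like `"spark/src/test/resources/test_data.csv"` only works when the runner starts at the repo root; the anchored form above does not care.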
## Summary
Switches to create-summary-dataset, and provides conf-path to summarize-and-upload.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Updated command for creating summary datasets to improve clarity and functionality.
  - Enhanced configuration handling for summary data uploads.
- **Bug Fixes**
  - Maintained consistent error handling to ensure reliability during execution.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
- Some of the `logger.info` invocations weren't happening. That's mainly because we don't have a logger implementation specified, at least in the spark build. This PR adds a `logback` implementation and a basic configuration.
- This will allow us to see log messages on the command line. I tested this with: `sbt "testOnly ai.chronon.spark.test.FeatureWithLabelJoinTest" | grep "== Features DF =="`
- I also verified that the dependency tree shows the new logger deps are present:

```
sbt spark/dependencyTree
[info] welcome to sbt 1.8.2 (Oracle Corporation Java 17.0.2)
[info] loading settings for project chronon-build from plugins.sbt ...
[info] loading project definition from /Users/thomaschow/zipline-ai/chronon/project
[info] loading settings for project root from build.sbt,version.sbt ...
[info] resolving key references (13698 settings) ...
[info] set current project to chronon (in build file:/Users/thomaschow/zipline-ai/chronon/)
[info] spark:spark_2.12:0.1.0-SNAPSHOT [S]
[info] +-aggregator:aggregator_2.12:0.1.0-SNAPSHOT [S]
[info] | +-api:api_2.12:0.1.0-SNAPSHOT [S]
[info] | | +-org.scala-lang.modules:scala-collection-compat_2.12:2.11.0 [S]
[info] | | +-org.scala-lang:scala-reflect:2.12.18 [S]
[info] | |
[info] | +-com.google.code.gson:gson:2.10.1
[info] | +-org.apache.datasketches:datasketches-java:6.1.0
[info] | +-org.apache.datasketches:datasketches-memory:3.0.1
[info] |
[info] +-ch.qos.logback:logback-classic:1.2.11
[info] | +-ch.qos.logback:logback-core:1.2.11
[info] | +-org.slf4j:slf4j-api:1.7.36
[info] |
[info] +-com.google.guava:guava:33.3.1-jre
[info] | +-com.google.code.findbugs:jsr305:3.0.2
[info] | +-com.google.errorprone:error_prone_annotations:2.28.0
[info] | +-com.google.guava:failureaccess:1.0.2
[info] | +-com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-gu..
[info] | +-com.google.j2objc:j2objc-annotations:3.0.0
[info] | +-org.checkerframework:checker-qual:3.43.0
[info] |
[info] +-jakarta.servlet:jakarta.servlet-api:4.0.3
[info] +-online:online_2.12:0.1.0-SNAPSHOT [S]
[info] +-aggregator:aggregator_2.12:0.1.0-SNAPSHOT [S]
[info] | +-api:api_2.12:0.1.0-SNAPSHOT [S]
[info] | | +-org.scala-lang.modules:scala-collection-compat_2.12:2.11.0 [S]
[info] | | +-org.scala-lang:scala-reflect:2.12.18 [S]
[info] | |
[info] | +-com.google.code.gson:gson:2.10.1
[info] | +-org.apache.datasketches:datasketches-java:6.1.0
[info] | +-org.apache.datasketches:datasketches-memory:3.0.1
[info] |
[info] +-com.datadoghq:java-dogstatsd-client:4.4.1
[info] | +-com.github.jnr:jnr-unixsocket:0.36
[info] | +-com.github.jnr:jnr-constants:0.9.17
[info] | +-com.github.jnr:jnr-enxio:0.30
[info] | | +-com.github.jnr:jnr-constants:0.9.17
[info] | | +-com.github.jnr:jnr-ffi:2.1.16
[info] | | +-com.github.jnr:jffi:1.2.23
[info] | | +-com.github.jnr:jnr-a64asm:1.0.0
[info] | | +-com.github.jnr:jnr-x86asm:1.0.2
[info] | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-commons:7.1
[info] | | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-util:7.1
[info] | | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm:7.1
[info] | |
[info] | +-com.github.jnr:jnr-ffi:2.1.16
[info] | | +-com.github.jnr:jffi:1.2.23
[info] | | +-com.github.jnr:jnr-a64asm:1.0.0
[info] | | +-com.github.jnr:jnr-x86asm:1.0.2
[info] | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-commons:7.1
[info] | | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-util:7.1
[info] | | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | | +-org.ow2.asm:asm:7.1
[info] | | | |
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm:7.1
[info] | |
[info] | +-com.github.jnr:jnr-posix:3.0.61
[info] | +-com.github.jnr:jnr-constants:0.9.17
[info] | +-com.github.jnr:jnr-ffi:2.1.16
[info] | +-com.github.jnr:jffi:1.2.23
[info] | +-com.github.jnr:jnr-a64asm:1.0.0
[info] | +-com.github.jnr:jnr-x86asm:1.0.2
[info] | +-org.ow2.asm:asm-analysis:7.1
[info] | | +-org.ow2.asm:asm-tree:7.1
[info] | | +-org.ow2.asm:asm:7.1
[info] | |
[info] | +-org.ow2.asm:asm-commons:7.1
[info] | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm:7.1
[info] | |
[info] | +-org.ow2.asm:asm-tree:7.1
[info] | | +-org.ow2.asm:asm:7.1
[info] | |
[info] | +-org.ow2.asm:asm-util:7.1
[info] | | +-org.ow2.asm:asm-analysis:7.1
[info] | | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm-tree:7.1
[info] | | | +-org.ow2.asm:asm:7.1
[info] | | |
[info] | | +-org.ow2.asm:asm:7.1
[info] | |
[info] | +-org.ow2.asm:asm:7.1
[info] |
[info] +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info] +-com.fasterxml.jackson.core:jackson-databind:2.15.2
[info] | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2
[info] | +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info] |
[info] +-com.fasterxml.jackson.module:jackson-module-scala_2.12:2.15.2 [S]
[info] | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2
[info] | +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info] | +-com.fasterxml.jackson.core:jackson-databind:2.15.2
[info] | | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2
[info] | | +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info] | |
[info] | +-com.thoughtworks.paranamer:paranamer:2.8
[info] |
[info] +-com.github.ben-manes.caffeine:caffeine:3.1.8
[info] | +-com.google.errorprone:error_prone_annotations:2.21.1 (evicted by: 2.28..
[info] | +-com.google.errorprone:error_prone_annotations:2.28.0
[info] | +-org.checkerframework:checker-qual:3.37.0 (evicted by: 3.43.0)
[info] | +-org.checkerframework:checker-qual:3.43.0
[info] |
[info] +-net.jodah:typetools:0.6.3
[info] +-org.rogach:scallop_2.12:5.1.0 [S]
[info] +-org.scala-lang.modules:scala-java8-compat_2.12:1.0.2 [S]
```

- Additional steps are required for IntelliJ to behave the same way. I needed to configure the classpath `-cp chronon.spark` in the run configuration: <img width="953" alt="Screenshot 2024-11-21 at 3 34 50 PM" src="https://github.com/user-attachments/assets/aebbc466-a207-43d0-9f6f-a9bfa811eb66"> and the same for `ScalaTest`.
I updated the local setup doc to reflect this: https://docs.google.com/document/d/1k9_aQ3tkW5wvzKyXSsWWPK6HZxX4t8zVPVThODpZqQs/edit?tab=t.0#heading=h.en6opahtqp7u

## Checklist
- [x] Added Unit Tests
- [x] Covered by existing CI
- [x] Integration tested
- [x] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Introduced a new logging configuration with a `logback.xml` file for enhanced logging capabilities.
  - Added support for overriding dependencies in the build configuration for improved dependency management.
- **Bug Fixes**
  - Ensured consistent logging library versions across the project to avoid potential conflicts.
- **Chores**
  - Streamlined dependency declarations for better organization within the build configuration.
  - Improved logging feedback during the build process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
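For reference, a basic console-only logback configuration of the kind described above might look like this. This is a hedged sketch; the actual `logback.xml` added by the PR may differ:

```xml
<configuration>
  <!-- Route everything to stdout so `sbt testOnly ... | grep ...` can see log lines -->
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```

With `logback-classic` on the classpath, slf4j binds to it automatically; without any binding, `logger.info` calls are silently dropped, which matches the symptom described in the summary.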
## Summary
Hopefully the last massive PR, since we have a lot of the baseline implemented here. Here is a [video walkthrough](https://drive.google.com/file/d/1gWpBHD7sDt2Kz7net73w1H-lY8rzVt3p/view?usp=sharing). The main changes are:
- Figma match (left sidebar, models table, observability page)
- Geist font is used in ECharts
- Geist font has proper smoothness to match Figma
- Custom tooltip treatment below chart on hover (hold cmd to lock and keep it open)
- Drill-down charts using sample data

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
## Release Notes
- **New Features**
  - Introduced new components: `ActionButtons`, `TrueFalseBadge`, `CustomEChartLegend`, `EChartTooltip`, and `InfoTooltip`.
  - Enhanced date range selection with new options and improved handling.
  - Added custom tooltip functionality for charts, improving interactivity.
  - Implemented a new `load` function for redirecting users from the root path to `/models`.
- **Improvements**
  - Updated styling for various components, enhancing visual consistency and user experience.
  - Refactored navigation and layout components for better usability.
  - Enhanced chart interactions and visibility management in the model performance visualization.
  - Improved color management system with new CSS custom properties.
  - Updated font size and color configurations in Tailwind CSS for better customization.
- **Bug Fixes**
  - Corrected typos and improved variable naming for clarity.
- **Chores**
  - Updated dependencies and improved documentation for better maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Fixes https://github.com/zipline-ai/chronon/security/dependabot/5

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **Chores**
  - Updated the `package.json` to include an `overrides` section for the `cross-spawn` package, specifying version `^7.0.6`.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
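The `overrides` entry described above would look roughly like this in `package.json` (sketch with surrounding fields elided; npm applies the override to every transitive occurrence of the package):

```json
{
  "overrides": {
    "cross-spawn": "^7.0.6"
  }
}
```

This forces every dependency that pulls in `cross-spawn` to resolve the patched `^7.0.6` line, which is how the Dependabot alert is remediated without waiting for upstream packages to bump their own ranges.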
## Summary
Ironing out a couple of bugs in drift metrics.

## Checklist
- [x] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Enhanced error handling for the percentiles metric to manage null values effectively.
  - Improved logging and filtering in the summarization process for better clarity and performance.
- **Bug Fixes**
  - Strengthened test assertions for drift and summary series to ensure data integrity and accuracy.
- **Tests**
  - Updated test logic to aggregate null counts and total entries, enhancing the robustness of the testing framework.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
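The null-handling idea for a percentiles metric can be sketched as follows (a Python illustration under assumed semantics, not this project's implementation): drop nulls before ranking, and return a sentinel when no values survive rather than failing:

```python
def percentile_ignore_nulls(values, q):
    """Nearest-rank percentile over the non-null values; None if none remain."""
    clean = sorted(v for v in values if v is not None)
    if not clean:
        return None  # all-null series: report "no value" instead of crashing
    # Clamp the rank index so q=1.0 maps to the last element.
    idx = min(int(q * len(clean)), len(clean) - 1)
    return clean[idx]
```

Tracking the null count alongside the total (as the test changes above describe) then lets a caller distinguish "no data at all" from "data present but heavily null".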
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **Chores**
  - Updated the `.gitignore` file to exclude Elasticsearch-related data from version control.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
This makes DYNAMO_ENDPOINT and AWS_DEFAULT_REGION optional values instead of required. It also allows the app and frontend to be on different IP addresses, managed by Kubernetes.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Enhanced handling of allowed hosts and CORS settings for better integration with various services.
- **Bug Fixes**
  - Improved flexibility in the initialization logic of the DynamoDbClient, allowing for optional environment variables without throwing immediate exceptions.
- **Chores**
  - Updated environment variable declarations in the Dockerfile for standardized syntax.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
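The optional-variable behavior can be sketched like this (Python; the function name and the fallback region are illustrative placeholders, not the service's actual code):

```python
import os

def dynamo_client_config():
    """Build client settings without requiring DYNAMO_ENDPOINT / AWS_DEFAULT_REGION.

    Missing variables fall back to defaults instead of raising at startup.
    The "us-west-2" default is a hypothetical placeholder.
    """
    config = {"region": os.environ.get("AWS_DEFAULT_REGION", "us-west-2")}
    endpoint = os.environ.get("DYNAMO_ENDPOINT")
    if endpoint:  # only pin an endpoint when one is explicitly provided
        config["endpoint_url"] = endpoint
    return config
```

Deferring the failure this way means a misconfigured variable surfaces on first use of the client rather than crashing the whole app at import time, which is friendlier in a Kubernetes deployment where the endpoint is often injected later.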
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
  - Enhanced command dialog functionality with a new message display for empty search results.
  - Improved user feedback when searches yield no results, providing a clearer indication of the outcome.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### Snyk has created this PR to fix 1 vulnerability in the pip dependencies of this project.

#### Snyk changed the following file(s):
- `docker-init/requirements.txt` (upgrades `aiohttp` from 3.8.6 to 3.10.11, fixing SNYK-PYTHON-AIOHTTP-8383923)

<details>
<summary>⚠️ <b>Warning</b></summary>

```
boto3 1.28.62 has requirement botocore<1.32.0,>=1.31.62, but you have botocore 1.33.13.
```

</details>

---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with your project.
> - Max score is 1000. Note that the real score may have changed since the PR was raised.
> - This PR was automatically created by Snyk using the credentials of a real user.
> - Snyk has automatically assigned this pull request, [set who gets assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration).
> - Some vulnerabilities couldn't be fully fixed, so Snyk will still find them when the project is tested again. This may be because the vulnerability existed within more than one direct dependency, but not all of the affected dependencies could be upgraded.

---

**Note:** _You are seeing this because you or someone else with access to this repository has authorized Snyk to open fix PRs._

For more information:
- 🧐 [View latest project report](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr)
- 👩‍💻 [Set who automatically gets assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration)
- 📜 [Customise PR templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
- 🛠 [Adjust project settings](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr/settings)
- 📚 [Read about Snyk's upgrade logic](https://support.snyk.io/hc/en-us/articles/360003891078-Snyk-patches-to-fix-vulnerabilities)

---

**Learn how to fix vulnerabilities with free interactive lessons:**
- 🦉 [Learn about this vulnerability in an interactive lesson of Snyk Learn.](https://learn.snyk.io/?loc=fix-pr)

Co-authored-by: snyk-bot <[email protected]>
Co-authored-by: tchow <[email protected]>
<h3>Snyk has created this PR to upgrade tailwind-merge from 2.5.3 to 2.5.4.</h3>

:information_source: Keep your dependencies up-to-date. This makes it easier to fix existing vulnerabilities and to more quickly identify and fix newly disclosed vulnerabilities when they affect your project.

<hr/>

- The recommended version is **4 versions** ahead of your current version.
- The recommended version was released **a month ago**.

<details>
<summary><b>Release notes</b></summary>
<details>
<summary>Package name: <b>tailwind-merge</b></summary>

- **2.5.4** - [2024-10-14](https://github.com/dcastil/tailwind-merge/releases/tag/v2.5.4)

  ### Bug Fixes
  - Fix incorrect paths within sourcemaps by [@dcastil](https://github.com/dcastil) in [dcastil/tailwind-merge#483](https://github.com/dcastil/tailwind-merge/pull/483)

  **Full Changelog**: [v2.5.3...v2.5.4](https://github.com/dcastil/tailwind-merge/compare/v2.5.3...v2.5.4)

  Thanks to [@brandonmcconnell](https://github.com/brandonmcconnell), [@manavm1990](https://github.com/manavm1990), [@langy](https://github.com/langy), [@jamesreaco](https://github.com/jamesreaco), [@roboflow](https://github.com/roboflow) and [@codecov](https://github.com/codecov) for sponsoring tailwind-merge! ❤️

- **2.5.4-dev.aac29dcdc25353cd05d708b8528c844a335ac25f** - 2024-10-20
- **2.5.4-dev.a57f245d6ae3ce80627d4546940972f6e140ead3** - 2024-10-14
- **2.5.4-dev.4dc0491f877f97cd5b9d7cc6d0bb87c385a0def8** - 2024-10-20
- **2.5.3** - [2024-10-03](https://github.com/dcastil/tailwind-merge/releases/tag/v2.5.3)

  ### Bug Fixes
  - Add missing logical border color properties by [@sherlockdoyle](https://github.com/sherlockdoyle) in [dcastil/tailwind-merge#478](https://github.com/dcastil/tailwind-merge/pull/478)

  ### Documentation
  - Add benchmark reporting to PRs and commits by [@XantreDev](https://github.com/XantreDev) in [dcastil/tailwind-merge#455](https://github.com/dcastil/tailwind-merge/pull/455)

  ### Other
  - Switch test suite to vitest by [@dcastil](https://github.com/dcastil) in [dcastil/tailwind-merge#461](https://github.com/dcastil/tailwind-merge/pull/461)

  **Full Changelog**: [v2.5.2...v2.5.3](https://github.com/dcastil/tailwind-merge/compare/v2.5.2...v2.5.3)

  Thanks to [@brandonmcconnell](https://github.com/brandonmcconnell), [@manavm1990](https://github.com/manavm1990), [@langy](https://github.com/langy), [@jamesreaco](https://github.com/jamesreaco), [@roboflow](https://github.com/roboflow), [@xeger](https://github.com/xeger) and [@MrDeatHHH](https://github.com/MrDeatHHH) for sponsoring tailwind-merge! ❤️

from [tailwind-merge GitHub release notes](https://github.com/dcastil/tailwind-merge/releases)
</details>
</details>

---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with your project.
> - This PR was automatically created by Snyk using the credentials of a real user.
> - Snyk has automatically assigned this pull request, [set who gets assigned](/settings/integration).
---

**Note:** _You are seeing this because you or someone else with access to this repository has authorized Snyk to open upgrade PRs._

**For more information:**

> - 🧐 [View latest project report](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5?utm_source=github&utm_medium=referral&page=upgrade-pr)
> - 👩‍💻 [Set who automatically gets assigned](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source=github&utm_medium=referral&page=upgrade-pr/)
> - 📜 [Customise PR templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
> - 🛠 [Adjust upgrade PR settings](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source=github&utm_medium=referral&page=upgrade-pr)
> - 🔕 [Ignore this dependency or unsubscribe from future upgrade PRs](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?pkg=tailwind-merge&utm_source=github&utm_medium=referral&page=upgrade-pr#auto-dep-upgrades)

Co-authored-by: snyk-bot <[email protected]> Co-authored-by: tchow <[email protected]>
## Summary I have moved everything from `/models/model_name` to `/joins/join_name`. I also created a shared entity object, and groupbys and joins now appear in search results. PR walkthrough video [here](https://drive.google.com/file/d/10lnso4MGXuXlmr5F-aLzDBwWncGBuEmt/view?usp=sharing) Limitations: - You can't click on a model or groupby from search - Backend search only queries models (so matches to joins or groupbys do not come up) ([details](https://github.com/zipline-ai/chronon/pull/82/files#r1855120136)) Future: - Removing anything not related to joins (model performance, skew, etc) ## Checklist - [x] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **New Features** - Enhanced navigation with dynamic filtering of entities in the NavigationBar. - Introduced a detailed table view for "Joins" displaying relevant model information. - **Bug Fixes** - Updated redirection from the root URL to the "Joins" page. - **Removals** - Removed outdated placeholder components for "GroupBys" and "Models" pages. These updates improve user navigation and provide a more informative interface for managing joins and models. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Ran into this strange, hard-to-reproduce error with a new version of Vite. When switching branches, the frontend occasionally fails to load on the local dev server. Kept getting this message: ``` The file does not exist at "/Users/kenmorton/Documents/Code/chronon/frontend/node_modules/.vite/deps/chunk-EPOQRJ6F.js?v=efc5098d" which is in the optimize deps directory. The dependency might be incompatible with the dep optimizer. Try adding it to `optimizeDeps.exclude`. (x11) ``` Found some folks running into the same thing; they recommended [this fix](vitejs/vite#17738 (comment)) and it seems to work so far. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Chores** - Updated dependency optimization settings to enhance build performance by excluding the `.vite` directory instead of `node_modules/.cache`. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
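For anyone hitting the same stale-cache error, here is a minimal sketch of the common workaround: deleting Vite's optimized-deps cache so it is rebuilt on the next dev-server start. The `FRONTEND_DIR` path is an assumption; point it at your checkout's frontend directory.

```shell
# Hedged workaround sketch: clear Vite's pre-bundled dependency cache.
# FRONTEND_DIR is an assumed path, not necessarily the repo layout.
FRONTEND_DIR="${FRONTEND_DIR:-frontend}"

# Remove the stale optimize-deps output; Vite regenerates it on next start.
rm -rf "$FRONTEND_DIR/node_modules/.vite"
echo "cleared $FRONTEND_DIR/node_modules/.vite"

# Then restart the dev server, e.g.:
# (cd "$FRONTEND_DIR" && npm run dev)
```

This is complementary to the `optimizeDeps.exclude` change the PR makes; clearing the cache fixes an already-corrupted state, while the config change aims to prevent it recurring.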
…rn group across 1 directory (#88)

Bumps the npm_and_yarn group with 1 update in the /frontend directory: [@sveltejs/kit](https://github.com/sveltejs/kit/tree/HEAD/packages/kit).

Updates `@sveltejs/kit` from 2.6.2 to 2.8.3

**Release notes** (sourced from [`@sveltejs/kit`'s releases](https://github.com/sveltejs/kit/releases)):

- **`@sveltejs/[email protected]`** (Patch Changes)
  - fix: ensure error messages are escaped ([#13050](https://github.com/sveltejs/kit/pull/13050))
  - fix: escape values included in dev 404 page ([#13039](https://github.com/sveltejs/kit/pull/13039))
- **`@sveltejs/[email protected]`** (Patch Changes)
  - fix: prevent duplicate fetch request when using Request with load function's fetch ([#13023](https://github.com/sveltejs/kit/pull/13023))
  - fix: do not override default cookie decoder to allow users to override the `cookie` library version ([#13037](https://github.com/sveltejs/kit/pull/13037))
- **`@sveltejs/[email protected]`** (Patch Changes)
  - fix: only add nonce to `script-src-elem`, `style-src-attr` and `style-src-elem` CSP directives when `unsafe-inline` is not present ([#11613](https://github.com/sveltejs/kit/pull/11613))
  - fix: support HTTP/2 in dev and production. Revert the changes from [#12907](https://github.com/sveltejs/kit/pull/12907) to downgrade HTTP/2 to TLS, as they are now unnecessary ([#12989](https://github.com/sveltejs/kit/pull/12989))
- **`@sveltejs/[email protected]`** (Minor Changes)
  - feat: add helper to identify `ActionFailure` objects ([#12878](https://github.com/sveltejs/kit/pull/12878))
- **`@sveltejs/[email protected]`** (Patch Changes)
  - fix: update link in JSDoc ([#12963](https://github.com/sveltejs/kit/pull/12963))
- **`@sveltejs/[email protected]`** (Patch Changes)
  - fix: update broken links in JSDoc ([#12960](https://github.com/sveltejs/kit/pull/12960))
- **`@sveltejs/[email protected]`** (Patch Changes)
  - fix: warn on invalid cookie name characters ([#12806](https://github.com/sveltejs/kit/pull/12806))
  - fix: when using `@vitejs/plugin-basic-ssl`, set a no-op proxy config to downgrade from HTTP/2 to TLS since `undici` does not yet enable HTTP/2 by default ([#12907](https://github.com/sveltejs/kit/pull/12907))
- **`@sveltejs/[email protected]`** (Patch Changes) ...
(truncated)

**Changelog**: sourced from [`@sveltejs/kit`'s CHANGELOG.md](https://github.com/sveltejs/kit/blob/main/packages/kit/CHANGELOG.md); the 2.8.3 through 2.7.5 entries repeat the release notes above verbatim (truncated).

**Commits**:

- [`429bfb7`](https://github.com/sveltejs/kit/commit/429bfb74fe823ea13a5fa0547dcf4cd6bb358a93) Version Packages ([#13049](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13049))
- [`134e363`](https://github.com/sveltejs/kit/commit/134e36343ef57ed7e6e2b3bb9e7f05ad37865794) fix: ensure error messages are escaped ([#13050](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13050))
- [`d338d46`](https://github.com/sveltejs/kit/commit/d338d4635a7fd947ba5112df6ee632c4a0979438) fix: escape values included in dev 404 page ([#13039](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13039))
- [`5f8399d`](https://github.com/sveltejs/kit/commit/5f8399d88fd9461a6111e03e6168067fba42e2c1) Version Packages ([#13024](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13024))
- [`1358ccc`](https://github.com/sveltejs/kit/commit/1358cccd52190df3c74bdd8970dbfb06ffc4ec72) fix: use default cookie decoder ([#13037](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13037))
- [`570562b`](https://github.com/sveltejs/kit/commit/570562b74d9e9f295d9b617478088a650f51e96b) fix: handle empty Headers when serialising Request passed to fetch ([#13023](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13023))
- [`435984b`](https://github.com/sveltejs/kit/commit/435984bf61b047d1e1a8efe88354ca7ac4e9109f) Version Packages ([#12992](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12992))
- [`0bd4426`](https://github.com/sveltejs/kit/commit/0bd4426944ce9995b86199900e39c8d3929fa2f2) fix: support custom servers using HTTP/2 in production ([#12989](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12989))
- [`6df00fc`](https://github.com/sveltejs/kit/commit/6df00fc8448bf72c91e8f6faee0605995b0fdd65) fix: csp nonce in `script-src-elem`, `style-src-attr` and `style-src-elem` wh...
- [`c717db9`](https://github.com/sveltejs/kit/commit/c717db91236c7ab15045b296c73201c6c6ecd6fa) chore: update playground and add an endpoint ([#12983](https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12983))
- Additional commits viewable in [compare view](https://github.com/sveltejs/kit/commits/@sveltejs/[email protected]/packages/kit)

[Dependabot compatibility score](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

**Dependabot commands and options**

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions

You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/zipline-ai/chronon/network/alerts).

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ken Morton <[email protected]>
### Snyk has created this PR to fix 9 vulnerabilities in the pip dependencies of this project.

#### Snyk changed the following file(s):

- `docker-init/requirements.txt`

**⚠️ Warning**

```
boto3 1.28.62 has requirement botocore<1.32.0,>=1.31.62, but you have botocore 1.33.13.
```

---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with your project.
> - Max score is 1000. Note that the real score may have changed since the PR was raised.
> - This PR was automatically created by Snyk using the credentials of a real user.
> - Snyk has automatically assigned this pull request, [set who gets assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration).
> - Some vulnerabilities couldn't be fully fixed and so Snyk will still find them when the project is tested again. This may be because the vulnerability existed within more than one direct dependency, but not all of the affected dependencies could be upgraded.
---

**Note:** _You are seeing this because you or someone else with access to this repository has authorized Snyk to open fix PRs._

For more information:

- 🧐 [View latest project report](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr)
- 👩‍💻 [Set who automatically gets assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration)
- 📜 [Customise PR templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
- 🛠 [Adjust project settings](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source=github&utm_medium=referral&page=fix-pr/settings)
- 📚 [Read about Snyk's upgrade logic](https://support.snyk.io/hc/en-us/articles/360003891078-Snyk-patches-to-fix-vulnerabilities)

---

**Learn how to fix vulnerabilities with free interactive lessons:**

- 🦉 [Improper Input Validation](https://learn.snyk.io/lesson/improper-input-validation/?loc=fix-pr)
- 🦉 [Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')](https://learn.snyk.io/lesson/directory-traversal/?loc=fix-pr)
- 🦉 [Cross-site Scripting (XSS)](https://learn.snyk.io/lesson/xss/?loc=fix-pr)

Co-authored-by: snyk-bot <[email protected]>
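To clear the `boto3`/`botocore` conflict the warning above flags, one option is to pin `botocore` into the range that `boto3 1.28.62` declares. A hedged sketch follows: the version range comes straight from the warning, but `/tmp/requirements-pinned.txt` is an illustrative path, not the repo's actual `docker-init/requirements.txt`.

```shell
# Hedged sketch: keep botocore inside the range boto3 1.28.62 declares
# (>=1.31.62,<1.32.0), then sanity-check the pins are present.
# The file path below is illustrative only.
cat > /tmp/requirements-pinned.txt <<'EOF'
boto3==1.28.62
botocore>=1.31.62,<1.32.0
EOF

grep 'botocore' /tmp/requirements-pinned.txt
```

After editing the real requirements file and reinstalling, `pip check` will report any resolver conflicts that remain.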
## Summary ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Documentation** - Updated Docker command syntax for clarity. - Added note on required Docker version (20.10 or higher). <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced logging configuration for Spark sessions to reduce verbosity. - Improved timing and error handling in the data generation script. - New method introduced for alternative streaming data handling in `OnlineUtils`. - Added a demonstration object for observability features in Spark applications. - New configuration file for structured logging setup. - **Bug Fixes** - Adjusted method signatures to ensure clarity and correct parameter usage in various classes. - **Documentation** - Updated import statements to reflect package restructuring for better organization. - Added instructions for building and executing the project in the README. - **Tests** - Integrated `MockApi` into various test classes to enhance testing capabilities and simulate API interactions. - Enhanced test coverage by utilizing the `MockApi` for more robust testing scenarios. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary fixes https://github.com/zipline-ai/chronon/security/dependabot/2 ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Chores** - Updated the `frontend` project to include a new dependency on the `cookie` package (version `^0.7.0`). <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary - Update devnotes based on onboarding doc. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Documentation** - Enhanced setup instructions for the Chronon project. - Expanded prerequisites section with environment variable configurations. - Clarified installation instructions for Thrift, emphasizing version compatibility. - Added guidance for installing Java, Scala, and Python using `asdf`. - Restructured IntelliJ configuration instructions for improved clarity. - Updated troubleshooting section with commands for project cleaning and assembly. - Elaborated on the release process for artifact publishing and code pushing. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
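The `asdf`-based setup described above typically pins tool versions in a `.tool-versions` file at the repo root. A hedged sketch of that workflow follows; the plugin names (`java`, `scala`, `python`) are real asdf plugins, but the version strings here are placeholders, not the ones the devnotes specify.

```shell
# Hedged sketch: pin Java, Scala, and Python via asdf's .tool-versions file.
# Versions below are placeholders; use the versions from the devnotes.
# Writing to /tmp so this sketch does not touch a real checkout.
cat > /tmp/.tool-versions <<'EOF'
java corretto-11.0.25.9.1
scala 2.12.18
python 3.11.9
EOF

wc -l < /tmp/.tool-versions

# In a real checkout, place .tool-versions at the repo root and run:
# asdf install   # installs every tool/version listed
```

With the file in place, `asdf` automatically selects those versions whenever you `cd` into the repo.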
## Summary More null handling; persist the cardinality map to remove inconsistent recomputation of the cardinality map. ## Checklist - [x] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **New Features** - Introduced new constants for `FetchTimeout` (10 minutes) and `DefaultCharset` (UTF-8). - Enhanced the `Summarizer` class to utilize an API for key-value store operations, improving data management. - Updated the `ObservabilityDemo` to include new time series fetching capabilities. - Added a new method `highlight` for string formatting in the `ColorPrinter`. - **Bug Fixes** - Improved null handling in `getSummaries`, `pivot`, `reportKvResponse`, and `multiGet` methods to prevent potential null pointer exceptions. - **Documentation** - Updated logging configuration for enhanced readability and error management. - **Tests** - Increased sample data generation in tests to improve coverage and accuracy. - Enhanced clarity of test setups in `GroupByUploadTest` with better data labeling. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary - Split `test_scala_and_python.yaml` into python, spark scala, and no spark scala tests. - https://app.asana.com/0/1208785567265389/1208854398566912/f ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced automated testing workflows for Python and Scala modules, enhancing continuous integration. - Added specific workflows for testing non-Spark Scala modules, Spark modules, and formatting checks for Scala files. - **Chores** - Implemented concurrency settings to manage workflow runs and optimize testing efficiency. - Updated logging configuration to reduce verbosity and focus on error messages. - Removed the combined testing workflow for Python and Scala, streamlining the testing process. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Testing this locally: I see no Spark logging churn. Looks like it preserves the behavior of @nikhil-zlai's PR: #96 <img width="760" alt="Screenshot 2024-11-26 at 7 55 03 PM" src="https://github.com/user-attachments/assets/844a44e1-c769-4089-b245-a86d138e1d1a"> ## Checklist - [x] Added Unit Tests - [x] Covered by existing CI - [x] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a new logging configuration using Log4j2, enhancing logging capabilities and readability. - **Bug Fixes** - Removed outdated logging configuration references, streamlining Docker container execution and Spark application setup. - **Chores** - Updated dependency management to replace Logback with Log4j2 for consistent logging behavior across the project. - Enhanced CI/CD workflows to trigger on changes to the `build.sbt` file, improving responsiveness to updates. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Putting up a simple Netty server (no deps added to sbt) and a Streamlit app to get a fast loop for iterating / debugging. This is how it looks: <img width="1624" alt="Screenshot 2024-11-26 at 11 03 38 PM" src="https://github.com/user-attachments/assets/d11c8fac-79b7-4749-bba5-e71e09fa0a72"> Updated docs too. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [x] Integration tested - [x] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a Streamlit application for visualizing data from an API endpoint. - Added a new HTTP server to handle requests related to drift and summary series, with endpoints for health checks and data retrieval. - **Improvements** - Exposed port 8181 for external access to the Spark application. - Updated the documentation with clearer instructions for building and running the application. - Updated the default value for the start date in configuration settings. - **Bug Fixes** - Enhanced error handling in the data loading process within the Streamlit app. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
…ontend (#95) ## Summary Builds on a couple of the summary computation PRs and data generation to wire things up so that Hub can serve them. * Yanked out mock data based endpoints (model perf / drift, join & feature skew) - decided it would be confusing to have a mix of mock and generated data, so we just serve the generated data * Dropped a few of the scripts introduced in #87. We bring up our containers the same way, and we have a script `load_summaries.sh` that we can trigger that leverages the existing app container to load data. * DDB ingestion was taking too long and we were dropping a lot of data due to rejected execution exceptions. To unblock for now, we've gone with an approach of making a bulk put HTTP call from the ObservabilityDemo app -> Hub, with Hub utilizing an InMemoryKV store to persist and serve up features. * Added an endpoint to serve the joins that are configured, as we've switched away from the model-based world. There's still an issue to resolve around fetching individual feature series data. Once I resolve that, we can switch this PR out of WIP mode. To test / run: start up our docker containers: ``` $ docker-compose -f docker-init/compose.yaml up --build ... ``` In a different terminal, load data: ``` $ ./docker-init/demo/load_summaries.sh Done uploading summaries! 🥳 ``` You can now curl join & feature time series data.
Join drift (null ratios) ``` curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=null&offset=10h&algorithm=psi' ``` Join drift (value drift) ``` curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=10h&algorithm=psi' ``` Feature drift: ``` curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/feature/dim_user_account_type/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=1D&algorithm=psi&granularity=aggregates' ``` Feature summaries: ``` curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/feature/dim_user_account_type/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=1D&algorithm=psi&granularity=percentile' ``` Join metadata ``` curl -X GET 'http://localhost:9000/api/v1/joins' curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join' ``` ## Checklist - [X] Added Unit Tests - [ ] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **New Features** - Introduced a new `JoinController` for managing joins with pagination support. - Added functionality for an in-memory key-value store with bulk data upload capabilities. - Implemented observability demo data loading within a Spark application. - Added a new `HTTPKVStore` class for remote key-value store interactions over HTTP. - **Improvements** - Enhanced the `ModelController` and `SearchController` to align with the new join data structure. - Updated the `TimeSeriesController` to support asynchronous operations and improved error handling. 
- Refined dependency management in the build configuration for better clarity and maintainability. - Updated API routes to include new endpoints for listing and retrieving joins. - Updated configuration to replace the `DynamoDBModule` with `ModelStoreModule`, adding `InMemoryKVStoreModule` and `DriftStoreModule`. - **Documentation** - Revised README instructions for Docker container setup and demo data loading. - Updated API routes documentation to reflect new endpoints for joins and in-memory data operations. - **Bug Fixes** - Resolved issues related to error handling in various controllers and improved logging for better traceability. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: nikhil-zlai <[email protected]>
## Summary Port of our OSS delta lake PR - airbnb/chronon#869. Largely the same aside from Delta Lake versions. We don't need this immediately, but we will if other users come along who need Delta Lake (or if we need to add support for formats like Hudi). ## Checklist - [X] Added Unit Tests - [X] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Added support for Delta Lake operations with new dependencies and configurations. - Introduced new traits and case objects for handling different table formats, enhancing data management capabilities. - Added a new job in the CI workflow for testing Delta Lake format functionality. - **Bug Fixes** - Improved error handling in class registration processes. - **Tests** - Implemented a suite of unit tests for the `TableUtils` class to validate partitioned data insertions with schema modifications. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
 ### Snyk has created this PR to fix 1 vulnerability in the pip dependencies of this project. #### Snyk changed the following file(s): - `quickstart/requirements.txt` <details> <summary>⚠️ <b>Warning</b></summary> ``` notebook 6.5.7 requires pyzmq, which is not installed. jupyter-server 1.24.0 requires pyzmq, which is not installed. jupyter-console 6.6.3 requires pyzmq, which is not installed. jupyter-client 7.4.9 requires pyzmq, which is not installed. ipykernel 6.16.2 requires pyzmq, which is not installed. ``` </details> --- > [!IMPORTANT] > > - Check the changes in this PR to ensure they won't cause issues with your project. > - Max score is 1000. Note that the real score may have changed since the PR was raised. > - This PR was automatically created by Snyk using the credentials of a real user. > - Snyk has automatically assigned this pull request, [set who gets assigned](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration). > - Some vulnerabilities couldn't be fully fixed and so Snyk will still find them when the project is tested again. This may be because the vulnerability existed within more than one direct dependency, but not all of the affected dependencies could be upgraded.
--- **Note:** _You are seeing this because you or someone else with access to this repository has authorized Snyk to open fix PRs._ For more information: 🧐 [View latest project report](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source=github&utm_medium=referral&page=fix-pr) 👩💻 [Set who automatically gets assigned](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration) 📜 [Customise PR templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates) 🛠 [Adjust project settings](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source=github&utm_medium=referral&page=fix-pr/settings) 📚 [Read about Snyk's upgrade logic](https://support.snyk.io/hc/en-us/articles/360003891078-Snyk-patches-to-fix-vulnerabilities) --- **Learn how to fix vulnerabilities with free interactive lessons:** 🦉 [Regular Expression Denial of Service (ReDoS)](https://learn.snyk.io/lesson/redos/?loc=fix-pr)

Co-authored-by: snyk-bot <[email protected]>
… build (#102) ## Summary Speed up our local obs iteration flow by pulling all the forced sbt clean + build steps outside the docker compose build flow. We don't need to build the spark assembly, frontend and hub every time - we often just need to build one / two of these. This PR pulls the build piece out of the docker file so that we don't have to do it every time. Instead we wrap the build in a script and invoke the relevant build targets. The docker file copies the relevant artifacts over. This allows us to do things like: Just build the hub webservice: ``` ./docker-init/build.sh --hub ``` Just build the spark assemblies: ``` ./docker-init/build.sh --spark ``` ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a new build script (`build.sh`) for easier module building and management. - **Improvements** - Simplified Dockerfile structure by removing multi-stage builds for both application and frontend. - Updated README to reflect new setup instructions and automated build processes. - Removed unnecessary service dependencies in the Docker Compose configuration. - **Documentation** - Enhanced clarity and detail in README regarding Docker environment setup and usage. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Add an isNumeric field to help the frontend code decide if a time series response for a feature is numeric or categorical. Currently this is a bit hacky, based on whether the label is a percentile string (p0, p10, ...). ## Checklist - [ ] Added Unit Tests - [X] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced handling of numeric and categorical features in time series data. - Introduced flags to indicate the numeric status of current and baseline series. - **Bug Fixes** - Improved robustness in processing time series data for accurate representation and analysis. - **Documentation** - Updated method signatures to reflect changes in handling numeric features. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
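The percentile-label heuristic described above can be sketched roughly as follows. This is an illustrative helper only; the actual field and label handling in Hub may differ.

```scala
// Hypothetical sketch: mark a series as numeric when all of its labels look
// like percentile labels (p0, p5, ..., p100). Names here are illustrative.
object SeriesKind {
  // Matches p0 through p100, with no leading zeros.
  private val percentileLabel = "^p(100|[1-9]?[0-9])$".r

  def isNumeric(labels: Seq[String]): Boolean =
    labels.nonEmpty && labels.forall(l => percentileLabel.pattern.matcher(l).matches())
}
```

Under this sketch, `Seq("p0", "p10", "p100")` would be treated as numeric, while `Seq("US", "CA")` would be treated as categorical.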
 <h3>Snyk has created this PR to upgrade tailwind-variants from 0.3.0 to 0.3.1.</h3> :information_source: Keep your dependencies up-to-date. This makes it easier to fix existing vulnerabilities and to more quickly identify and fix newly disclosed vulnerabilities when they affect your project. <hr/> - The recommended version is **1 version** ahead of your current version. - The recommended version was released **22 days ago**.

<details>
<summary><b>Release notes</b></summary>

**0.3.1** (2025-01-18)
- fix: github workflow by @tianenpang in heroui-inc/tailwind-variants#222
- chore: update repo link & content by @wingkwong in heroui-inc/tailwind-variants#235
- chore: org name change by @jrgarciadev in heroui-inc/tailwind-variants#237
- New contributor: @wingkwong made their first contribution in #235
- Full Changelog: v0.3.0...v0.3.1

**0.3.0** (2024-11-12)
- fix mergeObjects order by @thefalked in heroui-inc/tailwind-variants#172
- Add ESLint Jest plugin and update ESLint/Prettier by @mskelton in heroui-inc/tailwind-variants#173
- fix(transformer): add transformer config type to withTV function by @jonathassardinha in heroui-inc/tailwind-variants#177
- docs: add `cva` to benchmarks by @mskelton in heroui-inc/tailwind-variants#178
- (fix): responsive variants for base when slots are present by @w0ofy in heroui-inc/tailwind-variants#202
- fix: treat undefined value for compoundVariants as false by @Tokky0425 in heroui-inc/tailwind-variants#210
- chore: tailwind-merge updated to v2.5.4
- New contributors: @jonathassardinha (#177), @w0ofy (#202), @Tokky0425 (#210)
- Full Changelog: v0.2.1...v0.3.0

from [tailwind-variants GitHub release notes](https://github.com/heroui-inc/tailwind-variants/releases)
</details>

--- > [!IMPORTANT] > > - Check the changes in this PR to ensure they won't cause issues with your project. > - This PR was automatically created by Snyk using the credentials of a real user. > - Snyk has automatically assigned this pull request, [set who gets assigned](/settings/integration).
--- **Note:** _You are seeing this because you or someone else with access to this repository has authorized Snyk to open upgrade PRs._ **For more information:** > - 🧐 [View latest project report](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5?utm_source=github&utm_medium=referral&page=upgrade-pr) > - 👩💻 [Set who automatically gets assigned](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source=github&utm_medium=referral&page=upgrade-pr/) > - 📜 [Customise PR templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates?utm_source=&utm_content=fix-pr-template) > - 🛠 [Adjust upgrade PR settings](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source=github&utm_medium=referral&page=upgrade-pr) > - 🔕 [Ignore this dependency or unsubscribe from future upgrade PRs](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?pkg=tailwind-variants&utm_source=github&utm_medium=referral&page=upgrade-pr#auto-dep-upgrades)

Co-authored-by: snyk-bot <[email protected]>
## Summary Quick change to standardize scroll styles across the app ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Style - Enhanced the appearance of scrollbars with updated styling and customizable color options. - Refactor - Simplified scroll behavior by replacing custom scrolling components with standard, CSS-managed scroll containers across the layout and key content areas. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
isEmpty is a somewhat expensive operation, as it needs a partial table
scan. For the most part we allow empty dataframes in joins, so we
can optimize the common path.
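A minimal sketch of the idea (hypothetical names, not the actual diff): since downstream Spark operations handle empty DataFrames gracefully, the eager guard can simply be dropped from the common path.

```scala
import org.apache.spark.sql.DataFrame

// Before: every call paid for a Spark job (a partial table scan) just to
// test emptiness:
//   if (!joinPartDf.isEmpty) { write(joinPartDf) }
//
// After: no isEmpty on the common path. Writing or unioning an empty
// DataFrame is a valid no-op, so we let it flow through; the rare empty
// input costs nothing extra, and the common non-empty input saves a job.
def writeJoinPart(joinPartDf: DataFrame, write: DataFrame => Unit): Unit =
  write(joinPartDf)
```

The trade-off: code that genuinely branches on emptiness (e.g. skipping a downstream step) still needs a check, but that check should happen once, where the branch matters, rather than defensively at every stage.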
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Refactor**
- Refined internal logic to streamline condition evaluations and
consolidated diagnostic messaging for more effective system monitoring.
These optimizations simplify internal processing while ensuring a
consistent user experience with no visible changes to public features.
Enhanced logging now provides improved insights into system operations
without impacting functionality. This update improves overall system
efficiency and clarity.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
Co-authored-by: Thomas Chow <[email protected]>
## Summary This PR wires up tiling support. Covers a few aspects: * BigTable KV store changes to support tiling - we take requests for the '_STREAMING' table for gets and puts using the TileKey thrift interface and map them to the corresponding BT RowKey + time-range lookups. We've yanked out event-based support in the BT KV store. We're writing out data in the row + tile format documented here - [Option 1 - Tiles as Timestamped Rows](https://docs.google.com/document/d/1wgzJVAkl5K1bBCr98WCZFiFeTTWqILdA3FTE7cz9Li4/edit?tab=t.0#bookmark=id.j54a5g8gj2m9). * Add a flag in the FlagStore to indicate whether we're using tiling. Switched over the fetcher checks to use this instead of the prior GroupByServingInfo.isTilingEnabled flag. Leverage this flag in Flink to choose tiling or not. Set this flag to true in the GcpApi to always use tiling. ## Checklist - [X] Added Unit Tests - [ ] Covered by existing CI - [X] Integration tested - Tested on the Etsy side by running the job and hitting some fetcher CLI endpoints. - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced dynamic tiling capabilities for time series and streaming data processing. This enhancement enables a configurable tiled data mode that improves data retrieval granularity, processing consistency, and overall query performance, resulting in more efficient and predictable operations for end-users. - Added new methods for constructing tile keys and row keys, enhancing data management capabilities. - Implemented flag-based control for enabling or disabling tiling in various components, allowing for more flexible configurations. - **Bug Fixes** - Corrected minor documentation errors in the FlagStore interface. - **Tests** - Expanded test coverage to validate new tiling functionalities and ensure robustness in handling time series data.
<!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: tchow-zlai <[email protected]> Co-authored-by: Thomas Chow <[email protected]>
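The "tiles as timestamped rows" layout described above can be sketched roughly as follows. All names and the key encoding here are illustrative assumptions; the actual TileKey thrift fields and BigTable row-key format may differ.

```scala
// Illustrative sketch of a tiled row-key scheme: one row per (entity, tile),
// keyed by dataset, entity key, tile size, and the tile's aligned start time.
final case class TileKey(dataset: String,
                         entityKey: Array[Byte],
                         tileSizeMs: Long,
                         tileStartMs: Long)

// Align an event timestamp down to the start of its containing tile.
def tileStart(eventTs: Long, tileSizeMs: Long): Long =
  eventTs - (eventTs % tileSizeMs)

// Encode a row key. A fetch for [queryStartMs, queryEndMs) then becomes a
// range scan over the rows whose tile starts fall in the aligned window,
// rather than a per-event read.
def rowKey(k: TileKey): String =
  s"${k.dataset}#" +
    java.util.Base64.getEncoder.encodeToString(k.entityKey) +
    s"#${k.tileSizeMs}#${k.tileStartMs}"
```

The design choice this reflects: pre-aggregated tiles bound the number of rows read per fetch by the window length divided by the tile size, instead of by the raw event count.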
## Summary ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Refactor** - Removed the legacy transaction and risk analysis view to streamline the user interface. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Couple of changes to get my feet wet with LayerChart and bring the chart styles closer to what they were using ECharts. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced chart interactivity with improved tooltip behavior and legend styling for a smoother, more engaging visualization experience. - Added customizable options for axis configurations and highlighted points, allowing for a more refined display of data trends. - **Chores** - Updated a core charting dependency to its latest version, contributing to improved performance and stability. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Enhanced table output handling to support partitioned tables.
- Introduced configurable options for temporary storage and integration
settings, improving cloud-based table materialization.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: Thomas Chow <[email protected]>
Actionable comments posted: 5
♻️ Duplicate comments (2)
spark/src/main/scala/ai/chronon/spark/DerivationJob.scala (2)
27-32: ⚠️ Potential issue: Add parameter validation and replace table name placeholders.
Validate input parameters and define actual table names.
```diff
 def fromJoin(join: api.Join, dateRange: PartitionRange): DerivationJob = {
+  require(join != null, "join cannot be null")
+  require(dateRange != null, "dateRange cannot be null")
+  require(join.derivations != null, "derivations cannot be null")
   val baseOutputTable = "TODO" // Output of the base Join pre-derivation
   val finalOutputTable = "TODO" // The actual output table
   val derivations = join.derivations.asScala
   new DerivationJob(baseOutputTable, finalOutputTable, derivations, dateRange)
 }
```
20-25: ⚠️ Potential issue: Add parameter validation and replace table name placeholders.
Validate input parameters and define actual table names.
```diff
 def fromGroupBy(groupBy: api.GroupBy, dateRange: PartitionRange): DerivationJob = {
+  require(groupBy != null, "groupBy cannot be null")
+  require(dateRange != null, "dateRange cannot be null")
+  require(groupBy.derivations != null, "derivations cannot be null")
   val baseOutputTable = "TODO" // Output of the base GroupBy pre-derivation
   val finalOutputTable = "TODO" // The actual output table
   val derivations = groupBy.derivations.asScala
   new DerivationJob(baseOutputTable, finalOutputTable, derivations, dateRange)
 }
```
🧹 Nitpick comments (5)
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)
69-74: Use constants for null values. Replace hardcoded nulls with constants.

```diff
-val nulls = Seq("null", "Null", "NULL")
+private val NULL_VALUES = Seq("null", "Null", "NULL")
+
+def generateSkewFilterSql(key: String, values: Seq[String]): String = {
+  val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(NULL_VALUES.contains).mkString(", ")})")
+  val nullFilters = if (values.exists(NULL_VALUES.contains)) Some(s"$key IS NOT NULL") else None
```

spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1)

72-119: Consider optimizing join operations. The fold operation with joins could be optimized using broadcast joins for small tables.
spark/src/main/scala/ai/chronon/spark/Join.scala (3)
301-328: Remove commented code. Good refactoring moving join part logic to a dedicated class. However, remove the commented-out code before finalizing the PR.

```diff
-// val df =
-//   computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-//     .map(df => joinPart -> df)
```
301-304: Remove commented out code. Delete the commented out code as it's no longer needed.

```diff
-// val df =
-//   computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-//     .map(df => joinPart -> df)
```
301-328: Remove commented out code. Since the code has been refactored to use `JoinPartJob`, the commented out code can be safely removed. Apply this diff:

```diff
-// val df =
-//   computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-//     .map(df => joinPart -> df)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (9)
- api/thrift/orchestration.thrift (1 hunks)
- orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/DerivationJob.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/Join.scala (2 hunks)
- spark/src/main/scala/ai/chronon/spark/JoinBase.scala (2 hunks)
- spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (5 hunks)
- spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala
- spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala
- spark/src/main/scala/ai/chronon/spark/JoinUtils.scala
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: spark_tests
- GitHub Check: non_spark_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (11)
spark/src/main/scala/ai/chronon/spark/JoinBase.scala (6)
174-175: LGTM! Good refactoring. Moving bootstrap logic to a dedicated class improves modularity.

296-296: LGTM! Good refactoring. Moving small mode determination to `JoinUtils` improves reusability.

174-175: LGTM! Good refactoring. Moving bootstrap logic to a dedicated class improves modularity.

296-296: LGTM! Good refactoring. Centralizing small mode determination logic in `JoinUtils` improves reusability.

174-175: LGTM! Good refactoring. Moving bootstrap logic to a dedicated class improves code organization.

296-296: LGTM! Good refactoring. Moving small mode determination to `JoinUtils` improves code reusability.

spark/src/main/scala/ai/chronon/spark/Join.scala (5)

222-223: LGTM! Consistent refactoring. Changes align with the bootstrap logic refactoring in `JoinBase.scala`.

222-223: LGTM! Consistent refactoring. Moving bootstrap logic to a dedicated class is consistent with changes in `JoinBase.scala`.

305-328: LGTM! Good refactoring. Moving join part logic to a dedicated class with proper context improves modularity.

1-1: Add missing items before merge. As indicated in the PR description, please ensure the following are completed before merge:
- Add unit tests
- Ensure CI coverage
- Perform integration testing
- Update documentation
222-223: LGTM! Good refactoring. Moving bootstrap logic to a dedicated class improves code organization.
```scala
    outputTable: String,
    derivations: Seq[api.Derivation],
    dateRange: PartitionRange) {
  def run(): Unit = {}
```
Implement the run method.
Empty implementation needs to be completed.
```scala
val timeProjection = if (source.dataModel == Events) {
  Seq(Constants.TimeColumn -> Option(source.query).map(_.timeColumn).orNull)
} else {
  Seq()
}
```
Add null check for source.
Prevent potential NPE.
```diff
-val timeProjection = if (source.dataModel == Events) {
+val timeProjection = if (source != null && source.dataModel == Events) {
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```scala
val timeProjection = if (source != null && source.dataModel == Events) {
  Seq(Constants.TimeColumn -> Option(source.query).map(_.timeColumn).orNull)
} else {
  Seq()
}
```
```scala
def run(): Unit = {
  // Runs the bootstrap query and produces an output table specific to the `left` side of the Join
}
```
Implement the run method.
Empty implementation needs to be completed.
```scala
// TODO: allow customization of deduplication logic
.dropDuplicates(part.keys(join, tableUtils.partitionColumn).toArray)
```
💡 Verification agent
🧩 Analysis chain
Address TODO about deduplication logic customization.
Implement customizable deduplication strategy.
🏁 Script executed:
```shell
#!/bin/bash
# Check if there are any existing customization patterns in the codebase
rg -A 5 "deduplication.*logic"
```

Length of output: 826
Custom deduplication logic is not implemented.
- The TODO remains, and the code still hard-codes `.dropDuplicates(...)` without a customization hook.
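To make the requested customization hook concrete: the current code collapses duplicates with a fixed keep-one policy per key. A minimal language-agnostic sketch of a pluggable deduplication strategy, written in Python for illustration (all names here are hypothetical — the actual Chronon code calls Spark's `dropDuplicates` on the join-part key columns in Scala):

```python
# Sketch of a pluggable deduplication hook (hypothetical names; the real
# implementation would operate on a Spark DataFrame, not Python dicts).

def keep_first(group):
    """Default strategy: keep the first row seen per key (mimics dropDuplicates)."""
    return group[0]

def dedup(rows, key_cols, strategy=keep_first):
    """Group rows (dicts) by key_cols and collapse each group with `strategy`."""
    groups = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        groups.setdefault(key, []).append(row)
    return [strategy(group) for group in groups.values()]

rows = [
    {"user": "a", "ds": "2024-01-01", "v": 1},
    {"user": "a", "ds": "2024-01-01", "v": 2},
    {"user": "b", "ds": "2024-01-01", "v": 3},
]
# A caller-supplied strategy: keep the row with the largest "v" per key,
# instead of the first one seen.
latest = dedup(rows, ["user", "ds"], strategy=lambda g: max(g, key=lambda r: r["v"]))
```

The design point is simply that the collapse policy becomes a parameter with a default matching today's behavior, so existing callers are unaffected.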
```thrift
struct SourceWithFilter {
  1: optional api.Source source
  2: optional map<string,list<string>> excludeKeys
}
```
🛠️ Refactor suggestion
Add documentation for SourceWithFilter struct.
Add Thrift documentation comments explaining the purpose and usage of this struct.
```diff
+/**
+ * Represents a source with optional key exclusion filters.
+ * @param source The source to filter
+ * @param excludeKeys Map of column names to values that should be excluded
+ */
 struct SourceWithFilter {
 }
```

Committable suggestion skipped: line range outside the PR's diff.
- This is a better failure mode; we don't want to continue if there's
something going wrong in the analysis phase.
## Summary
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Bug Fixes**
- Refined error handling mechanism in join computation process
- Improved exception propagation during unexpected errors
The changes focus on streamlining error management with a more direct
approach to handling unexpected exceptions during join operations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: Thomas Chow <[email protected]>
## Summary - bulked out eval to run sources inside join / group_by etc. - removed need for separate gateway setup and maintenance. - added support for sampling dependent tables to local_warehouse. - deleted some dead code. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [x] Integration tested (on etsy confs) - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved error feedback clarity during data sampling. - **New Features** - Increased data sampling limits for improved performance. - Enhanced SQL query handling with new date filtering conditions. - **Refactor** - Streamlined SQL query generation for table scans, ensuring valid queries under various conditions. - Deprecated outdated sampling functionality to enhance overall maintainability. - **Chores** - Disabled unnecessary operations in the build and upload script for Google Cloud Storage. - **Style** - Added logging for improved traceability of filtering conditions in DataFrame scans. - **Tests** - Removed unit tests for the Flow and Node classes. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Enhanced data processing by introducing new configuration options for
writing data, including support for Parquet as the intermediate format
and enabling list inference during write operations.
- Expanded selection of fields in purchase events with the addition of
`bucket_rand`.
- Introduced a new aggregation to calculate the last 15 purchase prices,
utilizing the newly added `bucket_rand` field.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: Thomas Chow <[email protected]>
## Summary ^^^ ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced logging now delivers color-coded outputs and adjusted log levels for clearer visibility. - Upgraded service versioning supports stable, production-ready deployments. - **Chores** - Modernized the build and deployment pipeline to improve artifact handling. - Refined dependency management to bolster advanced logging capabilities. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Adds support for creating a new `.bazelrc.local` file specifying custom build/test bazel options which can be used for passing gcloud auth credentials ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Updated the build configuration to optionally load a user-specific settings file, replacing the automatic use of preset credentials. - **Documentation** - Enhanced guidance with a new section detailing steps for setting up personal authentication credentials. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Modified our github workflow to run scalaFmt checks using bazel instead of sbt and deleted the build.sbt file as it's no longer needed now. ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Chores** - Streamlined build and continuous integration setups, transitioning away from legacy tooling. - Modernized internal infrastructure for improved consistency and stability. - **Refactor / Style** - Enhanced code readability with comprehensive cosmetic and documentation updates. - Unified formatting practices across the codebase to support future maintainability. - Adjusted formatting of comments and code blocks for improved clarity without altering functionality. - **Tests** - Reformatted test suites for clarity and consistency while preserving all functional behaviors. - Improved formatting in various test cases and methods for better readability without altering functionality. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Based on Slack [discussion](https://zipline-2kh4520.slack.com/archives/C0880ECQ0EN/p1739304132253249) ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- av pr metadata This information is embedded by the av CLI when creating PRs to track the status of stacks when using Aviator. Please do not delete or edit this section of the PR. ``` {"parent":"main","parentHead":"","trunk":"main"} ``` --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced an optional attribute to enhance node classification with more detailed physical characteristics for improved metadata representation. <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: Sean Lynch <[email protected]>
## Summary Updated zpush script with bazel scalafmt in our dev notes. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Documentation** - Enhanced guidelines for formatting and pushing Scala code. - Replaced previous procedures with an updated method featuring detailed error notifications. - Clarified the need for quoting multi-word commit messages. - Adjusted the ordering of remote connectivity instructions for improved clarity. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Force-pushed from bd4e828 to b4bd285
Actionable comments posted: 2
♻️ Duplicate comments (1)
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)
24-28: ⚠️ Potential issue — Add null check for source.

Prevent potential NPE by checking if source is null before accessing dataModel.

```diff
-val timeProjection = if (source.dataModel == Events) {
+val timeProjection = if (source != null && source.dataModel == Events) {
```
🧹 Nitpick comments (5)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (2)
446-470: Add input validation for skewKeys. Consider validating that skewKeys values are non-empty when present.

```diff
 def skewFilter(keys: Option[Seq[String]] = None,
                skewKeys: Option[Map[String, Seq[String]]],
                leftKeyCols: Seq[String],
                joiner: String = " OR "): Option[String] = {
+  require(skewKeys.forall(_.values.forall(_.nonEmpty)), "skewKeys values must not be empty")
   skewKeys.map { keysMap =>
```
521-535: Extract magic numbers into constants. The threshold count appears in multiple log messages. Consider extracting it to avoid repetition.

```diff
+private val SmallModeLogMsg = "Counted %d rows, running join in small mode."
+private val NormalModeLogMsg = "Counted greater than %d rows, proceeding with normal computation."
 def runSmallMode(tableUtils: TableUtils, leftDf: DataFrame): Boolean = {
   if (tableUtils.smallModelEnabled) {
     val thresholdCount = leftDf.limit(Some(tableUtils.smallModeNumRowsCutoff + 1).get).count()
     val result = thresholdCount <= tableUtils.smallModeNumRowsCutoff
     if (result) {
-      logger.info(s"Counted $thresholdCount rows, running join in small mode.")
+      logger.info(SmallModeLogMsg.format(thresholdCount))
     } else {
-      logger.info(
-        s"Counted greater than ${tableUtils.smallModeNumRowsCutoff} rows, proceeding with normal computation.")
+      logger.info(NormalModeLogMsg.format(tableUtils.smallModeNumRowsCutoff))
     }
```

spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)
47-49: Enhance error message for empty DataFrame. Add more context to help debug empty result sets.

```diff
-throw new RuntimeException(s"Query produced 0 rows in range $range.")
+throw new RuntimeException(s"Query for source ${source.table} produced 0 rows in range $range. Please verify the source data and filters.")
```

spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1)
191-195: Enhance error handling with specific error types. Current error handling is too generic.

```diff
-} catch {
-  case e: Exception =>
-    logger.error(s"Error while processing groupBy: ${joinPart.groupBy.getMetaData.getName}")
-    throw e
+} catch {
+  case e: IllegalArgumentException =>
+    logger.error(s"Invalid arguments while processing groupBy: ${joinPart.groupBy.getMetaData.getName}", e)
+    throw e
+  case e: Exception =>
+    logger.error(s"Unexpected error while processing groupBy: ${joinPart.groupBy.getMetaData.getName}", e)
+    throw new RuntimeException(s"Bootstrap computation failed: ${e.getMessage}", e)
```

spark/src/main/scala/ai/chronon/spark/Join.scala (1)
343-345: Remove commented code. Clean up commented code that's no longer needed.

```diff
-// val df =
-//   computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-//     .map(df => joinPart -> df)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (9)
- api/thrift/orchestration.thrift (1 hunks)
- orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/DerivationJob.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/Join.scala (4 hunks)
- spark/src/main/scala/ai/chronon/spark/JoinBase.scala (2 hunks)
- spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (5 hunks)
- spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- api/thrift/orchestration.thrift
- orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala
- spark/src/main/scala/ai/chronon/spark/DerivationJob.scala
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: spark_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: non_spark_tests
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (11)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (4)
308-313: LGTM! Good refactoring. The parameter change from `joinConf` to `leftDataModel` improves method focus and reduces coupling.

495-500: LGTM! Robust null handling. The method correctly handles both null and non-null cases in the SQL filter generation.

502-519: LGTM! Good early returns. The method has clear early return conditions and proper null checks.

537-542: LGTM! Clean conversion utility. The method is focused and handles the conversion well.

spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)

69-74: LGTM! Robust SQL filter generation. The SQL generation handles both null and non-null cases correctly.

spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (2)

117-118: Implement customizable deduplication logic. The TODO indicates a need for flexible deduplication strategies.

Would you like me to propose a design for customizable deduplication logic?

148-177: LGTM! Robust external field padding. The implementation correctly handles both contextual and non-contextual fields.

spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala (1)

289-302: LGTM! Comprehensive join logic. The pattern matching covers all combinations of data models and accuracies correctly.

spark/src/main/scala/ai/chronon/spark/JoinBase.scala (2)

174-175: LGTM! Clean bootstrap job integration. Good modularization of bootstrap logic into a separate job class.

295-295: LGTM! Centralized small mode determination. Good refactoring to use JoinUtils for consistent small mode logic.

spark/src/main/scala/ai/chronon/spark/Join.scala (1)

347-370: LGTM! Good refactoring to improve modularity. The introduction of `JoinPartJobContext` and `JoinPartJob` improves code organization and maintainability.
```scala
JoinUtils.skewFilter(Option(joinPart.rightToLeft.values.toSeq), skewKeys, joinPart.rightToLeft.values.toSeq)
// this is the second time we apply skew filter - but this filters only on the keys
// relevant for this join part.
println("leftSkewFilter: " + leftSkewFilter)
```
🛠️ Refactor suggestion
Replace println with logger.
Use proper logging instead of println for debugging.
```diff
-println("leftSkewFilter: " + leftSkewFilter)
+logger.debug(s"leftSkewFilter: $leftSkewFilter")
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```scala
logger.debug(s"leftSkewFilter: $leftSkewFilter")
```
```scala
// if the join part contains ChrononRunDs macro, then we need to make sure the join is for a single day
val selects = Option(joinPart.groupBy.sources.toScala.map(_.query.selects).map(_.toScala))
if (
  selects.isDefined && selects.get.nonEmpty && selects.get.exists(selectsMap =>
    Option(selectsMap).isDefined && selectsMap.values.exists(_.contains(Constants.ChrononRunDs)))
) {
  assert(
    leftRange.isSingleDay,
    s"Macro ${Constants.ChrononRunDs} is only supported for single day join, current range is $leftRange")
}
```
Remove duplicated code block.
This block is an exact duplicate of the code above at lines 318-324.
```diff
-// if the join part contains ChrononRunDs macro, then we need to make sure the join is for a single day
-val selects = Option(joinPart.groupBy.sources.toScala.map(_.query.selects).map(_.toScala))
-if (
-  selects.isDefined && selects.get.nonEmpty && selects.get.exists(selectsMap =>
-    Option(selectsMap).isDefined && selectsMap.values.exists(_.contains(Constants.ChrononRunDs)))
-) {
-  assert(
-    leftRange.isSingleDay,
-    s"Macro ${Constants.ChrononRunDs} is only supported for single day join, current range is $leftRange")
-}
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Actionable comments posted: 1
🧹 Nitpick comments (2)
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (2)
14-17: Enhance class documentation. Add parameter descriptions and example usage.

```diff
 /*
 Runs and materializes a `Source` for a given `dateRange`. Used in the Join computation flow
 to first compute the Source, then each join may have a further Bootstrap computation to
 produce the left side to effectively use.
+
+@param sourceWithFilter Source configuration with skew filtering options
+@param tableUtils Utility for table operations
+@param range Partition range to process
+
+Example:
+{{{
+val job = new SourceJob(source, utils, range)
+job.run()
+}}}
 */
```

51-53: Resolve design decision and add error handling. The comment suggests uncertainty about parameter design. Also, add error handling for digest generation.

```diff
-f"${source.table}_${ThriftJsonCodec.md5Digest(sourceWithFilter)}" // Or should we pass this in as an arg?
+f"${source.table}_${Try(ThriftJsonCodec.md5Digest(sourceWithFilter))
+  .getOrElse(throw new RuntimeException("Failed to generate output name digest"))}"
```
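The digest-suffixed output name discussed above follows a general pattern: hash a canonical serialization of the source config and append it to the table name, so identical configs always map to the same materialized table while any config change produces a new one. A Python sketch of that pattern (hypothetical helper names; the actual code calls `ThriftJsonCodec.md5Digest` on the Thrift object in Scala):

```python
import hashlib
import json

def output_table_name(table, source_config):
    """Derive a deterministic output table name from a source config.

    json.dumps with sort_keys gives a stable serialization, so the same
    config always yields the same digest and thus the same table name.
    """
    canonical = json.dumps(source_config, sort_keys=True)
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()[:8]  # short, stable suffix
    return f"{table}_{digest}"

# Hypothetical example config; field names are illustrative only.
name = output_table_name("purchases", {"table": "purchases", "excludeKeys": {"user": ["null"]}})
```

Determinism is the key property: it lets independent runs (and retries) discover and reuse an already-materialized source table instead of recomputing it.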
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (3)
- spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1 hunks)
- spark/src/main/scala/ai/chronon/spark/Join.scala (4 hunks)
- spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: non_spark_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: spark_tests
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (4)
spark/src/main/scala/ai/chronon/spark/Join.scala (3)
325-334: Remove duplicated code block. This block is identical to lines 318-324.

347-369: LGTM! Good modularization with JoinPartJob. The refactoring improves code organization by encapsulating join part processing logic.

354-365: LGTM! Clear configuration handling. Good separation of concerns with skew handling and bootstrap table configurations.
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)
24-28: Add null check for source. Prevent potential NPE.

```diff
-val timeProjection = if (source.dataModel == Events) {
+val timeProjection = if (source != null && source.dataModel == Events) {
```
```scala
def generateSkewFilterSql(key: String, values: Seq[String]): String = {
  val nulls = Seq("null", "Null", "NULL")
  val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(nulls.contains).mkString(", ")})")
  val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
  (nonNullFilters ++ nullFilters).mkString(" AND ")
}
```
Prevent SQL injection in filter generation.
Sanitize input values before using in SQL.
```diff
 def generateSkewFilterSql(key: String, values: Seq[String]): String = {
   val nulls = Seq("null", "Null", "NULL")
-  val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(nulls.contains).mkString(", ")})")
+  val sanitizedValues = values.filterNot(nulls.contains).map(_.replaceAll("[^a-zA-Z0-9_]", ""))
+  val nonNullFilters = Some(s"$key NOT IN (${sanitizedValues.mkString(", ")})")
   val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
   (nonNullFilters ++ nullFilters).mkString(" AND ")
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```scala
def generateSkewFilterSql(key: String, values: Seq[String]): String = {
  val nulls = Seq("null", "Null", "NULL")
  val sanitizedValues = values.filterNot(nulls.contains).map(_.replaceAll("[^a-zA-Z0-9_]", ""))
  val nonNullFilters = Some(s"$key NOT IN (${sanitizedValues.mkString(", ")})")
  val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
  (nonNullFilters ++ nullFilters).mkString(" AND ")
}
```
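To make the null handling concrete, the same filter-generation logic can be sketched in Python (illustrative only — the real implementation is the Scala `generateSkewFilterSql` discussed above; this sketch additionally guards the empty-list case, which is an assumption rather than a property of the original code):

```python
NULLS = {"null", "Null", "NULL"}

def generate_skew_filter_sql(key, values):
    """Build a WHERE fragment that excludes skewed key values.

    Non-null skew values become a NOT IN list; any null-ish value instead
    contributes an IS NOT NULL clause, mirroring the Scala version.
    """
    non_nulls = [v for v in values if v not in NULLS]
    clauses = []
    if non_nulls:  # avoid emitting "NOT IN ()" when only null-ish values were given
        clauses.append(f"{key} NOT IN ({', '.join(non_nulls)})")
    if any(v in NULLS for v in values):
        clauses.append(f"{key} IS NOT NULL")
    return " AND ".join(clauses)
```

For example, skew values `["1", "null"]` on key `user_id` yield `user_id NOT IN (1) AND user_id IS NOT NULL`.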
Actionable comments posted: 0
🧹 Nitpick comments (4)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (4)
447-470: Add input validation for empty sequences. Consider adding validation for empty sequences in the `keys` and `leftKeyCols` parameters.

```diff
 def skewFilter(keys: Option[Seq[String]] = None,
                skewKeys: Option[Map[String, Seq[String]]],
                leftKeyCols: Seq[String],
                joiner: String = " OR "): Option[String] = {
+  require(leftKeyCols.nonEmpty, "leftKeyCols cannot be empty")
+  keys.foreach(k => require(k.nonEmpty, "keys cannot be empty when provided"))
   skewKeys.map { keysMap =>
```
494-499: Extract null values as constants. Move magic strings for null values to constants.

```diff
+private val NullValues = Set("null", "Null", "NULL")
 private def generateSkewFilterSql(key: String, values: Seq[String]): String = {
-  val nulls = Seq("null", "Null", "NULL")
-  val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(nulls.contains).mkString(", ")})")
-  val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
+  val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(NullValues.contains).mkString(", ")})")
+  val nullFilters = if (values.exists(NullValues.contains)) Some(s"$key IS NOT NULL") else None
```
501-518: Add method documentation. Add ScalaDoc to explain the purpose and parameters of `findUnfilledRecords`.

```diff
+/**
+ * Identifies unfilled records in a bootstrap DataFrame based on covering sets.
+ *
+ * @param bootstrapDfWithStats The bootstrap DataFrame with statistics
+ * @param coveringSets The sequence of covering sets to check against
+ * @return Option[DfWithStats] containing unfilled records if any exist
+ */
 def findUnfilledRecords(bootstrapDfWithStats: DfWithStats, coveringSets: Seq[CoveringSet]): Option[DfWithStats] = {
```
520-534: Extract magic numbers in logging. Move the cutoff value to a constant to avoid repetition.

```diff
 def runSmallMode(tableUtils: TableUtils, leftDf: DataFrame): Boolean = {
   if (tableUtils.smallModelEnabled) {
-    val thresholdCount = leftDf.limit(Some(tableUtils.smallModeNumRowsCutoff + 1).get).count()
+    val cutoff = tableUtils.smallModeNumRowsCutoff
+    val thresholdCount = leftDf.limit(Some(cutoff + 1).get).count()
     val result = thresholdCount <= tableUtils.smallModeNumRowsCutoff
```
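The small-mode check being discussed boils down to one trick: read at most `cutoff + 1` rows and compare against the cutoff, so a very large left side is never fully counted. A Python sketch of just that decision logic (hypothetical names; the real code limits and counts a Spark DataFrame):

```python
def run_small_mode(row_iter, cutoff):
    """Decide small mode by counting at most cutoff + 1 rows.

    Seeing cutoff + 1 rows is already proof the input exceeds the cutoff,
    so we can stop early instead of materializing a full count.
    """
    threshold_count = 0
    for _ in row_iter:
        threshold_count += 1
        if threshold_count > cutoff:  # already over the limit: stop scanning
            break
    return threshold_count <= cutoff

small = run_small_mode(iter(range(10)), cutoff=100)       # 10 rows, under the cutoff
large = run_small_mode(iter(range(10**9)), cutoff=100)    # stops after 101 rows
```

The `limit(cutoff + 1).count()` form in the Scala code achieves the same bound: the count can never exceed `cutoff + 1`, regardless of the true table size.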
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
- spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: non_spark_tests
- GitHub Check: spark_tests
- GitHub Check: enforce_triggered_workflows
🔇 Additional comments (2)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (2)
309-346: LGTM! Good refactoring. Simplified method signature by requiring only the necessary data model information.

536-540: LGTM! Clean implementation. Simple and effective conversion from Java to Scala collections.
Summary
Checklist
Summary by CodeRabbit
New Features
- `Bootstrap`, `BootstrapJob`, `JoinPartJob`, `DerivationJob`, and `SourceJob`, enhancing the data processing capabilities.
- `SourceWithFilter` to enhance data modeling capabilities within the orchestration API.

Bug Fixes
Refactor