Conversation

@varant-zlai (Collaborator) commented Jan 31, 2025

## Summary

## Checklist

- Added Unit Tests
- Covered by existing CI
- Integration tested
- Documentation update

## Summary by CodeRabbit

- **New Features**
  - Introduced new classes for managing join operations, including `Bootstrap`, `BootstrapJob`, `JoinPartJob`, `DerivationJob`, and `SourceJob`, enhancing data processing capabilities.
  - Added functionality for generating bootstrap tables, handling derivation jobs, and managing skew filtering, improving the efficiency of join operations.
  - Added a new structure, `SourceWithFilter`, to enhance data modeling capabilities within the orchestration API.
- **Bug Fixes**
  - Improved error handling and logging during join processing to ensure robustness.
- **Refactor**
  - Streamlined the existing join logic by removing outdated methods and consolidating functionality, enhancing maintainability and clarity.

chewy-zlai and others added 30 commits November 21, 2024 10:32
… elastic search. (#65)

## Summary
Adds a Temporal service with Temporal admin tools, a Temporal UI, and
Elasticsearch to the Docker PoC setup.
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [x] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced new services: MySQL, Temporal, Temporal Admin Tools, and
Temporal UI.
- Added a new network, Temporal Network, to enhance service
communication.
- **Changes**
	- Adjusted port mapping for the Spark service to improve accessibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

- https://app.asana.com/0/1208785567265389/1208812512114700

- This PR addresses some flaky unit test behavior that we've been
observing in the zipline fork. See:
https://zipline-2kh4520.slack.com/archives/C072LUA50KA/p1732043073171339?thread_ts=1732042778.209419&cid=C072LUA50KA

- A previous [CI
test](https://github.com/zipline-ai/chronon/actions/runs/11946764068/job/33301642119?pr=72) shows that `other_spark_tests` intermittently fails for a couple of reasons. This PR addresses the flakiness of
[FeatureWithLabelJoinTest.testFinalViewsWithAggLabel](https://github.com/zipline-ai/chronon/blob/6cb6273551e024d6eecb068f754b510ae0aac464/spark/src/test/scala/ai/chronon/spark/test/FeatureWithLabelJoinTest.scala#L118),
where the test assertion sometimes fails with an unexpected result value.

### Synopsis

It looks like we did not preserve the original behavior during a rewrite/refactoring of the code. The results start to diverge when computing label joins per partition range, in particular when we materialize the label join and [scan it
back](https://github.com/zipline-ai/chronon/blob/b64f44d57c90367ccfcb5d5c96327a1ef820e2b3/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L200).

In the OSS version, the
[scan](https://github.com/airbnb/chronon/blob/6968c5c29b6e48867f8c08f2b9b8281f09d47c16/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L192-L193)
applies a [partition
filter](https://github.com/airbnb/chronon/blob/6968c5c29b6e48867f8c08f2b9b8281f09d47c16/spark/src/main/scala/ai/chronon/spark/DataRange.scala#L102-L104).
We dropped these partition filters during the
[refactoring](c6a377c#diff-57b1d6132977475fa0e87a71f017e66f4a7c94f466f911b33e9178598c6c058dL97-R102)
on the Zipline side. As a result, the physical plans produced by the two scans differ:

```
// Zipline
== Physical Plan ==
*(1) ColumnarToRow
+- FileScan parquet spark_catalog.final_join.label_agg_table_listing_labels_agg[listing#53934L,is_active_max_5d#53935,label_ds#53936] Batched: true, DataFilters: [], Format: Parquet, Location: CatalogFileIndex(1 paths)[file:/tmp/chronon/spark-warehouse_6fcd3d/data/final_join.db/label_agg_t..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<listing:bigint,is_active_max_5d:int>
```

```
// OSS
== Physical Plan ==
Coalesce 1000
+- *(1) ColumnarToRow
   +- FileScan parquet final_join_xggqlu.label_agg_table_listing_labels_agg[listing#50981L,is_active_max_5d#50982,label_ds#50983] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/tmp/chronon/spark-warehouse_69002f/data/final_join_xggqlu.db/label_agg_ta..., PartitionFilters: [isnotnull(label_ds#50983), (label_ds#50983 >= 2022-10-07), (label_ds#50983 <= 2022-10-07)], PushedFilters: [], ReadSchema: struct<listing:bigint,is_active_max_5d:int>
```

Note that OSS has a non-empty partition filter, `PartitionFilters:
[isnotnull(label_ds#50983), (label_ds#50983 >= 2022-10-07),
(label_ds#50983 <= 2022-10-07)]`, whereas Zipline's is empty.

The fix is to add these partition filters back, as done in this PR.
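
For illustration, here is a minimal sketch of the kind of partition-filtered scan involved. This is plain Spark SQL with illustrative helper and table names, not the exact Chronon API:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Sketch: restrict the scan of a label table to a partition range.
// With these predicates on the partition column, Spark can prune partitions
// (non-empty PartitionFilters in the physical plan); without them, the whole
// table is scanned, as in the Zipline plan above.
def scanLabelRange(spark: SparkSession, table: String, startDs: String, endDs: String): DataFrame =
  spark
    .table(table)
    .where(s"label_ds >= '$startDs' AND label_ds <= '$endDs'")

// Illustrative usage:
// scanLabelRange(spark, "final_join.label_agg_table_listing_labels_agg", "2022-10-07", "2022-10-07")
```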





~### Abandoned Investigation~

~It looks like there is some non-determinism in computing one of the
intermediate DataFrames when computing label joins.
[`dropDuplicates`](https://github.com/zipline-ai/chronon/blob/6cb6273551e024d6eecb068f754b510ae0aac464/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L215)
seems to be operating on a compound row key, `rowIdentifier`, which
doesn't produce deterministic results, so we sometimes lose the
expected values. This
[change](https://github.com/airbnb/chronon/pull/380/files#diff-2c74cac973e1af38b615f654fee5b0261594a2b0005ecfd5a8f0941b8e348eedR156)
was introduced in OSS upstream almost 2 years ago. This
[test](airbnb/chronon#435) was contributed a couple of months later.~


~See the debugger local-values comparison below. The left side is a test
failure, and the right side is a test success.~


~<img width="1074" alt="Screenshot 2024-11-21 at 9 26 04 AM"
src="https://github.com/user-attachments/assets/0eba555c-43ab-48a6-bf61-bbb7b4fa2445">~


~Removing the `dropDuplicates` call allows the tests to pass.
However, it is unclear whether this produces the semantically correct
behavior, as the tests themselves seem~
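
For context on the (abandoned) hypothesis above, here is a minimal, self-contained sketch of why `dropDuplicates` on a subset of columns can be non-deterministic in Spark. The column names are illustrative and not the actual `rowIdentifier`:

```scala
import org.apache.spark.sql.SparkSession

object DropDuplicatesDemo extends App {
  val spark = SparkSession.builder().master("local[*]").appName("dropDuplicates-demo").getOrCreate()
  import spark.implicits._

  // Two rows share the same compound key (listing, label_ds) but carry different values.
  val df = Seq(
    (1L, "2022-10-07", 0),
    (1L, "2022-10-07", 1),
    (2L, "2022-10-07", 5)
  ).toDF("listing", "label_ds", "is_active_max_5d")

  // dropDuplicates on a key subset keeps an arbitrary row per key, so the surviving
  // is_active_max_5d for listing 1 can differ between runs; a plausible source of flaky assertions.
  df.dropDuplicates(Seq("listing", "label_ds")).show()

  spark.stop()
}
```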

## Checklist
- [x] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Reintroduced a testing method to validate label joins, ensuring
accuracy in data processing.

- **Improvements**
- Enhanced data retrieval logic for label joins, emphasizing unique
entries and clearer range specifications.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

- Load local resources irrespective of where the tests are currently
being run from. This allows us to run them from IntelliJ.
## Checklist
- [x] Added Unit Tests
- [x] Covered by existing CI
- [x] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
- Improved test robustness by replacing hardcoded file paths with
dynamic resource URI retrieval for loading test data.
  
- **Tests**
- Enhanced flexibility in test cases for locating resources, ensuring
consistent access regardless of the working directory.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
## Summary
Switches to `create-summary-dataset` and provides `conf-path` to `summarize-and-upload`.
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Updated command for creating summary datasets to improve clarity and
functionality.
	- Enhanced configuration handling for summary data uploads.

- **Bug Fixes**
- Maintained consistent error handling to ensure reliability during
execution.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

- Some of the `logger.info` invocations weren't producing output. That's mainly
because we don't have a logger implementation specified, at least in the
Spark build. This PR adds a `logback` implementation and a basic
configuration (a sketch of the build change is included at the end of this summary).

- This will allow us to see log messages on the command line. I tested this with:
`sbt "testOnly ai.chronon.spark.test.FeatureWithLabelJoinTest" | grep
"== Features DF =="`

- I also verified that the dependency tree shows the new logger
dependencies are present:

```
sbt spark/dependencyTree

[info] welcome to sbt 1.8.2 (Oracle Corporation Java 17.0.2)
[info] loading settings for project chronon-build from plugins.sbt ...
[info] loading project definition from /Users/thomaschow/zipline-ai/chronon/project
[info] loading settings for project root from build.sbt,version.sbt ...
[info] resolving key references (13698 settings) ...
[info] set current project to chronon (in build file:/Users/thomaschow/zipline-ai/chronon/)
[info] spark:spark_2.12:0.1.0-SNAPSHOT [S]
[info]   +-aggregator:aggregator_2.12:0.1.0-SNAPSHOT [S]
[info]   | +-api:api_2.12:0.1.0-SNAPSHOT [S]
[info]   | | +-org.scala-lang.modules:scala-collection-compat_2.12:2.11.0 [S]
[info]   | | +-org.scala-lang:scala-reflect:2.12.18 [S]
[info]   | |
[info]   | +-com.google.code.gson:gson:2.10.1
[info]   | +-org.apache.datasketches:datasketches-java:6.1.0
[info]   |   +-org.apache.datasketches:datasketches-memory:3.0.1
[info]   |
[info]   +-ch.qos.logback:logback-classic:1.2.11
[info]   | +-ch.qos.logback:logback-core:1.2.11
[info]   | +-org.slf4j:slf4j-api:1.7.36
[info]   |
[info]   +-com.google.guava:guava:33.3.1-jre
[info]   | +-com.google.code.findbugs:jsr305:3.0.2
[info]   | +-com.google.errorprone:error_prone_annotations:2.28.0
[info]   | +-com.google.guava:failureaccess:1.0.2
[info]   | +-com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-gu..
[info]   | +-com.google.j2objc:j2objc-annotations:3.0.0
[info]   | +-org.checkerframework:checker-qual:3.43.0
[info]   |
[info]   +-jakarta.servlet:jakarta.servlet-api:4.0.3
[info]   +-online:online_2.12:0.1.0-SNAPSHOT [S]
[info]     +-aggregator:aggregator_2.12:0.1.0-SNAPSHOT [S]
[info]     | +-api:api_2.12:0.1.0-SNAPSHOT [S]
[info]     | | +-org.scala-lang.modules:scala-collection-compat_2.12:2.11.0 [S]
[info]     | | +-org.scala-lang:scala-reflect:2.12.18 [S]
[info]     | |
[info]     | +-com.google.code.gson:gson:2.10.1
[info]     | +-org.apache.datasketches:datasketches-java:6.1.0
[info]     |   +-org.apache.datasketches:datasketches-memory:3.0.1
[info]     |
[info]     +-com.datadoghq:java-dogstatsd-client:4.4.1
[info]     | +-com.github.jnr:jnr-unixsocket:0.36
[info]     |   +-com.github.jnr:jnr-constants:0.9.17
[info]     |   +-com.github.jnr:jnr-enxio:0.30
[info]     |   | +-com.github.jnr:jnr-constants:0.9.17
[info]     |   | +-com.github.jnr:jnr-ffi:2.1.16
[info]     |   |   +-com.github.jnr:jffi:1.2.23
[info]     |   |   +-com.github.jnr:jnr-a64asm:1.0.0
[info]     |   |   +-com.github.jnr:jnr-x86asm:1.0.2
[info]     |   |   +-org.ow2.asm:asm-analysis:7.1
[info]     |   |   | +-org.ow2.asm:asm-tree:7.1
[info]     |   |   |   +-org.ow2.asm:asm:7.1
[info]     |   |   |
[info]     |   |   +-org.ow2.asm:asm-commons:7.1
[info]     |   |   | +-org.ow2.asm:asm-analysis:7.1
[info]     |   |   | | +-org.ow2.asm:asm-tree:7.1
[info]     |   |   | |   +-org.ow2.asm:asm:7.1
[info]     |   |   | |
[info]     |   |   | +-org.ow2.asm:asm-tree:7.1
[info]     |   |   | | +-org.ow2.asm:asm:7.1
[info]     |   |   | |
[info]     |   |   | +-org.ow2.asm:asm:7.1
[info]     |   |   |
[info]     |   |   +-org.ow2.asm:asm-tree:7.1
[info]     |   |   | +-org.ow2.asm:asm:7.1
[info]     |   |   |
[info]     |   |   +-org.ow2.asm:asm-util:7.1
[info]     |   |   | +-org.ow2.asm:asm-analysis:7.1
[info]     |   |   | | +-org.ow2.asm:asm-tree:7.1
[info]     |   |   | |   +-org.ow2.asm:asm:7.1
[info]     |   |   | |
[info]     |   |   | +-org.ow2.asm:asm-tree:7.1
[info]     |   |   | | +-org.ow2.asm:asm:7.1
[info]     |   |   | |
[info]     |   |   | +-org.ow2.asm:asm:7.1
[info]     |   |   |
[info]     |   |   +-org.ow2.asm:asm:7.1
[info]     |   |
[info]     |   +-com.github.jnr:jnr-ffi:2.1.16
[info]     |   | +-com.github.jnr:jffi:1.2.23
[info]     |   | +-com.github.jnr:jnr-a64asm:1.0.0
[info]     |   | +-com.github.jnr:jnr-x86asm:1.0.2
[info]     |   | +-org.ow2.asm:asm-analysis:7.1
[info]     |   | | +-org.ow2.asm:asm-tree:7.1
[info]     |   | |   +-org.ow2.asm:asm:7.1
[info]     |   | |
[info]     |   | +-org.ow2.asm:asm-commons:7.1
[info]     |   | | +-org.ow2.asm:asm-analysis:7.1
[info]     |   | | | +-org.ow2.asm:asm-tree:7.1
[info]     |   | | |   +-org.ow2.asm:asm:7.1
[info]     |   | | |
[info]     |   | | +-org.ow2.asm:asm-tree:7.1
[info]     |   | | | +-org.ow2.asm:asm:7.1
[info]     |   | | |
[info]     |   | | +-org.ow2.asm:asm:7.1
[info]     |   | |
[info]     |   | +-org.ow2.asm:asm-tree:7.1
[info]     |   | | +-org.ow2.asm:asm:7.1
[info]     |   | |
[info]     |   | +-org.ow2.asm:asm-util:7.1
[info]     |   | | +-org.ow2.asm:asm-analysis:7.1
[info]     |   | | | +-org.ow2.asm:asm-tree:7.1
[info]     |   | | |   +-org.ow2.asm:asm:7.1
[info]     |   | | |
[info]     |   | | +-org.ow2.asm:asm-tree:7.1
[info]     |   | | | +-org.ow2.asm:asm:7.1
[info]     |   | | |
[info]     |   | | +-org.ow2.asm:asm:7.1
[info]     |   | |
[info]     |   | +-org.ow2.asm:asm:7.1
[info]     |   |
[info]     |   +-com.github.jnr:jnr-posix:3.0.61
[info]     |     +-com.github.jnr:jnr-constants:0.9.17
[info]     |     +-com.github.jnr:jnr-ffi:2.1.16
[info]     |       +-com.github.jnr:jffi:1.2.23
[info]     |       +-com.github.jnr:jnr-a64asm:1.0.0
[info]     |       +-com.github.jnr:jnr-x86asm:1.0.2
[info]     |       +-org.ow2.asm:asm-analysis:7.1
[info]     |       | +-org.ow2.asm:asm-tree:7.1
[info]     |       |   +-org.ow2.asm:asm:7.1
[info]     |       |
[info]     |       +-org.ow2.asm:asm-commons:7.1
[info]     |       | +-org.ow2.asm:asm-analysis:7.1
[info]     |       | | +-org.ow2.asm:asm-tree:7.1
[info]     |       | |   +-org.ow2.asm:asm:7.1
[info]     |       | |
[info]     |       | +-org.ow2.asm:asm-tree:7.1
[info]     |       | | +-org.ow2.asm:asm:7.1
[info]     |       | |
[info]     |       | +-org.ow2.asm:asm:7.1
[info]     |       |
[info]     |       +-org.ow2.asm:asm-tree:7.1
[info]     |       | +-org.ow2.asm:asm:7.1
[info]     |       |
[info]     |       +-org.ow2.asm:asm-util:7.1
[info]     |       | +-org.ow2.asm:asm-analysis:7.1
[info]     |       | | +-org.ow2.asm:asm-tree:7.1
[info]     |       | |   +-org.ow2.asm:asm:7.1
[info]     |       | |
[info]     |       | +-org.ow2.asm:asm-tree:7.1
[info]     |       | | +-org.ow2.asm:asm:7.1
[info]     |       | |
[info]     |       | +-org.ow2.asm:asm:7.1
[info]     |       |
[info]     |       +-org.ow2.asm:asm:7.1
[info]     |
[info]     +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info]     +-com.fasterxml.jackson.core:jackson-databind:2.15.2
[info]     | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2
[info]     | +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info]     |
[info]     +-com.fasterxml.jackson.module:jackson-module-scala_2.12:2.15.2 [S]
[info]     | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2
[info]     | +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info]     | +-com.fasterxml.jackson.core:jackson-databind:2.15.2
[info]     | | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2
[info]     | | +-com.fasterxml.jackson.core:jackson-core:2.15.2
[info]     | |
[info]     | +-com.thoughtworks.paranamer:paranamer:2.8
[info]     |
[info]     +-com.github.ben-manes.caffeine:caffeine:3.1.8
[info]     | +-com.google.errorprone:error_prone_annotations:2.21.1 (evicted by: 2.28..
[info]     | +-com.google.errorprone:error_prone_annotations:2.28.0
[info]     | +-org.checkerframework:checker-qual:3.37.0 (evicted by: 3.43.0)
[info]     | +-org.checkerframework:checker-qual:3.43.0
[info]     |
[info]     +-net.jodah:typetools:0.6.3
[info]     +-org.rogach:scallop_2.12:5.1.0 [S]
[info]     +-org.scala-lang.modules:scala-java8-compat_2.12:1.0.2 [S]
```

- Additional steps are required for IntelliJ to behave the same way. I
needed to configure the classpath `-cp chronon.spark` in the run
configuration:
<img width="953" alt="Screenshot 2024-11-21 at 3 34 50 PM"
src="https://github.com/user-attachments/assets/aebbc466-a207-43d0-9f6f-a9bfa811eb66">

and the same for `ScalaTest`.

I updated the local setup to reflect this:
https://docs.google.com/document/d/1k9_aQ3tkW5wvzKyXSsWWPK6HZxX4t8zVPVThODpZqQs/edit?tab=t.0#heading=h.en6opahtqp7u
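
For reference, a minimal sketch of the kind of build change this involves. It assumes standard sbt syntax and the versions shown in the dependency tree above; the exact module placement in this repo's build files may differ:

```scala
// build.sbt (sketch, not the exact diff in this PR):
// add a concrete SLF4J backend so logger.info output actually gets emitted.
libraryDependencies += "ch.qos.logback" % "logback-classic" % "1.2.11"

// Keep the slf4j-api version consistent across modules to avoid mixing
// logging backends (version taken from the dependency tree above).
dependencyOverrides += "org.slf4j" % "slf4j-api" % "1.7.36"
```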

## Checklist
- [x] Added Unit Tests
- [x] Covered by existing CI
- [x] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new logging configuration with a `logback.xml` file for
enhanced logging capabilities.
- Added support for overriding dependencies in the build configuration
for improved dependency management.

- **Bug Fixes**
- Ensured consistent logging library versions across the project to
avoid potential conflicts.

- **Chores**
- Streamlined dependency declarations for better organization within the
build configuration.
	- Improved logging feedback during the build process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Hopefully the last massive PR, since we have a lot of the baseline
implemented here. Here is a [video
walkthrough](https://drive.google.com/file/d/1gWpBHD7sDt2Kz7net73w1H-lY8rzVt3p/view?usp=sharing).

Main changes:
- Figma match (left sidebar, models table, observability page)
- Geist font is used in ECharts
- Geist font has proper smoothing to match Figma
- Custom tooltip treatment below the chart on hover (hold Cmd to lock and
keep it open)
- Drill-down charts using sample data

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
- Introduced new components: `ActionButtons`, `TrueFalseBadge`,
`CustomEChartLegend`, `EChartTooltip`, and `InfoTooltip`.
	- Enhanced date range selection with new options and improved handling.
- Added custom tooltip functionality for charts, improving
interactivity.
- Implemented a new `load` function for redirecting users from the root
path to `/models`.

- **Improvements**
- Updated styling for various components, enhancing visual consistency
and user experience.
	- Refactored navigation and layout components for better usability.
- Enhanced chart interactions and visibility management in the model
performance visualization.
	- Improved color management system with new CSS custom properties.
- Updated font size and color configurations in Tailwind CSS for better
customization.

- **Bug Fixes**
	- Corrected typos and improved variable naming for clarity.

- **Chores**
- Updated dependencies and improved documentation for better
maintainability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Fixes https://github.com/zipline-ai/chronon/security/dependabot/5

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the `package.json` to include an `overrides` section for the
`cross-spawn` package, specifying version `^7.0.6`.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Ironing out a couple of bugs in drift metrics.

## Checklist
- [x] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced error handling for percentiles metric to manage null values
effectively.
- Improved logging and filtering in the summarization process for better
clarity and performance.

- **Bug Fixes**
- Strengthened test assertions for drift and summary series to ensure
data integrity and accuracy.

- **Tests**
- Updated test logic to aggregate null counts and total entries,
enhancing the robustness of the testing framework.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Updated the `.gitignore` file to exclude Elastic Search-related data
from version control.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
This makes `DYNAMO_ENDPOINT` and `AWS_DEFAULT_REGION` optional values
instead of required. It also allows the app and frontend to be on
different IP addresses, managed by Kubernetes.
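
As a rough sketch of what the optional handling can look like (AWS SDK v2 builder calls from Scala; the actual initialization code in this PR may differ):

```scala
import java.net.URI

import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.dynamodb.DynamoDbClient

// Only apply endpoint/region overrides when the environment variables are set,
// instead of throwing immediately when they are missing.
def buildDynamoDbClient(): DynamoDbClient = {
  val builder = DynamoDbClient.builder()

  sys.env.get("DYNAMO_ENDPOINT").foreach(endpoint => builder.endpointOverride(URI.create(endpoint)))
  sys.env.get("AWS_DEFAULT_REGION").foreach(region => builder.region(Region.of(region)))

  // With no overrides, the SDK falls back to its default region and credential resolution.
  builder.build()
}
```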

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced handling of allowed hosts and CORS settings for better
integration with various services.
- **Bug Fixes**
- Improved flexibility in the initialization logic of the
DynamoDbClient, allowing for optional environment variables without
throwing immediate exceptions.
- **Chores**
- Updated environment variable declarations in the Dockerfile for
standardized syntax.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced command dialog functionality with a new message display for
empty search results.
- Improved user feedback when searches yield no results, providing a
clearer indication of the outcome.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
![snyk-top-banner](https://github.com/andygongea/OWASP-Benchmark/assets/818805/c518c423-16fe-447e-b67f-ad5a49b5d123)

### Snyk has created this PR to fix 1 vulnerability in the pip dependencies of this project.

#### Snyk changed the following file(s):

- `docker-init/requirements.txt`



<details>
<summary>⚠️ <b>Warning</b></summary>

```
boto3 1.28.62 has requirement botocore<1.32.0,>=1.31.62, but you have botocore 1.33.13.
```

</details>





---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with
your project.
> - Max score is 1000. Note that the real score may have changed since
the PR was raised.
> - This PR was automatically created by Snyk using the credentials of a
real user.
> - Snyk has automatically assigned this pull request, [set who gets
assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings/integration).
> - Some vulnerabilities couldn't be fully fixed and so Snyk will still
find them when the project is tested again. This may be because the
vulnerability existed within more than one direct dependency, but not
all of the affected dependencies could be upgraded.

---

**Note:** _You are seeing this because you or someone else with access
to this repository has authorized Snyk to open fix PRs._

For more information:
🧐 [View latest project
report](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr)
👩‍💻 [Set who automatically gets
assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings/integration)
📜 [Customise PR
templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
🛠 [Adjust project
settings](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings)
📚 [Read about Snyk's upgrade
logic](https://support.snyk.io/hc/en-us/articles/360003891078-Snyk-patches-to-fix-vulnerabilities)

---

**Learn how to fix vulnerabilities with free interactive lessons:**

🦉 [Learn about vulnerability in an interactive lesson of Snyk
Learn.](https://learn.snyk.io/?loc&#x3D;fix-pr)


Co-authored-by: snyk-bot <[email protected]>
Co-authored-by: tchow <[email protected]>
![snyk-top-banner](https://github.com/andygongea/OWASP-Benchmark/assets/818805/c518c423-16fe-447e-b67f-ad5a49b5d123)


<h3>Snyk has created this PR to upgrade tailwind-merge from 2.5.3 to
2.5.4.</h3>

:information_source: Keep your dependencies up-to-date. This makes it
easier to fix existing vulnerabilities and to more quickly identify and
fix newly disclosed vulnerabilities when they affect your project.

<hr/>


- The recommended version is **4 versions** ahead of your current
version.

- The recommended version was released **a month ago**.



<details>
<summary><b>Release notes</b></summary>
<br/>
  <details>
    <summary>Package name: <b>tailwind-merge</b></summary>

- **2.5.4** - [2024-10-14](https://github.com/dcastil/tailwind-merge/releases/tag/v2.5.4)
  - Bug Fixes: Fix incorrect paths within sourcemaps by [dcastil](https://github.com/dcastil) in [dcastil/tailwind-merge#483](https://github.com/dcastil/tailwind-merge/pull/483)
  - **Full Changelog**: [v2.5.3...v2.5.4](https://github.com/dcastil/tailwind-merge/compare/v2.5.3...v2.5.4)
  - Thanks to brandonmcconnell, manavm1990, langy, jamesreaco, roboflow and codecov for sponsoring tailwind-merge! ❤️
- **2.5.4-dev.aac29dcdc25353cd05d708b8528c844a335ac25f** - 2024-10-20
- **2.5.4-dev.a57f245d6ae3ce80627d4546940972f6e140ead3** - 2024-10-14
- **2.5.4-dev.4dc0491f877f97cd5b9d7cc6d0bb87c385a0def8** - 2024-10-20
- **2.5.3** - [2024-10-03](https://github.com/dcastil/tailwind-merge/releases/tag/v2.5.3)
  - Bug Fixes: Add missing logical border color properties by [sherlockdoyle](https://github.com/sherlockdoyle) in [dcastil/tailwind-merge#478](https://github.com/dcastil/tailwind-merge/pull/478)
  - Documentation: Add benchmark reporting to PRs and commits by [XantreDev](https://github.com/XantreDev) in [dcastil/tailwind-merge#455](https://github.com/dcastil/tailwind-merge/pull/455)
  - Other: Switch test suite to vitest by [dcastil](https://github.com/dcastil) in [dcastil/tailwind-merge#461](https://github.com/dcastil/tailwind-merge/pull/461)
  - **Full Changelog**: [v2.5.2...v2.5.3](https://github.com/dcastil/tailwind-merge/compare/v2.5.2...v2.5.3)
  - Thanks to brandonmcconnell, manavm1990, langy, jamesreaco, roboflow, xeger and MrDeatHHH for sponsoring tailwind-merge! ❤️

from [tailwind-merge GitHub release notes](https://github.com/dcastil/tailwind-merge/releases)
  </details>
</details>

---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with
your project.
> - This PR was automatically created by Snyk using the credentials of a
real user.
> - Snyk has automatically assigned this pull request, [set who gets
assigned](/settings/integration).

---

**Note:** _You are seeing this because you or someone else with access
to this repository has authorized Snyk to open upgrade PRs._

**For more information:**

> - 🧐 [View latest project
report](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)
> - 👩‍💻 [Set who automatically gets
assigned](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr/)
> - 📜 [Customise PR
templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
> - 🛠 [Adjust upgrade PR
settings](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)
> - 🔕 [Ignore this dependency or unsubscribe from future upgrade
PRs](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?pkg&#x3D;tailwind-merge&amp;utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr#auto-dep-upgrades)


Co-authored-by: snyk-bot <[email protected]>
Co-authored-by: tchow <[email protected]>
## Summary
I have moved everything from `/models/model_name` to `/joins/join_name`.
I also created a shared entity object and show groupbys and joins in
search results. PR walkthrough video
[here](https://drive.google.com/file/d/10lnso4MGXuXlmr5F-aLzDBwWncGBuEmt/view?usp=sharing)

Limitations:
- You can't click on a model or groupby from search
- Backend search only queries models (so matches to joins or groupbys do
not come up)
([details](https://github.com/zipline-ai/chronon/pull/82/files#r1855120136))

Future:
- Removing anything not related to joins (model performance, skew, etc)

## Checklist
- [x] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
- Enhanced navigation with dynamic filtering of entities in the
NavigationBar.
- Introduced a detailed table view for "Joins" displaying relevant model
information.

- **Bug Fixes**
  - Updated redirection from the root URL to the "Joins" page.

- **Removals**
- Removed outdated placeholder components for "GroupBys" and "Models"
pages.

These updates improve user navigation and provide a more informative
interface for managing joins and models.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

Ran into this strange, hard-to-reproduce error with a new version of
Vite. When switching branches, the frontend occasionally fails to load on
the local dev server. Kept getting this message:
```
The file does not exist at "/Users/kenmorton/Documents/Code/chronon/frontend/node_modules/.vite/deps/chunk-EPOQRJ6F.js?v=efc5098d" which is in the optimize deps directory. The dependency might be incompatible with the dep optimizer. Try adding it to `optimizeDeps.exclude`. (x11)
```

Found some folks running into the same thing - they recommended [this
fix](vitejs/vite#17738 (comment))
and it seems to work so far.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated dependency optimization settings to enhance build performance
by excluding the `.vite` directory instead of `node_modules/.cache`.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…rn group across 1 directory (#88)

Bumps the npm_and_yarn group with 1 update in the /frontend directory:
[@sveltejs/kit](https://github.com/sveltejs/kit/tree/HEAD/packages/kit).

Updates `@sveltejs/kit` from 2.6.2 to 2.8.3
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/sveltejs/kit/releases"><code>@​sveltejs/kit</code>'s
releases</a>.</em></p>
<blockquote>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.8.3</h2>
<h3>Patch Changes</h3>
<ul>
<li>
<p>fix: ensure error messages are escaped (<a
href="https://github.com/sveltejs/kit/pull/13050">#13050</a>)</p>
</li>
<li>
<p>fix: escape values included in dev 404 page (<a
href="https://github.com/sveltejs/kit/pull/13039">#13039</a>)</p>
</li>
</ul>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.8.2</h2>
<h3>Patch Changes</h3>
<ul>
<li>
<p>fix: prevent duplicate fetch request when using Request with load
function's fetch (<a
href="https://github.com/sveltejs/kit/pull/13023">#13023</a>)</p>
</li>
<li>
<p>fix: do not override default cookie decoder to allow users to
override the <code>cookie</code> library version (<a
href="https://github.com/sveltejs/kit/pull/13037">#13037</a>)</p>
</li>
</ul>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.8.1</h2>
<h3>Patch Changes</h3>
<ul>
<li>
<p>fix: only add nonce to <code>script-src-elem</code>,
<code>style-src-attr</code> and <code>style-src-elem</code> CSP
directives when <code>unsafe-inline</code> is not present (<a
href="https://github.com/sveltejs/kit/pull/11613">#11613</a>)</p>
</li>
<li>
<p>fix: support HTTP/2 in dev and production. Revert the changes from <a
href="https://github.com/sveltejs/kit/pull/12907">#12907</a> to
downgrade HTTP/2 to TLS as now being unnecessary (<a
href="https://github.com/sveltejs/kit/pull/12989">#12989</a>)</p>
</li>
</ul>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.8.0</h2>
<h3>Minor Changes</h3>
<ul>
<li>feat: add helper to identify <code>ActionFailure</code> objects (<a
href="https://github.com/sveltejs/kit/pull/12878">#12878</a>)</li>
</ul>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.7.7</h2>
<h3>Patch Changes</h3>
<ul>
<li>fix: update link in JSDoc (<a
href="https://github.com/sveltejs/kit/pull/12963">#12963</a>)</li>
</ul>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.7.6</h2>
<h3>Patch Changes</h3>
<ul>
<li>fix: update broken links in JSDoc (<a
href="https://github.com/sveltejs/kit/pull/12960">#12960</a>)</li>
</ul>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.7.5</h2>
<h3>Patch Changes</h3>
<ul>
<li>
<p>fix: warn on invalid cookie name characters (<a
href="https://github.com/sveltejs/kit/pull/12806">#12806</a>)</p>
</li>
<li>
<p>fix: when using <code>@vitejs/plugin-basic-ssl</code>, set a no-op
proxy config to downgrade from HTTP/2 to TLS since <code>undici</code>
does not yet enable HTTP/2 by default (<a
href="https://github.com/sveltejs/kit/pull/12907">#12907</a>)</p>
</li>
</ul>
<h2><code>@​sveltejs/kit</code><a
href="https://github.com/2"><code>@​2</code></a>.7.4</h2>
<h3>Patch Changes</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/sveltejs/kit/blob/main/packages/kit/CHANGELOG.md"><code>@​sveltejs/kit</code>'s
changelog</a>.</em></p>
<blockquote>
<h2>2.8.3</h2>
<h3>Patch Changes</h3>
<ul>
<li>
<p>fix: ensure error messages are escaped (<a
href="https://github.com/sveltejs/kit/pull/13050">#13050</a>)</p>
</li>
<li>
<p>fix: escape values included in dev 404 page (<a
href="https://github.com/sveltejs/kit/pull/13039">#13039</a>)</p>
</li>
</ul>
<h2>2.8.2</h2>
<h3>Patch Changes</h3>
<ul>
<li>
<p>fix: prevent duplicate fetch request when using Request with load
function's fetch (<a
href="https://github.com/sveltejs/kit/pull/13023">#13023</a>)</p>
</li>
<li>
<p>fix: do not override default cookie decoder to allow users to
override the <code>cookie</code> library version (<a
href="https://github.com/sveltejs/kit/pull/13037">#13037</a>)</p>
</li>
</ul>
<h2>2.8.1</h2>
<h3>Patch Changes</h3>
<ul>
<li>
<p>fix: only add nonce to <code>script-src-elem</code>,
<code>style-src-attr</code> and <code>style-src-elem</code> CSP
directives when <code>unsafe-inline</code> is not present (<a
href="https://github.com/sveltejs/kit/pull/11613">#11613</a>)</p>
</li>
<li>
<p>fix: support HTTP/2 in dev and production. Revert the changes from <a
href="https://github.com/sveltejs/kit/pull/12907">#12907</a> to
downgrade HTTP/2 to TLS as now being unnecessary (<a
href="https://github.com/sveltejs/kit/pull/12989">#12989</a>)</p>
</li>
</ul>
<h2>2.8.0</h2>
<h3>Minor Changes</h3>
<ul>
<li>feat: add helper to identify <code>ActionFailure</code> objects (<a
href="https://github.com/sveltejs/kit/pull/12878">#12878</a>)</li>
</ul>
<h2>2.7.7</h2>
<h3>Patch Changes</h3>
<ul>
<li>fix: update link in JSDoc (<a
href="https://github.com/sveltejs/kit/pull/12963">#12963</a>)</li>
</ul>
<h2>2.7.6</h2>
<h3>Patch Changes</h3>
<ul>
<li>fix: update broken links in JSDoc (<a
href="https://github.com/sveltejs/kit/pull/12960">#12960</a>)</li>
</ul>
<h2>2.7.5</h2>
<h3>Patch Changes</h3>
<ul>
<li>fix: warn on invalid cookie name characters (<a
href="https://github.com/sveltejs/kit/pull/12806">#12806</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/sveltejs/kit/commit/429bfb74fe823ea13a5fa0547dcf4cd6bb358a93"><code>429bfb7</code></a>
Version Packages (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13049">#13049</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/134e36343ef57ed7e6e2b3bb9e7f05ad37865794"><code>134e363</code></a>
fix: ensure error messages are escaped (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13050">#13050</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/d338d4635a7fd947ba5112df6ee632c4a0979438"><code>d338d46</code></a>
fix: escape values included in dev 404 page (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13039">#13039</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/5f8399d88fd9461a6111e03e6168067fba42e2c1"><code>5f8399d</code></a>
Version Packages (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13024">#13024</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/1358cccd52190df3c74bdd8970dbfb06ffc4ec72"><code>1358ccc</code></a>
fix: use default cookie decoder (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13037">#13037</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/570562b74d9e9f295d9b617478088a650f51e96b"><code>570562b</code></a>
fix: handle empty Headers when serialising Request passed to fetch (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/13023">#13023</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/435984bf61b047d1e1a8efe88354ca7ac4e9109f"><code>435984b</code></a>
Version Packages (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12992">#12992</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/0bd4426944ce9995b86199900e39c8d3929fa2f2"><code>0bd4426</code></a>
fix: support custom servers using HTTP/2 in production (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12989">#12989</a>)</li>
<li><a
href="https://github.com/sveltejs/kit/commit/6df00fc8448bf72c91e8f6faee0605995b0fdd65"><code>6df00fc</code></a>
fix: csp nonce in <code>script-src-elem</code>,
<code>style-src-attr</code> and <code>style-src-elem</code> wh...</li>
<li><a
href="https://github.com/sveltejs/kit/commit/c717db91236c7ab15045b296c73201c6c6ecd6fa"><code>c717db9</code></a>
chore: update playground and add an endpoint (<a
href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12983">#12983</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/sveltejs/kit/commits/@sveltejs/[email protected]/packages/kit">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=@sveltejs/kit&package-manager=npm_and_yarn&previous-version=2.6.2&new-version=2.8.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.


---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/zipline-ai/chronon/network/alerts).

</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ken Morton <[email protected]>
![snyk-top-banner](https://github.com/andygongea/OWASP-Benchmark/assets/818805/c518c423-16fe-447e-b67f-ad5a49b5d123)

### Snyk has created this PR to fix 9 vulnerabilities in the pip
dependencies of this project.

#### Snyk changed the following file(s):

- `docker-init/requirements.txt`



<details>
<summary>⚠️ <b>Warning</b></summary>

```
boto3 1.28.62 has requirement botocore<1.32.0,>=1.31.62, but you have botocore 1.33.13.
```

</details>





---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with
your project.
> - Max score is 1000. Note that the real score may have changed since
the PR was raised.
> - This PR was automatically created by Snyk using the credentials of a
real user.
> - Snyk has automatically assigned this pull request, [set who gets
assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings/integration).
> - Some vulnerabilities couldn't be fully fixed and so Snyk will still
find them when the project is tested again. This may be because the
vulnerability existed within more than one direct dependency, but not
all of the affected dependencies could be upgraded.

---

**Note:** _You are seeing this because you or someone else with access
to this repository has authorized Snyk to open fix PRs._

For more information:
🧐 [View latest project
report](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr)
👩‍💻 [Set who automatically gets
assigned](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings/integration)
📜 [Customise PR
templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
🛠 [Adjust project
settings](https://app.snyk.io/org/varant-zlai/project/736d6244-1782-4006-a12e-fdfbd8a4a213?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings)
📚 [Read about Snyk's upgrade
logic](https://support.snyk.io/hc/en-us/articles/360003891078-Snyk-patches-to-fix-vulnerabilities)

---

**Learn how to fix vulnerabilities with free interactive lessons:**

🦉 [Improper Input
Validation](https://learn.snyk.io/lesson/improper-input-validation/?loc&#x3D;fix-pr)
🦉 [Improper Limitation of a Pathname to a Restricted Directory
(&#x27;Path
Traversal&#x27;)](https://learn.snyk.io/lesson/directory-traversal/?loc&#x3D;fix-pr)
🦉 [Cross-site Scripting
(XSS)](https://learn.snyk.io/lesson/xss/?loc&#x3D;fix-pr)


Co-authored-by: snyk-bot <[email protected]>
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
	- Updated Docker command syntax for clarity.
	- Added note on required Docker version (20.10 or higher).

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced logging configuration for Spark sessions to reduce verbosity.
	- Improved timing and error handling in the data generation script.
- New method introduced for alternative streaming data handling in
`OnlineUtils`.
- Added a demonstration object for observability features in Spark
applications.
	- New configuration file for structured logging setup.

- **Bug Fixes**
- Adjusted method signatures to ensure clarity and correct parameter
usage in various classes.

- **Documentation**
- Updated import statements to reflect package restructuring for better
organization.
- Added instructions for building and executing the project in the
README.

- **Tests**
- Integrated `MockApi` into various test classes to enhance testing
capabilities and simulate API interactions.
- Enhanced test coverage by utilizing the `MockApi` for more robust
testing scenarios.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
fixes https://github.com/zipline-ai/chronon/security/dependabot/2

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the `frontend` project to include a new dependency on the
`cookie` package (version `^0.7.0`).

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
- Update devnotes based on onboarding doc. 

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
	- Enhanced setup instructions for the Chronon project.
- Expanded prerequisites section with environment variable
configurations.
- Clarified installation instructions for Thrift, emphasizing version
compatibility.
	- Added guidance for installing Java, Scala, and Python using `asdf`.
- Restructured IntelliJ configuration instructions for improved clarity.
- Updated troubleshooting section with commands for project cleaning and
assembly.
- Elaborated on the release process for artifact publishing and code
pushing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
more null handling
persisting cardinality map to remove inconsistent compute of cardinality
map

## Checklist
- [x] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
- Introduced new constants for `FetchTimeout` (10 minutes) and
`DefaultCharset` (UTF-8).
- Enhanced the `Summarizer` class to utilize an API for key-value store
operations, improving data management.
- Updated the `ObservabilityDemo` to include new time series fetching
capabilities.
- Added a new method `highlight` for string formatting in the
`ColorPrinter`.

- **Bug Fixes**
- Improved null handling in `getSummaries`, `pivot`, `reportKvResponse`,
and `multiGet` methods to prevent potential null pointer exceptions.

- **Documentation**
- Updated logging configuration for enhanced readability and error
management.

- **Tests**
- Increased sample data generation in tests to improve coverage and
accuracy.
- Enhanced clarity of test setups in `GroupByUploadTest` with better
data labeling.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

- Split `test_scala_and_python.yaml` into python, spark scala, and no
spark scala tests.
- https://app.asana.com/0/1208785567265389/1208854398566912/f

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced automated testing workflows for Python and Scala modules,
enhancing continuous integration.
- Added specific workflows for testing non-Spark Scala modules, Spark
modules, and formatting checks for Scala files.

- **Chores**
- Implemented concurrency settings to manage workflow runs and optimize
testing efficiency.
- Updated logging configuration to reduce verbosity and focus on error
messages.
- Removed the combined testing workflow for Python and Scala,
streamlining the testing process.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary


Testing this locally:

I see no spark logging churn. Looks like it preserves @nikhil-zlai 's
PR's behavior: #96

<img width="760" alt="Screenshot 2024-11-26 at 7 55 03 PM"
src="https://github.com/user-attachments/assets/844a44e1-c769-4089-b245-a86d138e1d1a">



## Checklist
- [x] Added Unit Tests
- [x] Covered by existing CI
- [x] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new logging configuration using Log4j2, enhancing logging
capabilities and readability.

- **Bug Fixes**
- Removed outdated logging configuration references, streamlining Docker
container execution and Spark application setup.

- **Chores**
- Updated dependency management to replace Logback with Log4j2 for
consistent logging behavior across the project.
- Enhanced CI/CD workflows to trigger on changes to the `build.sbt`
file, improving responsiveness to updates.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

putting up a simple netty server (no deps added to sbt) and a streamlit
app to have a fast loop for iterating / debugging.

This is how it looks:
<img width="1624" alt="Screenshot 2024-11-26 at 11 03 38 PM"
src="https://github.com/user-attachments/assets/d11c8fac-79b7-4749-bba5-e71e09fa0a72">


Updated docs too.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [x] Integration tested
- [x] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a Streamlit application for visualizing data from an API
endpoint.
- Added a new HTTP server to handle requests related to drift and
summary series, with endpoints for health checks and data retrieval.

- **Improvements**
	- Exposed port 8181 for external access to the Spark application.
- Updated the documentation with clearer instructions for building and
running the application.
- Updated the default value for the start date in configuration
settings.

- **Bug Fixes**
- Enhanced error handling in the data loading process within the
Streamlit app.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
…ontend (#95)

## Summary
Builds on a couple of the summary computation PRs and data generation to
wire things up so that Hub can serve them.
* Yanked out mock data based endpoints (model perf / drift, join &
feature skew) - decided it would be confusing to have a mix of mock and
generated data so we just have the generated data served
* Dropped a few of the scripts introduced in
#87. We bring up our
containers the way and we have a script `load_summaries.sh` that we can
trigger that leverages the existing app container to load data.
* DDB ingestion was taking too long and we were dropping a lot of data
due to rejected execution exceptions. To unblock for now, we've gone
with an approach of making a bulk put HTTP call from the
ObservabilityDemo app -> Hub and Hub utilizing a InMemoryKV store to
persist and serve up features.
* Added an endpoint to serve the join that are configured as we've
switched from the model based world.

There's still an issue to resolve around fetching individual feature
series data. Once I resolve that, we can switch this PR out of wip mode.

To test / run:
start up our docker containers:
```
$ docker-compose -f docker-init/compose.yaml up --build
...
```
In a different term load data:
```
$ ./docker-init/demo/load_summaries.sh 
Done uploading summaries! 🥳
```

You can now curl join & feature time series data.
Join drift (null ratios)
```
curl -X GET   'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=null&offset=10h&algorithm=psi'
```

Join drift (value drift)
```
curl -X GET   'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=10h&algorithm=psi'
```

Feature drift:
```
curl -X GET   'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/feature/dim_user_account_type/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=1D&algorithm=psi&granularity=aggregates'
```

Feature summaries:
```
curl -X GET   'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join/feature/dim_user_account_type/timeseries?startTs=1673308800000&endTs=1674172800000&metricType=drift&metrics=value&offset=1D&algorithm=psi&granularity=percentile'
```

Join metadata
```
curl -X GET 'http://localhost:9000/api/v1/joins'
curl -X GET 'http://localhost:9000/api/v1/join/risk.user_transactions.txn_join'
```

## Checklist
- [X] Added Unit Tests
- [ ] Covered by existing CI
- [X] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

- **New Features**
- Introduced a new `JoinController` for managing joins with pagination
support.
- Added functionality for an in-memory key-value store with bulk data
upload capabilities.
- Implemented observability demo data loading within a Spark
application.
- Added a new `HTTPKVStore` class for remote key-value store
interactions over HTTP.

- **Improvements**
- Enhanced the `ModelController` and `SearchController` to align with
the new join data structure.
- Updated the `TimeSeriesController` to support asynchronous operations
and improved error handling.
- Refined dependency management in the build configuration for better
clarity and maintainability.
- Updated API routes to include new endpoints for listing and retrieving
joins.
- Updated configuration to replace the `DynamoDBModule` with
`ModelStoreModule`, adding `InMemoryKVStoreModule` and
`DriftStoreModule`.

- **Documentation**
- Revised README instructions for Docker container setup and demo data
loading.
- Updated API routes documentation to reflect new endpoints for joins
and in-memory data operations.

- **Bug Fixes**
- Resolved issues related to error handling in various controllers and
improved logging for better traceability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: nikhil-zlai <[email protected]>
## Summary
Port of our OSS delta lake PR -
airbnb/chronon#869. Largely the same aside from
delta lake versions. We don't need this immediately atm but we'll need
this if we have other users come along that need delta lake (or we need
to add support for formats like hudi)

## Checklist
- [X] Added Unit Tests
- [X] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for Delta Lake operations with new dependencies and
configurations.
- Introduced new traits and case objects for handling different table
formats, enhancing data management capabilities.
- Added a new job in the CI workflow for testing Delta Lake format
functionality.

- **Bug Fixes**
	- Improved error handling in class registration processes.

- **Tests**
- Implemented a suite of unit tests for the `TableUtils` class to
validate partitioned data insertions with schema modifications.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
![snyk-top-banner](https://github.com/andygongea/OWASP-Benchmark/assets/818805/c518c423-16fe-447e-b67f-ad5a49b5d123)

### Snyk has created this PR to fix 1 vulnerabilities in the pip
dependencies of this project.

#### Snyk changed the following file(s):

- `quickstart/requirements.txt`



<details>
<summary>⚠️ <b>Warning</b></summary>

```
otebook 6.5.7 requires pyzmq, which is not installed.
jupyter-server 1.24.0 requires pyzmq, which is not installed.
jupyter-console 6.6.3 requires pyzmq, which is not installed.
jupyter-client 7.4.9 requires pyzmq, which is not installed.
ipykernel 6.16.2 requires pyzmq, which is not installed.
```

</details>





---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with
your project.
> - Max score is 1000. Note that the real score may have changed since
the PR was raised.
> - This PR was automatically created by Snyk using the credentials of a
real user.
> - Snyk has automatically assigned this pull request, [set who gets
assigned](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings/integration).
> - Some vulnerabilities couldn't be fully fixed and so Snyk will still
find them when the project is tested again. This may be because the
vulnerability existed within more than one direct dependency, but not
all of the affected dependencies could be upgraded.

---

**Note:** _You are seeing this because you or someone else with access
to this repository has authorized Snyk to open fix PRs._

For more information: <img
src="https://api.segment.io/v1/pixel/track?data=eyJ3cml0ZUtleSI6InJyWmxZcEdHY2RyTHZsb0lYd0dUcVg4WkFRTnNCOUEwIiwiYW5vbnltb3VzSWQiOiIwMzFkYThmYS1hY2ZmLTQ5OTgtOWM3NS04YjhlZDAxNTU1YmUiLCJldmVudCI6IlBSIHZpZXdlZCIsInByb3BlcnRpZXMiOnsicHJJZCI6IjAzMWRhOGZhLWFjZmYtNDk5OC05Yzc1LThiOGVkMDE1NTViZSJ9fQ=="
width="0" height="0"/>
🧐 [View latest project
report](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr)
👩‍💻 [Set who automatically gets
assigned](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings/integration)
📜 [Customise PR
templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
🛠 [Adjust project
settings](https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;fix-pr/settings)
📚 [Read about Snyk's upgrade
logic](https://support.snyk.io/hc/en-us/articles/360003891078-Snyk-patches-to-fix-vulnerabilities)

---

**Learn how to fix vulnerabilities with free interactive lessons:**

🦉 [Regular Expression Denial of Service
(ReDoS)](https://learn.snyk.io/lesson/redos/?loc&#x3D;fix-pr)

[//]: #
'snyk:metadata:{"customTemplate":{"variablesUsed":[],"fieldsUsed":[]},"dependencies":[{"name":"tornado","from":"6.2","to":"6.4.2"}],"env":"prod","issuesToFix":["SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708","SNYK-PYTHON-TORNADO-8400708"],"prId":"031da8fa-acff-4998-9c75-8b8ed01555be","prPublicId":"031da8fa-acff-4998-9c75-8b8ed01555be","packageManager":"pip","priorityScoreList":[631],"projectPublicId":"e1ca9fce-fa39-4376-afef-0fb43b4e13d3","projectUrl":"https://app.snyk.io/org/varant-zlai/project/e1ca9fce-fa39-4376-afef-0fb43b4e13d3?utm_source=github&utm_medium=referral&page=fix-pr","prType":"fix","templateFieldSources":{"branchName":"default","commitMessage":"default","description":"default","title":"default"},"templateVariants":["updated-fix-title","pr-warning-shown","priorityScore"],"type":"auto","upgrade":[],"vulns":["SNYK-PYTHON-TORNADO-8400708"],"patch":[],"isBreakingChange":false,"remediationStrategy":"vuln"}'

Co-authored-by: snyk-bot <[email protected]>
… build (#102)

## Summary
Speed up our local obs iteration flow by pulling all the forced sbt
clean + build steps outside the docker compose build flow. We don't need
to build the spark assembly, frontend and hub every time - we often just
need to build one / two of these. This PR pulls the build piece out of
the docker file so that we don't have to do it every time. Instead we
wrap the build in a script and invoke the relevant build targets. The
docker file copies the relevant artifacts over. This allows us to do
things like:

Just build the hub webservice:
```
 ./docker-init/build.sh --hub 
```
Just build the spark assemblies:
```
 ./docker-init/build.sh --spark
```

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [X] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a new build script (`build.sh`) for easier module building
and management.
- **Improvements**
- Simplified Dockerfile structure by removing multi-stage builds for
both application and frontend.
- Updated README to reflect new setup instructions and automated build
processes.
- Removed unnecessary service dependencies in the Docker Compose
configuration.
- **Documentation**
- Enhanced clarity and detail in README regarding Docker environment
setup and usage.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Add a isNumeric field to help the frontend code decide if a time series
response for a feature is numeric / categorical. Currently this is a bit
hacky and based on the label being a percentile string or not (p0,
p10,..).

## Checklist
- [ ] Added Unit Tests
- [X] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced handling of numeric and categorical features in time series
data.
- Introduced flags to indicate the numeric status of current and
baseline series.
  
- **Bug Fixes**
- Improved robustness in processing time series data for accurate
representation and analysis.

- **Documentation**
- Updated method signatures to reflect changes in handling numeric
features.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
varant-zlai and others added 7 commits February 10, 2025 09:43
![snyk-top-banner](https://github.com/andygongea/OWASP-Benchmark/assets/818805/c518c423-16fe-447e-b67f-ad5a49b5d123)


<h3>Snyk has created this PR to upgrade tailwind-variants from 0.3.0 to
0.3.1.</h3>

:information_source: Keep your dependencies up-to-date. This makes it
easier to fix existing vulnerabilities and to more quickly identify and
fix newly disclosed vulnerabilities when they affect your project.

<hr/>


- The recommended version is **1 version** ahead of your current
version.

- The recommended version was released **22 days ago**.



<details>
<summary><b>Release notes</b></summary>
<br/>
  <details>
    <summary>Package name: <b>tailwind-variants</b></summary>
    <ul>
      <li>
<b>0.3.1</b> - <a
href="https://github.com/heroui-inc/tailwind-variants/releases/tag/v0.3.1">2025-01-18</a></br><h2>What's
Changed</h2>
<ul>
<li>fix: github workflow by <a class="user-mention notranslate"
data-hovercard-type="user"
data-hovercard-url="/users/tianenpang/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/tianenpang">@ tianenpang</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2652096796" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#222"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/222/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/222">#222</a></li>
<li>chore: update repo link &amp; content by <a class="user-mention
notranslate" data-hovercard-type="user"
data-hovercard-url="/users/wingkwong/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/wingkwong">@ wingkwong</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2795337563" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#235"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/235/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/235">#235</a></li>
<li>chore: org name change by <a class="user-mention notranslate"
data-hovercard-type="user"
data-hovercard-url="/users/jrgarciadev/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/jrgarciadev">@ jrgarciadev</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2797166923" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#237"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/237/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/237">#237</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/wingkwong/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/wingkwong">@ wingkwong</a> made their
first contribution in <a class="issue-link js-issue-link"
data-error-text="Failed to load title" data-id="2795337563"
data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#235"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/235/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/235">#235</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a class="commit-link"
href="https://github.com/heroui-inc/tailwind-variants/compare/v0.3.0...v0.3.1"><tt>v0.3.0...v0.3.1</tt></a></p>
      </li>
      <li>
<b>0.3.0</b> - <a
href="https://github.com/heroui-inc/tailwind-variants/releases/tag/v0.3.0">2024-11-12</a></br><h2>What's
Changed</h2>
<ul>
<li>fix mergeObjects order by <a class="user-mention notranslate"
data-hovercard-type="user"
data-hovercard-url="/users/thefalked/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/thefalked">@ thefalked</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2196305299" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#172"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/172/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/172">#172</a></li>
<li>Add ESLint Jest plugin and update ESLint/Prettier by <a
class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/mskelton/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/mskelton">@ mskelton</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2198990776" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#173"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/173/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/173">#173</a></li>
<li>fix(transformer): add transformer config type to withTV function by
<a class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/jonathassardinha/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/jonathassardinha">@
jonathassardinha</a> in <a class="issue-link js-issue-link"
data-error-text="Failed to load title" data-id="2218792265"
data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#177"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/177/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/177">#177</a></li>
<li>docs: add <code>cva</code> to benchmarks by <a class="user-mention
notranslate" data-hovercard-type="user"
data-hovercard-url="/users/mskelton/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/mskelton">@ mskelton</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2229221713" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#178"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/178/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/178">#178</a></li>
<li>(fix): responsive variants for base when slots are present by <a
class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/w0ofy/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/w0ofy">@ w0ofy</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2357923964" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#202"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/202/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/202">#202</a></li>
<li>fix: treat undefined value for compoundVariants as false by <a
class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/Tokky0425/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/Tokky0425">@ Tokky0425</a> in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2459811451" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#210"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/210/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/210">#210</a></li>
<li>chore: tailwind-merge updated to v2.5.4</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/jonathassardinha/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/jonathassardinha">@
jonathassardinha</a> made their first contribution in <a
class="issue-link js-issue-link" data-error-text="Failed to load title"
data-id="2218792265" data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#177"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/177/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/177">#177</a></li>
<li><a class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/w0ofy/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/w0ofy">@ w0ofy</a> made their first
contribution in <a class="issue-link js-issue-link"
data-error-text="Failed to load title" data-id="2357923964"
data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#202"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/202/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/202">#202</a></li>
<li><a class="user-mention notranslate" data-hovercard-type="user"
data-hovercard-url="/users/Tokky0425/hovercard"
data-octo-click="hovercard-link-click"
data-octo-dimensions="link_type:self"
href="https://github.com/Tokky0425">@ Tokky0425</a> made their
first contribution in <a class="issue-link js-issue-link"
data-error-text="Failed to load title" data-id="2459811451"
data-permission-text="Title is private"
data-url="heroui-inc/tailwind-variants#210"
data-hovercard-type="pull_request"
data-hovercard-url="/heroui-inc/tailwind-variants/pull/210/hovercard"
href="https://github.com/heroui-inc/tailwind-variants/pull/210">#210</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a class="commit-link"
href="https://github.com/heroui-inc/tailwind-variants/compare/v0.2.1...v0.3.0"><tt>v0.2.1...v0.3.0</tt></a></p>
      </li>
    </ul>
from <a
href="https://github.com/heroui-inc/tailwind-variants/releases">tailwind-variants
GitHub release notes</a>
  </details>
</details>

---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with
your project.
> - This PR was automatically created by Snyk using the credentials of a
real user.
> - Snyk has automatically assigned this pull request, [set who gets
assigned](/settings/integration).

---

**Note:** _You are seeing this because you or someone else with access
to this repository has authorized Snyk to open upgrade PRs._

**For more information:** <img
src="https://api.segment.io/v1/pixel/track?data=eyJ3cml0ZUtleSI6InJyWmxZcEdHY2RyTHZsb0lYd0dUcVg4WkFRTnNCOUEwIiwiYW5vbnltb3VzSWQiOiJiNGU0NzAwMS0yY2IyLTRkZjItYmZiZS0wMTJlNmYyOWNhYmIiLCJldmVudCI6IlBSIHZpZXdlZCIsInByb3BlcnRpZXMiOnsicHJJZCI6ImI0ZTQ3MDAxLTJjYjItNGRmMi1iZmJlLTAxMmU2ZjI5Y2FiYiJ9fQ=="
width="0" height="0"/>

> - 🧐 [View latest project
report](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)
> - 👩‍💻 [Set who automatically gets
assigned](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr/)
> - 📜 [Customise PR
templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates?utm_source=&utm_content=fix-pr-template)
> - 🛠 [Adjust upgrade PR
settings](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr)
> - 🔕 [Ignore this dependency or unsubscribe from future upgrade
PRs](https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5/settings/integration?pkg&#x3D;tailwind-variants&amp;utm_source&#x3D;github&amp;utm_medium&#x3D;referral&amp;page&#x3D;upgrade-pr#auto-dep-upgrades)

[//]: #
'snyk:metadata:{"customTemplate":{"variablesUsed":[],"fieldsUsed":[]},"dependencies":[{"name":"tailwind-variants","from":"0.3.0","to":"0.3.1"}],"env":"prod","hasFixes":false,"isBreakingChange":false,"isMajorUpgrade":false,"issuesToFix":[],"prId":"b4e47001-2cb2-4df2-bfbe-012e6f29cabb","prPublicId":"b4e47001-2cb2-4df2-bfbe-012e6f29cabb","packageManager":"npm","priorityScoreList":[],"projectPublicId":"f4bdc116-d05b-4937-96b5-b1f9a02872e5","projectUrl":"https://app.snyk.io/org/varant-zlai/project/f4bdc116-d05b-4937-96b5-b1f9a02872e5?utm_source=github&utm_medium=referral&page=upgrade-pr","prType":"upgrade","templateFieldSources":{"branchName":"default","commitMessage":"default","description":"default","title":"default"},"templateVariants":[],"type":"auto","upgrade":[],"upgradeInfo":{"versionsDiff":1,"publishedDate":"2025-01-18T20:27:59.252Z"},"vulns":[]}'

Co-authored-by: snyk-bot <[email protected]>
## Summary
Quick change to standardize scroll styles across the app

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- Style  
- Enhanced the appearance of scrollbars with updated styling and
customizable color options.

- Refactor  
- Simplified scroll behavior by replacing custom scrolling components
with standard, CSS-managed scroll containers across the layout and key
content areas.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

isEmpty is somewhat expensive operation as it needs a partial table
scan. For the most part in joins we allow for empty dataframes, so we
can optimize the common path.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Summary by CodeRabbit

- **Refactor**
- Refined internal logic to streamline condition evaluations and
consolidated diagnostic messaging for more effective system monitoring.
These optimizations simplify internal processing while ensuring a
consistent user experience with no visible changes to public features.
Enhanced logging now provides improved insights into system operations
without impacting functionality. This update improves overall system
efficiency and clarity.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->

Co-authored-by: Thomas Chow <[email protected]>
## Summary
This PR wires up tiling support. Covers a few aspects:
* BigTable KV store changes to support tiling - we take requests for the
'_STREAMING' table for gets and puts using the TileKey thrift interface
and map to corresponding BT RowKey + timerange lookups. We've yanked out
event based support in the BT kv store. We're writing out data in the
Row + tile format documented here - [Option 1 - Tiles as Timestamped
Rows](https://docs.google.com/document/d/1wgzJVAkl5K1bBCr98WCZFiFeTTWqILdA3FTE7cz9Li4/edit?tab=t.0#bookmark=id.j54a5g8gj2m9).
* Add a Flag in the FlagStore to indicate if we're using Tiling / not.
Switched over the fetcher checks to use this instead of the prior
GrpByServingInfo.isTilingEnabled flag. Leverage this flag in Flink to
choose tiling / not. Set this flag to true in the GcpApi to always use
tiling.


## Checklist
- [X] Added Unit Tests
- [ ] Covered by existing CI
- [X] Integration tested - Tested on the Etsy side by running the job,
hitting some fetcher cli endpoints.
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced dynamic tiling capabilities for time series and streaming
data processing. This enhancement enables a configurable tiled data mode
that improves data retrieval granularity, processing consistency, and
overall query performance, resulting in more efficient and predictable
operations for end-users.
- Added new methods for constructing tile keys and row keys, enhancing
data management capabilities.
- Implemented flag-based control for enabling or disabling tiling in
various components, allowing for more flexible configurations.

- **Bug Fixes**
  - Corrected minor documentation errors in the FlagStore interface.

- **Tests**
- Expanded test coverage to validate new tiling functionalities and
ensure robustness in handling time series data.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: tchow-zlai <[email protected]>
Co-authored-by: Thomas Chow <[email protected]>
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Refactor**
- Removed the legacy transaction and risk analysis view to streamline
the user interface.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Couple of changes to get my feet wet with LayerChart and bring the chart
styles closer to what they were using ECharts.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Enhanced chart interactivity with improved tooltip behavior and legend
styling for a smoother, more engaging visualization experience.
- Added customizable options for axis configurations and highlighted
points, allowing for a more refined display of data trends.

- **Chores**
- Updated a core charting dependency to its latest version, contributing
to improved performance and stability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
	- Enhanced table output handling to support partitioned tables.
- Introduced configurable options for temporary storage and integration
settings, improving cloud-based table materialization.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->

---------

Co-authored-by: Thomas Chow <[email protected]>
@varant-zlai varant-zlai force-pushed the vz--refactor_join_simplify_bootstrap branch from 0e9ac8a to 0ee77c9 Compare February 11, 2025 01:46
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (2)
spark/src/main/scala/ai/chronon/spark/DerivationJob.scala (2)

27-32: ⚠️ Potential issue

Add parameter validation and replace table name placeholders.

Validate input parameters and define actual table names.

   def fromJoin(join: api.Join, dateRange: PartitionRange): DerivationJob = {
+    require(join != null, "join cannot be null")
+    require(dateRange != null, "dateRange cannot be null")
+    require(join.derivations != null, "derivations cannot be null")
     val baseOutputTable = "TODO" // Output of the base Join pre-derivation
     val finalOutputTable = "TODO" // The actual output table
     val derivations = join.derivations.asScala
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  def fromJoin(join: api.Join, dateRange: PartitionRange): DerivationJob = {
+    require(join != null, "join cannot be null")
+    require(dateRange != null, "dateRange cannot be null")
+    require(join.derivations != null, "derivations cannot be null")
    val baseOutputTable = "TODO" // Output of the base Join pre-derivation
    val finalOutputTable = "TODO" // The actual output table
    val derivations = join.derivations.asScala
    new DerivationJob(baseOutputTable, finalOutputTable, derivations, dateRange)
  }

20-25: ⚠️ Potential issue

Add parameter validation and replace table name placeholders.

Validate input parameters and define actual table names.

   def fromGroupBy(groupBy: api.GroupBy, dateRange: PartitionRange): DerivationJob = {
+    require(groupBy != null, "groupBy cannot be null")
+    require(dateRange != null, "dateRange cannot be null")
+    require(groupBy.derivations != null, "derivations cannot be null")
     val baseOutputTable = "TODO" // Output of the base GroupBy pre-derivation
     val finalOutputTable = "TODO" // The actual output table
     val derivations = groupBy.derivations.asScala
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

  def fromGroupBy(groupBy: api.GroupBy, dateRange: PartitionRange): DerivationJob = {
    require(groupBy != null, "groupBy cannot be null")
    require(dateRange != null, "dateRange cannot be null")
    require(groupBy.derivations != null, "derivations cannot be null")
    val baseOutputTable = "TODO" // Output of the base GroupBy pre-derivation
    val finalOutputTable = "TODO" // The actual output table
    val derivations = groupBy.derivations.asScala
    new DerivationJob(baseOutputTable, finalOutputTable, derivations, dateRange)
  }
🧹 Nitpick comments (5)
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)

69-74: Use constants for null values.

Replace hardcoded nulls with constants.

-    val nulls = Seq("null", "Null", "NULL")
+    private val NULL_VALUES = Seq("null", "Null", "NULL")
+    def generateSkewFilterSql(key: String, values: Seq[String]): String = {
+      val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(NULL_VALUES.contains).mkString(", ")})")
+      val nullFilters = if (values.exists(NULL_VALUES.contains)) Some(s"$key IS NOT NULL") else None
spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1)

72-119: Consider optimizing join operations.

The fold operation with joins could be optimized using broadcast joins for small tables.

spark/src/main/scala/ai/chronon/spark/Join.scala (3)

301-328: Remove commented code.

Good refactoring moving join part logic to a dedicated class. However, remove the commented-out code before finalizing the PR.

-                // val df =
-                // computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-                //  .map(df => joinPart -> df)
-

301-304: Remove commented out code.

Delete the commented out code as it's no longer needed.

-                // val df =
-                // computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-                //  .map(df => joinPart -> df)
-

301-328: Remove commented out code.

Since the code has been refactored to use JoinPartJob, the commented out code can be safely removed.

Apply this diff to remove the commented out code:

-                // val df =
-                // computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-                //  .map(df => joinPart -> df)
-
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 531477f and 0ee77c9.

📒 Files selected for processing (9)
  • api/thrift/orchestration.thrift (1 hunks)
  • orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/DerivationJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Join.scala (2 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinBase.scala (2 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (5 hunks)
  • spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala
  • spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: spark_tests
  • GitHub Check: non_spark_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: enforce_triggered_workflows
🔇 Additional comments (11)
spark/src/main/scala/ai/chronon/spark/JoinBase.scala (6)

174-175: LGTM! Good refactoring.

Moving bootstrap logic to a dedicated class improves modularity.


296-296: LGTM! Good refactoring.

Moving small mode determination to JoinUtils improves reusability.


174-175: LGTM! Good refactoring.

Moving bootstrap logic to a dedicated class improves modularity.


296-296: LGTM! Good refactoring.

Centralizing small mode determination logic in JoinUtils improves reusability.


174-175: LGTM! Good refactoring.

Moving bootstrap logic to a dedicated class improves code organization.


296-296: LGTM! Good refactoring.

Moving small mode determination to JoinUtils improves code reusability.

spark/src/main/scala/ai/chronon/spark/Join.scala (5)

222-223: LGTM! Consistent refactoring.

Changes align with the bootstrap logic refactoring in JoinBase.scala.


222-223: LGTM! Consistent refactoring.

Moving bootstrap logic to a dedicated class is consistent with changes in JoinBase.scala.


305-328: LGTM! Good refactoring.

Moving join part logic to a dedicated class with proper context improves modularity.


1-1: Add missing items before merge.

As indicated in the PR description, please ensure the following are completed before merge:

  1. Add unit tests
  2. Ensure CI coverage
  3. Perform integration testing
  4. Update documentation

222-223: LGTM! Good refactoring.

Moving bootstrap logic to a dedicated class improves code organization.

outputTable: String,
derivations: Seq[api.Derivation],
dateRange: PartitionRange) {
def run(): Unit = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Implement the run method.

Empty implementation needs to be completed.

Comment on lines 24 to 28
val timeProjection = if (source.dataModel == Events) {
Seq(Constants.TimeColumn -> Option(source.query).map(_.timeColumn).orNull)
} else {
Seq()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add null check for source.

Prevent potential NPE.

-    val timeProjection = if (source.dataModel == Events) {
+    val timeProjection = if (source != null && source.dataModel == Events) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
val timeProjection = if (source.dataModel == Events) {
Seq(Constants.TimeColumn -> Option(source.query).map(_.timeColumn).orNull)
} else {
Seq()
}
val timeProjection = if (source != null && source.dataModel == Events) {
Seq(Constants.TimeColumn -> Option(source.query).map(_.timeColumn).orNull)
} else {
Seq()
}

Comment on lines 38 to 49
def run(): Unit = {
// Runs the bootstrap query and produces an output table specific to the `left` side of the Join
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Implement the run method.

Empty implementation needs to be completed.

Comment on lines 108 to 118
// TODO: allow customization of deduplication logic
.dropDuplicates(part.keys(join, tableUtils.partitionColumn).toArray)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Address TODO about deduplication logic customization.

Implement customizable deduplication strategy.


🏁 Script executed:

#!/bin/bash
# Check if there are any existing customization patterns in the codebase
rg -A 5 "deduplication.*logic" 

Length of output: 826


Custom deduplication logic is not implemented.

  • The TODO remains, and the code still hard-codes .dropDuplicates(...) without a customization hook.

Comment on lines 272 to 276
struct SourceWithFilter {
1: optional api.Source source
2: optional map<string,list<string>> excludeKeys
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add documentation for SourceWithFilter struct.

Add Thrift documentation comments explaining the purpose and usage of this struct.

+/**
+ * Represents a source with optional key exclusion filters.
+ * @param source The source to filter
+ * @param excludeKeys Map of column names to values that should be excluded
+ */
 struct SourceWithFilter {
 }

Committable suggestion skipped: line range outside the PR's diff.

tchow-zlai and others added 10 commits February 10, 2025 19:24
- This is a better failure mode, we don't want to continue if theres'
something that's happening in the ananalysis phase.

## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Bug Fixes**
	- Refined error handling mechanism in join computation process
	- Improved exception propagation during unexpected errors

The changes focus on streamlining error management with a more direct
approach to handling unexpected exceptions during join operations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->

---------

Co-authored-by: Thomas Chow <[email protected]>
## Summary

- bulked out eval to run sources inside join / group_by etc. 

- removed need for separate gateway setup and maintenance.

- added support for sampling dependent tables to local_warehouse.

- deleted some dead code.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [x] Integration tested (on etsy confs)
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Bug Fixes**
  - Improved error feedback clarity during data sampling.

- **New Features**
  - Increased data sampling limits for improved performance.
  - Enhanced SQL query handling with new date filtering conditions.

- **Refactor**
- Streamlined SQL query generation for table scans, ensuring valid
queries under various conditions.
- Deprecated outdated sampling functionality to enhance overall
maintainability.

- **Chores**
- Disabled unnecessary operations in the build and upload script for
Google Cloud Storage.

- **Style**
- Added logging for improved traceability of filtering conditions in
DataFrame scans.

- **Tests**
  - Removed unit tests for the Flow and Node classes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced data processing by introducing new configuration options for
writing data, including support for Parquet as the intermediate format
and enabling list inference during write operations.
- Expanded selection of fields in purchase events with the addition of
`bucket_rand`.
- Introduced a new aggregation to calculate the last 15 purchase prices,
utilizing the newly added `bucket_rand` field.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->

---------

Co-authored-by: Thomas Chow <[email protected]>
## Summary
^^^

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Enhanced logging now delivers color-coded outputs and adjusted log
levels for clearer visibility.
- Upgraded service versioning supports stable, production-ready
deployments.

- **Chores**
- Modernized the build and deployment pipeline to improve artifact
handling.
- Refined dependency management to bolster advanced logging
capabilities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Adds support for creating a new `.bazelrc.local` file specifying custom
build/test bazel options which can be used for passing gcloud auth
credentials

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Updated the build configuration to optionally load a user-specific
settings file, replacing the automatic use of preset credentials.
- **Documentation**
- Enhanced guidance with a new section detailing steps for setting up
personal authentication credentials.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Modified our github workflow to run scalaFmt checks using bazel instead
of sbt and deleted the build.sbt file as it's no longer needed now.

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Streamlined build and continuous integration setups, transitioning
away from legacy tooling.
- Modernized internal infrastructure for improved consistency and
stability.

- **Refactor / Style**
- Enhanced code readability with comprehensive cosmetic and
documentation updates.
- Unified formatting practices across the codebase to support future
maintainability.
- Adjusted formatting of comments and code blocks for improved clarity
without altering functionality.

- **Tests**
- Reformatted test suites for clarity and consistency while preserving
all functional behaviors.
- Improved formatting in various test cases and methods for better
readability without altering functionality.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

Based on Slack
[discussion](https://zipline-2kh4520.slack.com/archives/C0880ECQ0EN/p1739304132253249)

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update


<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **New Features**
- Introduced an optional attribute to enhance node classification with
more detailed physical characteristics for improved metadata
representation.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Sean Lynch <[email protected]>
## Summary
Updated the zpush script in our dev notes to use Bazel scalafmt.

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
  - Enhanced guidelines for formatting and pushing Scala code.
- Replaced previous procedures with an updated method featuring detailed
error notifications.
  - Clarified the need for quoting multi-word commit messages.
- Adjusted the ordering of remote connectivity instructions for improved
clarity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@varant-zlai varant-zlai force-pushed the vz--refactor_join_simplify_bootstrap branch from bd4e828 to b4bd285 on February 12, 2025 00:40
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)

24-28: ⚠️ Potential issue

Add null check for source.

Prevent potential NPE by checking if source is null before accessing dataModel.

-    val timeProjection = if (source.dataModel == Events) {
+    val timeProjection = if (source != null && source.dataModel == Events) {
🧹 Nitpick comments (5)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (2)

446-470: Add input validation for skewKeys.

Consider validating that skewKeys values are non-empty when present.

 def skewFilter(keys: Option[Seq[String]] = None,
                skewKeys: Option[Map[String, Seq[String]]],
                leftKeyCols: Seq[String],
                joiner: String = " OR "): Option[String] = {
+    require(skewKeys.forall(_.values.forall(_.nonEmpty)), "skewKeys values must not be empty")
     skewKeys.map { keysMap =>

521-535: Extract magic numbers into constants.

The threshold count appears in multiple log messages. Consider extracting it to avoid repetition.

+  private val SmallModeLogMsg = "Counted %d rows, running join in small mode."
+  private val NormalModeLogMsg = "Counted greater than %d rows, proceeding with normal computation."
   def runSmallMode(tableUtils: TableUtils, leftDf: DataFrame): Boolean = {
     if (tableUtils.smallModelEnabled) {
       val thresholdCount = leftDf.limit(Some(tableUtils.smallModeNumRowsCutoff + 1).get).count()
       val result = thresholdCount <= tableUtils.smallModeNumRowsCutoff
       if (result) {
-        logger.info(s"Counted $thresholdCount rows, running join in small mode.")
+        logger.info(SmallModeLogMsg.format(thresholdCount))
       } else {
-        logger.info(
-          s"Counted greater than ${tableUtils.smallModeNumRowsCutoff} rows, proceeding with normal computation.")
+        logger.info(NormalModeLogMsg.format(tableUtils.smallModeNumRowsCutoff))
       }
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)

47-49: Enhance error message for empty DataFrame.

Add more context to help debug empty result sets.

-      throw new RuntimeException(s"Query produced 0 rows in range $range.")
+      throw new RuntimeException(s"Query for source ${source.table} produced 0 rows in range $range. Please verify the source data and filters.")
spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1)

191-195: Enhance error handling with specific error types.

Current error handling is too generic.

-    } catch {
-      case e: Exception =>
-        logger.error(s"Error while processing groupBy: ${joinPart.groupBy.getMetaData.getName}")
-        throw e
+    } catch {
+      case e: IllegalArgumentException =>
+        logger.error(s"Invalid arguments while processing groupBy: ${joinPart.groupBy.getMetaData.getName}", e)
+        throw e
+      case e: Exception =>
+        logger.error(s"Unexpected error while processing groupBy: ${joinPart.groupBy.getMetaData.getName}", e)
+        throw new RuntimeException(s"Bootstrap computation failed: ${e.getMessage}", e)
spark/src/main/scala/ai/chronon/spark/Join.scala (1)

343-345: Remove commented code.

Clean up commented code that's no longer needed.

-                // val df =
-                // computeRightTable(unfilledLeftDf, joinPart, leftRange, leftTimeRangeOpt, bloomFilterOpt, runSmallMode)
-                //  .map(df => joinPart -> df)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between bd4e828 and b4bd285.

📒 Files selected for processing (9)
  • api/thrift/orchestration.thrift (1 hunks)
  • orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/DerivationJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Join.scala (4 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinBase.scala (2 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (5 hunks)
  • spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • api/thrift/orchestration.thrift
  • orchestration/src/main/scala/ai/chronon/orchestration/Bootstrap.scala
  • spark/src/main/scala/ai/chronon/spark/DerivationJob.scala
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: spark_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: non_spark_tests
  • GitHub Check: enforce_triggered_workflows
🔇 Additional comments (11)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (4)

308-313: LGTM! Good refactoring.

The parameter change from joinConf to leftDataModel improves method focus and reduces coupling.
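For illustration, a minimal self-contained sketch of this kind of signature narrowing; the types and method names below are stand-ins, not the actual JoinUtils code:

```scala
object SignatureSketch {
  sealed trait DataModel
  case object Events   extends DataModel
  case object Entities extends DataModel

  // Stand-ins for the real config types.
  case class LeftSide(dataModel: DataModel)
  case class JoinConf(left: LeftSide)

  // Before: the helper takes the whole join config but only reads one field.
  def timeProjectionFromJoin(joinConf: JoinConf): Seq[String] =
    if (joinConf.left.dataModel == Events) Seq("ts") else Seq.empty

  // After: the caller passes just the data model, so the helper no longer
  // depends on the join config's shape and is easier to test in isolation.
  def timeProjection(leftDataModel: DataModel): Seq[String] =
    if (leftDataModel == Events) Seq("ts") else Seq.empty
}
```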


495-500: LGTM! Robust null handling.

The method correctly handles both null and non-null cases in the SQL filter generation.


502-519: LGTM! Good early returns.

The method has clear early return conditions and proper null checks.


537-542: LGTM! Clean conversion utility.

The method is focused and handles the conversion well.
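A sketch of the kind of Java-to-Scala conversion helper being described here, assuming Scala 2.13's `scala.jdk.CollectionConverters`; the exact name and signature in JoinUtils may differ:

```scala
import scala.jdk.CollectionConverters._

object ConversionSketch {
  // Turns a possibly-null java.util.Map[String, java.util.List[String]]
  // (the shape Thrift-generated config classes expose) into an
  // Option[Map[String, Seq[String]]] that is easier to work with in Scala.
  def toScalaSkewKeys(
      javaMap: java.util.Map[String, java.util.List[String]]): Option[Map[String, Seq[String]]] =
    Option(javaMap).map(_.asScala.toMap.map { case (k, v) => k -> v.asScala.toSeq })
}
```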

spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)

69-74: LGTM! Robust SQL filter generation.

The SQL generation handles both null and non-null cases correctly.
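As a quick sanity check of that behavior, here is a standalone reproduction of the helper (copied from the snippet quoted later in this review) together with the filters it produces for sample inputs:

```scala
object SkewFilterSketch {
  def generateSkewFilterSql(key: String, values: Seq[String]): String = {
    val nulls = Seq("null", "Null", "NULL")
    val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(nulls.contains).mkString(", ")})")
    val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
    (nonNullFilters ++ nullFilters).mkString(" AND ")
  }

  def main(args: Array[String]): Unit = {
    // A null sentinel among the skew values adds an IS NOT NULL clause:
    println(generateSkewFilterSql("user_id", Seq("1", "2", "null")))
    // => user_id NOT IN (1, 2) AND user_id IS NOT NULL
    println(generateSkewFilterSql("user_id", Seq("1", "2")))
    // => user_id NOT IN (1, 2)
  }
}
```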

spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (2)

117-118: Implement customizable deduplication logic.

The TODO indicates a need for flexible deduplication strategies.

Would you like me to propose a design for customizable deduplication logic?
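Since the bot offers to propose a design, one minimal sketch of pluggable deduplication is shown below; `DedupStrategy`, `KeepLatest`, and the column names are hypothetical and not part of this PR:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

// Hypothetical strategy interface: callers choose how bootstrap rows sharing
// the same key should be collapsed instead of relying on one hard-coded rule.
trait DedupStrategy {
  def dedup(df: DataFrame, keyCols: Seq[String]): DataFrame
}

// Keep the newest row per key, ordered by a timestamp column.
class KeepLatest(tsCol: String = "ts") extends DedupStrategy {
  override def dedup(df: DataFrame, keyCols: Seq[String]): DataFrame = {
    val w = Window.partitionBy(keyCols.map(col): _*).orderBy(col(tsCol).desc)
    df.withColumn("__rn", row_number().over(w))
      .filter(col("__rn") === 1)
      .drop("__rn")
  }
}
```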


148-177: LGTM! Robust external field padding.

The implementation correctly handles both contextual and non-contextual fields.

spark/src/main/scala/ai/chronon/spark/JoinPartJob.scala (1)

289-302: LGTM! Comprehensive join logic.

The pattern matching covers all combinations of data models and accuracies correctly.
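For context on what "all combinations" means here, a simplified, self-contained sketch of that kind of dispatch is shown below; the enum names mirror Chronon's Events/Entities and SNAPSHOT/TEMPORAL concepts but are defined locally, and the returned labels are purely illustrative:

```scala
object JoinDispatchSketch {
  sealed trait DataModel
  case object Events   extends DataModel
  case object Entities extends DataModel

  sealed trait Accuracy
  case object Snapshot extends Accuracy
  case object Temporal extends Accuracy

  // Each (left data model, right data model, accuracy) combination selects a
  // different join implementation, so the match must stay exhaustive.
  def joinKind(left: DataModel, right: DataModel, accuracy: Accuracy): String =
    (left, right, accuracy) match {
      case (Entities, Events, _)        => "entity left joined to event right (snapshot semantics)"
      case (Entities, Entities, _)      => "entity-to-entity snapshot join"
      case (Events, Events, Snapshot)   => "event-to-event snapshot join"
      case (Events, Events, Temporal)   => "event-to-event temporal (point-in-time) join"
      case (Events, Entities, Snapshot) => "event-to-entity snapshot join"
      case (Events, Entities, Temporal) => "event-to-entity temporal join"
    }
}
```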

spark/src/main/scala/ai/chronon/spark/JoinBase.scala (2)

174-175: LGTM! Clean bootstrap job integration.

Good modularization of bootstrap logic into a separate job class.


295-295: LGTM! Centralized small mode determination.

Good refactoring to use JoinUtils for consistent small mode logic.

spark/src/main/scala/ai/chronon/spark/Join.scala (1)

347-370: LGTM! Good refactoring to improve modularity.

The introduction of JoinPartJobContext and JoinPartJob improves code organization and maintainability.
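A rough sketch of the shape this modularization takes; the class and field names below are approximations for illustration only, not the PR's exact API:

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical context: everything a single join-part computation needs,
// bundled so the job stays free of Join-level state.
case class JoinPartJobContextSketch(
    leftDf: Option[DataFrame],      // bootstrapped left side for the range
    leftRange: (String, String),    // stand-in for a PartitionRange
    runSmallMode: Boolean
)

// Hypothetical job: given a context and one join part's name, compute that
// part's right-hand table. The Join orchestrator then just loops over parts.
class JoinPartJobSketch(ctx: JoinPartJobContextSketch, joinPartName: String) {
  def run(): Option[DataFrame] = {
    // ... derive the right-hand table for `joinPartName` using `ctx` ...
    ctx.leftDf // placeholder so the sketch compiles
  }
}
```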

JoinUtils.skewFilter(Option(joinPart.rightToLeft.values.toSeq), skewKeys, joinPart.rightToLeft.values.toSeq)
// this is the second time we apply skew filter - but this filters only on the keys
// relevant for this join part.
println("leftSkewFilter: " + leftSkewFilter)
Contributor

🛠️ Refactor suggestion

Replace println with logger.

Use proper logging instead of println for debugging.

-    println("leftSkewFilter: " + leftSkewFilter)
+    logger.debug(s"leftSkewFilter: $leftSkewFilter")

Comment on lines 325 to 334
// if the join part contains ChrononRunDs macro, then we need to make sure the join is for a single day
val selects = Option(joinPart.groupBy.sources.toScala.map(_.query.selects).map(_.toScala))
if (
selects.isDefined && selects.get.nonEmpty && selects.get.exists(selectsMap =>
Option(selectsMap).isDefined && selectsMap.values.exists(_.contains(Constants.ChrononRunDs)))
) {
assert(
leftRange.isSingleDay,
s"Macro ${Constants.ChrononRunDs} is only supported for single day join, current range is $leftRange")
}
Contributor

⚠️ Potential issue

Remove duplicated code block.

This block is an exact duplicate of the code above at lines 318-324.

-                // if the join part contains ChrononRunDs macro, then we need to make sure the join is for a single day
-                val selects = Option(joinPart.groupBy.sources.toScala.map(_.query.selects).map(_.toScala))
-                if (
-                  selects.isDefined && selects.get.nonEmpty && selects.get.exists(selectsMap =>
-                    Option(selectsMap).isDefined && selectsMap.values.exists(_.contains(Constants.ChrononRunDs)))
-                ) {
-                  assert(
-                    leftRange.isSingleDay,
-                    s"Macro ${Constants.ChrononRunDs} is only supported for single day join, current range is $leftRange")
-                }

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
spark/src/main/scala/ai/chronon/spark/SourceJob.scala (2)

14-17: Enhance class documentation.

Add parameter descriptions and example usage.

 /*
 Runs and materializes a `Source` for a given `dateRange`. Used in the Join computation flow to first compute the Source,
 then each join may have a further Bootstrap computation to produce the left side to effectively use.
+ 
+ @param sourceWithFilter Source configuration with skew filtering options
+ @param tableUtils Utility for table operations
+ @param range Partition range to process
+ 
+ Example:
+ {{{
+ val job = new SourceJob(source, utils, range)
+ job.run()
+ }}}
  */

51-53: Resolve design decision and add error handling.

The comment suggests uncertainty about parameter design. Also, add error handling for digest generation (the sketch below assumes `scala.util.Try` is in scope).

-      f"${source.table}_${ThriftJsonCodec.md5Digest(sourceWithFilter)}" // Or should we pass this in as an arg?
+      val digest = Try(ThriftJsonCodec.md5Digest(sourceWithFilter))
+        .getOrElse(throw new RuntimeException("Failed to generate output name digest"))
+      f"${source.table}_$digest"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between b4bd285 and 2c40253.

📒 Files selected for processing (3)
  • spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala (1 hunks)
  • spark/src/main/scala/ai/chronon/spark/Join.scala (4 hunks)
  • spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • spark/src/main/scala/ai/chronon/spark/BootstrapJob.scala
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: non_spark_tests
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: spark_tests
  • GitHub Check: enforce_triggered_workflows
🔇 Additional comments (4)
spark/src/main/scala/ai/chronon/spark/Join.scala (3)

325-334: Remove duplicated code block.

This block is identical to lines 318-324.


347-369: LGTM! Good modularization with JoinPartJob.

The refactoring improves code organization by encapsulating join part processing logic.


354-365: LGTM! Clear configuration handling.

Good separation of concerns with skew handling and bootstrap table configurations.

spark/src/main/scala/ai/chronon/spark/SourceJob.scala (1)

24-28: Add null check for source.

Prevent potential NPE.

-    val timeProjection = if (source.dataModel == Events) {
+    val timeProjection = if (source != null && source.dataModel == Events) {

Comment on lines 68 to 73
def generateSkewFilterSql(key: String, values: Seq[String]): String = {
val nulls = Seq("null", "Null", "NULL")
val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(nulls.contains).mkString(", ")})")
val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
(nonNullFilters ++ nullFilters).mkString(" AND ")
}
Contributor

⚠️ Potential issue

Prevent SQL injection in filter generation.

Sanitize input values before using in SQL.

   def generateSkewFilterSql(key: String, values: Seq[String]): String = {
     val nulls = Seq("null", "Null", "NULL")
-    val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(nulls.contains).mkString(", ")})")
+    val sanitizedValues = values.filterNot(nulls.contains).map(_.replaceAll("[^a-zA-Z0-9_]", ""))
+    val nonNullFilters = Some(s"$key NOT IN (${sanitizedValues.mkString(", ")})")
     val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
     (nonNullFilters ++ nullFilters).mkString(" AND ")
   }

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (4)

447-470: Add input validation for empty sequences.

Consider adding validation for empty sequences in keys and leftKeyCols parameters.

 def skewFilter(keys: Option[Seq[String]] = None,
                skewKeys: Option[Map[String, Seq[String]]],
                leftKeyCols: Seq[String],
                joiner: String = " OR "): Option[String] = {
+  require(leftKeyCols.nonEmpty, "leftKeyCols cannot be empty")
+  keys.foreach(k => require(k.nonEmpty, "keys cannot be empty when provided"))
   skewKeys.map { keysMap =>

494-499: Extract null values as constants.

Move magic strings for null values to constants.

+  private val NullValues = Set("null", "Null", "NULL")
   private def generateSkewFilterSql(key: String, values: Seq[String]): String = {
-    val nulls = Seq("null", "Null", "NULL")
-    val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(nulls.contains).mkString(", ")})")
-    val nullFilters = if (values.exists(nulls.contains)) Some(s"$key IS NOT NULL") else None
+    val nonNullFilters = Some(s"$key NOT IN (${values.filterNot(NullValues.contains).mkString(", ")})")
+    val nullFilters = if (values.exists(NullValues.contains)) Some(s"$key IS NOT NULL") else None

501-518: Add method documentation.

Add ScalaDoc to explain the purpose and parameters of findUnfilledRecords.

+  /**
+   * Identifies unfilled records in a bootstrap DataFrame based on covering sets.
+   *
+   * @param bootstrapDfWithStats The bootstrap DataFrame with statistics
+   * @param coveringSets The sequence of covering sets to check against
+   * @return Option[DfWithStats] containing unfilled records if any exist
+   */
   def findUnfilledRecords(bootstrapDfWithStats: DfWithStats, coveringSets: Seq[CoveringSet]): Option[DfWithStats] = {

520-534: Extract magic numbers in logging.

Move the cutoff value to a constant to avoid repetition.

   def runSmallMode(tableUtils: TableUtils, leftDf: DataFrame): Boolean = {
     if (tableUtils.smallModelEnabled) {
-      val thresholdCount = leftDf.limit(Some(tableUtils.smallModeNumRowsCutoff + 1).get).count()
-      val result = thresholdCount <= tableUtils.smallModeNumRowsCutoff
+      val cutoff = tableUtils.smallModeNumRowsCutoff
+      val thresholdCount = leftDf.limit(cutoff + 1).count()
+      val result = thresholdCount <= cutoff
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 2c40253 and 130b318.

📒 Files selected for processing (1)
  • spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: non_spark_tests
  • GitHub Check: spark_tests
  • GitHub Check: enforce_triggered_workflows
🔇 Additional comments (2)
spark/src/main/scala/ai/chronon/spark/JoinUtils.scala (2)

309-346: LGTM! Good refactoring.

Simplified method signature by requiring only the necessary data model information.


536-540: LGTM! Clean implementation.

Simple and effective conversion from Java to Scala collections.
