[*WIP*] Controller - Agent interface #301
## Summary ## Checklist - [x] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a new `Model` function for constructing machine learning model objects with customizable parameters. - Added a `ModelType` class to encapsulate different model types such as XGBoost and PyTorch. - New model configuration `quickstart.test.v1` for defining output schemas and model parameters. - **Configurations** - New JSON configuration files for the `quickstart.test.v1` model, specifying output schema and source details. - **Tests** - Added tests for setting up training datasets and integrating with machine learning models. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: ezvz <[email protected]> Co-authored-by: Chewy Shaw <[email protected]>
…roup across 1 directory" (#35) Reverts #23 This broke our script. When running: `docker-compose -f docker-init/compose.yaml up --build` I get the error: ``` 2.676 ERROR: No matching distribution found for numpy==1.22.0 ------ 2.675 ERROR: Could not find a version that satisfies the requirement numpy==1.22.0 (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0.post2, 1.10.1, 1.10.2, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 1.13.3, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5, 1.19.0, 1.19.1, 1.19.2, 1.19.3, 1.19.4, 1.19.5, 1.20.0, 1.20.1, 1.20.2, 1.20.3, 1.21.0, 1.21.1, 1.21.2, 1.21.3, 1.21.4, 1.21.5, 1.21.6) 2.676 ERROR: No matching distribution found for numpy==1.22.0 ------ failed to solve: process "/bin/sh -c pip3 install --upgrade pip; pip3 install -r requirements.txt" did not complete successfully: exit code: 1 ``` <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Chores** - Updated the version of the `numpy` package to improve compatibility and performance. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary We now inject environment variables from the docker container instead of hard coding them with the build. I also updated the CORS config to allow requests from the local dev port (5173) and the docker node js server port (3000) ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **New Features** - Added CORS configuration to enhance API accessibility. - Introduced new environment variables `SERVER_API_URL` and `API_BASE_URL` for frontend service connectivity. - **Bug Fixes** - Updated the base API URL logic for better environment handling. - **Chores** - Removed outdated environment files and scripts related to Docker builds. - Updated `.gitignore` for improved file management in frontend development. - Modified the frontend build command for consistency. - **Documentation** - Adjusted the healthcheck configuration for the frontend service in Docker. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [x] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
- Updated the Docker environment to support Python 3.8 for enhanced compatibility.
- **Bug Fixes**
- Upgraded NumPy from version 1.21.6 to 1.22.0 to address security concerns.
- **Chores**
- Modified the startup script to ensure it uses Python 3.8 for executing relevant scripts.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: Chewy Shaw <[email protected]>
## Summary A model to use in the POC. It uses the join as a source. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a new risk transaction model for enhanced data analysis and risk assessment. - Added structured JSON schema for the risk transaction model, including metadata and join configurations. - **Documentation** - Updated documentation to reflect the new model and its functionalities. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Chewy Shaw <[email protected]>
## Summary First frontend milestone. I'll put all my work so far into this PR. Some of the features: - List out all of the models on the Models page - Wireframes for coming soon stuff - Basic search functionality - Left nav and breadcrumb nav - Test observability page with interactive heatmap ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced several new components including `BreadcrumbNav`, `NavigationBar`, `ComingSoonPage`, and various command components for enhanced user interaction. - Added a breadcrumb navigation system to improve site navigation. - Implemented a search functionality to retrieve models and time series data. - Added a new dialog interface with customizable components for improved user experience. - **Enhancements** - Updated model data representation with additional columns in the models table. - Improved dialog components for better user experience. - **Bug Fixes** - Updated API calls in tests and components for better clarity and functionality. - **Documentation** - Enhanced component documentation for better understanding and usage. - **Chores** - Updated dependencies for improved performance and stability. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Piyush Narang <[email protected]>
## Summary Downgrade Play to 2.x. The version of Play we had required Scala 2.13. This ends up being painful to work with when we start introducing dependencies from the 'hub' play module to other modules like api / online (which we need for the Dynamodb hookup). Sbt isn't very happy when we're trying to hook up these builds across scala versions when hub is defaulting to 2.13 and the rest of the project on 2.12. To not bang at this further, I'm downgrading to a play version on scala 2.12 and we can revisit when we're ready to bump to Scala 2.13. ## Checklist - [ ] Added Unit Tests - [X] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced mock data generation for time series, allowing for a broader range of generated values. - Updated Java options in the Docker initialization script to prevent execution errors. - **Bug Fixes** - Improved error handling in test cases by correctly accessing the successful results from decoded responses. - **Chores** - Updated project dependencies and plugins to maintain compatibility and improve build configuration, including the Play Framework plugin and several SBT plugins. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
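For reference, the shape of this kind of downgrade in sbt looks roughly like the sketch below; the version numbers and module wiring are illustrative assumptions, not necessarily what this PR pins.
```scala
// project/plugins.sbt — a Play 2.x sbt plugin line that still supports Scala 2.12 (version illustrative)
addSbtPlugin("com.typesafe.play" % "sbt-plugin" % "2.8.20")

// build.sbt — keep every module, including hub, on the same Scala version
ThisBuild / scalaVersion := "2.12.18"

lazy val api    = project in file("api")
lazy val online = project in file("online")

lazy val hub = (project in file("hub"))
  .enablePlugins(PlayScala)
  .dependsOn(api, online) // cross-module deps now resolve against a single Scala version
```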
Bumps the npm_and_yarn group with 2 updates in the /frontend directory: [cookie](https://github.com/jshttp/cookie) and [@sveltejs/kit](https://github.com/sveltejs/kit/tree/HEAD/packages/kit). Updates `cookie` from 0.6.0 to 0.7.2 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/jshttp/cookie/releases">cookie's releases</a>.</em></p> <blockquote> <h2>v0.7.2</h2> <p><strong>Fixed</strong></p> <ul> <li>Fix object assignment of <code>hasOwnProperty</code> (<a href="https://github.com/jshttp/cookie/issues/177">#177</a>) bc38ffd</li> </ul> <p><a href="https://github.com/jshttp/cookie/compare/v0.7.1...v0.7.2">https://github.com/jshttp/cookie/compare/v0.7.1...v0.7.2</a></p> <h2>0.7.1</h2> <p><strong>Fixed</strong></p> <ul> <li>Allow leading dot for domain (<a href="https://github.com/jshttp/cookie/issues/174">#174</a>) <ul> <li>Although not permitted in the spec, some users expect this to work and user agents ignore the leading dot according to spec</li> </ul> </li> <li>Add fast path for <code>serialize</code> without options, use <code>obj.hasOwnProperty</code> when parsing (<a href="https://github.com/jshttp/cookie/issues/172">#172</a>)</li> </ul> <p><a href="https://github.com/jshttp/cookie/compare/v0.7.0...v0.7.1">https://github.com/jshttp/cookie/compare/v0.7.0...v0.7.1</a></p> <h2>0.7.0</h2> <ul> <li>perf: parse cookies ~10% faster (<a href="https://github.com/jshttp/cookie/issues/144">#144</a> by <a href="https://github.com/kurtextrem"><code>@kurtextrem</code></a> and <a href="https://github.com/jshttp/cookie/issues/170">#170</a>)</li> <li>fix: narrow the validation of cookies to match RFC6265 (<a href="https://github.com/jshttp/cookie/issues/167">#167</a> by <a href="https://github.com/bewinsnw"><code>@bewinsnw</code></a>)</li> <li>fix: add <code>main</code> to <code>package.json</code> for rspack (<a href="https://github.com/jshttp/cookie/issues/166">#166</a> by <a href="https://github.com/proudparrot2"><code>@proudparrot2</code></a>)</li> </ul> <p><a href="https://github.com/jshttp/cookie/compare/v0.6.0...v0.7.0">https://github.com/jshttp/cookie/compare/v0.6.0...v0.7.0</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/jshttp/cookie/commit/d19eaa1a2bb9ca43ac0951edd852ba4e88e410e0"><code>d19eaa1</code></a> 0.7.2</li> <li><a href="https://github.com/jshttp/cookie/commit/bc38ffd0eae716b199236dda061d0bdc74192dd3"><code>bc38ffd</code></a> Fix object assignment of <code>hasOwnProperty</code> (<a href="https://github.com/jshttp/cookie/issues/177">#177</a>)</li> <li><a href="https://github.com/jshttp/cookie/commit/cf4658f492c5bd96aeaf5693c3500f8495031014"><code>cf4658f</code></a> 0.7.1</li> <li><a href="https://github.com/jshttp/cookie/commit/6a8b8f5a49af7897b98ebfb29a1c4955afa3d33e"><code>6a8b8f5</code></a> Allow leading dot for domain (<a 
href="https://github.com/jshttp/cookie/issues/174">#174</a>)</li> <li><a href="https://github.com/jshttp/cookie/commit/58015c0b93de0b63db245cfdc5a108e511a81ad0"><code>58015c0</code></a> Remove more code and perf wins (<a href="https://github.com/jshttp/cookie/issues/172">#172</a>)</li> <li><a href="https://github.com/jshttp/cookie/commit/ab057d6c06b94a7b1e3358e69a685ae49c97b627"><code>ab057d6</code></a> 0.7.0</li> <li><a href="https://github.com/jshttp/cookie/commit/5f02ca87688481dbcf155e49ca8b61732f30e542"><code>5f02ca8</code></a> Migrate history to GitHub releases</li> <li><a href="https://github.com/jshttp/cookie/commit/a5d591ce8447dd63821779724f96ad3c774c8579"><code>a5d591c</code></a> Migrate history to GitHub releases</li> <li><a href="https://github.com/jshttp/cookie/commit/51968f94b5e820adeceef505539fa193ffe2d105"><code>51968f9</code></a> Skip isNaN</li> <li><a href="https://github.com/jshttp/cookie/commit/9e7ca51ade4b325307eedd6b4dec190983e9e2cc"><code>9e7ca51</code></a> perf(parse): cache length, return early (<a href="https://github.com/jshttp/cookie/issues/144">#144</a>)</li> <li>Additional commits viewable in <a href="https://github.com/jshttp/cookie/compare/v0.6.0...v0.7.2">compare view</a></li> </ul> </details> <details> <summary>Maintainer changes</summary> <p>This version was pushed to npm by <a href="https://www.npmjs.com/~blakeembrey">blakeembrey</a>, a new releaser for cookie since your current version.</p> </details> <br /> Updates `@sveltejs/kit` from 2.5.28 to 2.6.2 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/sveltejs/kit/releases"><code>@sveltejs/kit</code>'s releases</a>.</em></p> <blockquote> <h2><code>@sveltejs/kit</code><a href="https://github.com/2"><code>@2</code></a>.6.2</h2> <h3>Patch Changes</h3> <ul> <li>chore(deps): update dependency cookie to ^0.7.0 (<a href="https://github.com/sveltejs/kit/pull/12746">#12746</a>)</li> </ul> <h2><code>@sveltejs/kit</code><a href="https://github.com/2"><code>@2</code></a>.6.1</h2> <h3>Patch Changes</h3> <ul> <li>fix: better error message when calling push/replaceState before router is initialized (<a href="https://github.com/sveltejs/kit/pull/11968">#11968</a>)</li> </ul> <h2><code>@sveltejs/kit</code><a href="https://github.com/2"><code>@2</code></a>.6.0</h2> <h3>Minor Changes</h3> <ul> <li>feat: support typed arrays in <code>load</code> functions (<a href="https://github.com/sveltejs/kit/pull/12716">#12716</a>)</li> </ul> <h3>Patch Changes</h3> <ul> <li>fix: open a new tab for <code><form target="_blank"></code> and `<!-- raw HTML omitted --> submissions (<a href="https://github.com/sveltejs/kit/pull/11936">#11936</a>)</li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/sveltejs/kit/blob/main/packages/kit/CHANGELOG.md"><code>@sveltejs/kit</code>'s changelog</a>.</em></p> <blockquote> <h2>2.6.2</h2> <h3>Patch Changes</h3> <ul> <li>chore(deps): update dependency cookie to ^0.7.0 (<a 
href="https://github.com/sveltejs/kit/pull/12746">#12746</a>)</li> </ul> <h2>2.6.1</h2> <h3>Patch Changes</h3> <ul> <li>fix: better error message when calling push/replaceState before router is initialized (<a href="https://github.com/sveltejs/kit/pull/11968">#11968</a>)</li> </ul> <h2>2.6.0</h2> <h3>Minor Changes</h3> <ul> <li>feat: support typed arrays in <code>load</code> functions (<a href="https://github.com/sveltejs/kit/pull/12716">#12716</a>)</li> </ul> <h3>Patch Changes</h3> <ul> <li>fix: open a new tab for <code><form target="_blank"></code> and `<!-- raw HTML omitted --> submissions (<a href="https://github.com/sveltejs/kit/pull/11936">#11936</a>)</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/sveltejs/kit/commit/ec84888f487ed131d999687c2a796890cc0ee9b6"><code>ec84888</code></a> Version Packages (<a href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12751">#12751</a>)</li> <li><a href="https://github.com/sveltejs/kit/commit/ef38c385f4f78a524781f80b1f16da1a35aa94e2"><code>ef38c38</code></a> chore(deps): update dependency cookie to ^0.7.0 (<a href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12746">#12746</a>)</li> <li><a href="https://github.com/sveltejs/kit/commit/06936ae37e697f41e8e30a764893bd1af1cb609d"><code>06936ae</code></a> Version Packages (<a href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12725">#12725</a>)</li> <li><a href="https://github.com/sveltejs/kit/commit/e9ed77259749428b64569f67da550a36057820d8"><code>e9ed772</code></a> fix: better error message when calling push/replaceState before router is ini...</li> <li><a href="https://github.com/sveltejs/kit/commit/b74d79691bb85a25cb560fb1d15edc632c7ac0b4"><code>b74d796</code></a> Version Packages (<a href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12707">#12707</a>)</li> <li><a href="https://github.com/sveltejs/kit/commit/5b40b04608023a3fcb6c601a5f2d36485ce07196"><code>5b40b04</code></a> feat: support typed arrays in <code>load</code> functions (<a href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12716">#12716</a>)</li> <li><a href="https://github.com/sveltejs/kit/commit/a233f53f28fcb4c3ea63d2faf156fba09a18456c"><code>a233f53</code></a> fix: allow native form submission for <code>\<form target="_blank"></code> and `<button f...</li> <li><a href="https://github.com/sveltejs/kit/commit/6ea7abbc2f66e46cb83ff95cd459a5f548cb7e1e"><code>6ea7abb</code></a> chore: bump eslint-config (<a href="https://github.com/sveltejs/kit/tree/HEAD/packages/kit/issues/12682">#12682</a>)</li> <li>See full diff in <a href="https://github.com/sveltejs/kit/commits/@sveltejs/[email protected]/packages/kit">compare view</a></li> </ul> </details> <br /> Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
[//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself) - `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself) - `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself) - `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency - `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/zipline-ai/chronon/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Summary
Add a DynamoDB KV store implementation that we can leverage for our Stripe monitoring PoC. This PR only includes the KV store implementation + tests. Wiring this up to our Play service to create the relevant tables + start storing the actual model metadata + time series data will be part of a follow-up.

Our existing KV store API covers some aspects of what we need on the monitoring front but has a few limitations:
* No list API - we need this to enumerate all the datasets in tables like the Model table. (We can squeeze this into the get call but it feels a bit contrived as we need to use an arbitrary fixed partition key (e.g. "1") and then issue a range query to get the entries. Leveraging Dynamo's scan with a separate 'list' KV store API felt cleaner there.)
* Create doesn't take params - we do need to provide params to our Dynamo create call where we say if the table is 'sorted' or not, what the read / write capacity units are, etc. I've extended the create API to include these params.

Open to feedback here to improve / tweak this. We can consider the KV store API to be a bit in flux, and once we integrate a couple more cloud provider KV stores (Bigtable, Cosmos DB) we'll get to something more generic and settled.

## Checklist
- [X] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
## Release Notes
- **New Features**
- Introduced a new key-value store interface using Amazon DynamoDB, allowing for data creation, retrieval, and pagination.
- Added support for listing datasets with new request and response types in the KVStore API.
- Enhanced data retrieval capabilities with optional time range parameters.
- Expanded metrics tracking with a new environment type for the key-value store.
- **Bug Fixes**
- Improved error handling during table creation and data operations.
- **Tests**
- Implemented a comprehensive test suite for the new DynamoDB key-value store functionalities, including table creation, data insertion, pagination, and time series data handling.
- **Chores**
- Updated build configuration to integrate AWS support while preserving existing structures.
- Added a new entry to the .gitignore file for generated test files.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
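To make the two extensions above concrete, here is a rough sketch of what a list call and a parameterized create could look like; the type and field names are illustrative assumptions, not the exact signatures landed in this PR.
```scala
import scala.concurrent.Future

// Illustrative shapes only — not the exact API in this PR.
case class ListRequest(dataset: String, props: Map[String, Any] = Map.empty)
case class ListValue(keyBytes: Array[Byte], valueBytes: Array[Byte])
case class ListResponse(request: ListRequest,
                        values: Seq[ListValue],
                        resultProps: Map[String, Any]) // e.g. a continuation key for pagination

trait MonitoringKVStore {
  // create now takes props, e.g. whether the table is range-sorted and its
  // provisioned read / write capacity units
  def create(dataset: String, props: Map[String, Any]): Unit

  // list is backed by a DynamoDB scan rather than a contrived fixed-partition get
  def list(request: ListRequest): Future[ListResponse]
}
```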
## Summary [PR Walkthrough Video](https://www.loom.com/share/8d0339c11f0345a9ae3913035c70b66c) Features added: - Model performance and drift chart - Tabbed feature monitoring section - One chart per groupby (with lines for each feature) - Clicking on a data point in groupby charts opens a sidebar with basic info - Date changer with presets (syncs with url parameter `date-range` - Zoom chart by clicking and dragging on chart area (and sync all charts) Not yet implemented: - UX match to figma (Eugene finalizing designs). Typography, exact design of UI elements, padding, etc. - Proper sidebar UX treatment ## Checklist - [x] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a `CollapsibleSection` component for improved UI organization. - Added a dropdown menu in the navigation bar for enhanced user interaction. - Implemented a customizable scrollbar and scroll area for better content navigation. - Added support for date range selection and chart options in observability pages. - Enhanced charting capabilities with new EChart components and tooltips. - **Bug Fixes** - Resolved issues related to the rendering of dropdown menu items and their states. - **Documentation** - Updated documentation to reflect new components and functionalities. - **Style** - Improved styling for various components including dropdown menus and select inputs. - **Tests** - Added tests for new types and components to ensure functionality and reliability. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Piyush Narang <[email protected]>
## Summary PR overview - [Video Link](https://drive.google.com/file/d/1Rei6upL2OiAls2jX7mCERaJGai3Noo1j/view?usp=drive_link) This PR builds on #33 and #43. We register the relevant model / join / groupby / staging query artifacts at the 'app' docker container startup by using the MetadataUploader and hitting the Dynamo endpoints using the KV store. We also extend the API to stop returning mocked data for the list and search calls and start returning real registered models + enriched responses (so the model object includes details on the Joins, GroupBys and features). There were a few broken pieces along the way that I fixed while working through the integration (e.g. the metadata walker code was missing handling models, the api.thrift enum for model type needed to start at index 0 etc). ## Checklist - [X] Added Unit Tests - [X] Covered by existing CI - [X] Integration tested - [ ] Documentation update Bringing up the container and curling: ``` $ curl http://localhost:9000/api/v1/search?term=1&limit=20 {"offset":0,"items":[{"name":"risk.transaction_model.v1","join":{"name":"risk.user_transactions.txn_join","joinFeatures":[],"groupBys":[{"name":"risk.transaction_events.txn_group_by_user","features":["transaction_amount_count_1h","transaction_amount_count_1d","transaction_amount_count_30d","transaction_amount_count_365d","transaction_amount_sum_1h"]},{"name":"risk.transaction_events.txn_group_by_merchant","features":["transaction_amount_count_1h","transaction_amount_count_1d","transaction_amount_count_30d","transaction_amount_count_365d","transaction_amount_sum_1h"]},{"name":"risk.user_data.user_group_by","features":["account_age","account_balance","credit_score","number_of_devices","country","account_type","preferred_language"]},{"name":"risk.merchant_data.merchant_group_by","features":["account_age","zipcode","is_big_merchant","country","account_type","preferred_language"]}]},"online":false,"production":false,"team":"risk","modelType":"XGBoost"}]} ``` <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Summary by CodeRabbit - **New Features** - Introduced a new `DynamoDBMonitoringStore` for improved model data retrieval. - Enhanced `Model` structure to include joins and groupings for better data organization. - Expanded `parseTeam` method to handle `Model` instances in metadata processing. - Updated `ModelController` and `SearchController` to utilize real data from `DynamoDBMonitoringStore`. - Introduced new methods for managing datasets in the `MetadataStore`. - Improved handling of list values in the DynamoDB key-value store implementation. - Added support for AWS services in the application through new Docker configurations. - Enhanced the application's Docker setup for better build and runtime environments. - Modified the application's metadata loading process to include a new section for DynamoDB. - **Bug Fixes** - Corrected handling of `Either` types in test cases to prevent runtime errors. - **Documentation** - Updated configuration files to support new DynamoDB module. - **Tests** - Added comprehensive unit tests for `DynamoDBKVStoreImpl` and `DynamoDBMonitoringStore`. - Enhanced test coverage for `ModelController` and `SearchController` with mocked dependencies. - Introduced new tests for the functionality of `DynamoDBMonitoringStore`. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Quick follow up to #44 since the Model.ts type changed on the backend ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Updated navigation links to use model names instead of IDs for improved clarity. - Removed the "Last Updated" column from the models table for a cleaner display. - **Bug Fixes** - Adjusted the model type structure to reflect the removal of outdated properties and the addition of a new property. - **Tests** - Updated test cases to align with the revised model structure, removing outdated keys and adding new ones. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Piyush Narang <[email protected]>
... to match the time column elsewhere in the codebase. ## Summary ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Updated the sort key definition for DynamoDB tables to use a dynamic constant, improving flexibility and maintainability. - **Bug Fixes** - Minor adjustment made in the key schema construction to enhance code clarity. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Chewy Shaw <[email protected]>
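A minimal sketch of what keying the sort key off a shared time-column constant can look like with the AWS SDK v2 builders; the constant and attribute names below are assumptions for illustration, not the code in this PR.
```scala
import software.amazon.awssdk.services.dynamodb.model.{KeySchemaElement, KeyType}

object DynamoConstants {
  // hypothetical shared constant matching the time column used elsewhere in the codebase
  val SortKeyColumn: String = "timestamp"
}

object KeySchemaSketch {
  // range (sort) key built from the shared constant instead of a hard-coded literal
  val sortKey: KeySchemaElement = KeySchemaElement.builder()
    .attributeName(DynamoConstants.SortKeyColumn)
    .keyType(KeyType.RANGE)
    .build()
}
```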
## Summary
This PR contains a notebook with code to
1. Generate a schema for a training set
2. Generate dummy data against that schema
3. Inject null rate spike and data drift anomalies
4. Implement cardinality estimation
5. Implement summarization logic based on high vs. low cardinalities
6. Compute drift on summaries (see the sketch after this entry)

This PR also contains some bookkeeping changes
1. Change Discord to Slack as the primary communication channel for the OSS Chronon channel
2. Edit build scripts to remove non-2.12 builds - 2.12 is the only version that is supported across modern Spark and Flink.
3. Pull in unit test fixes

## Checklist
- [x] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
## Release Notes
- **New Features**
- Added functionality for enhanced data drift analysis and statistical distance calculations.
- Implemented methods for generating and handling partition ranges and summarization in Spark.
- Expanded API with new enumerations and structures for drift metrics.
- Introduced new methods for generating synthetic data for fraud detection models.
- Enhanced capabilities for user-defined aggregate functions (UDAFs) in Spark.
- Introduced a new utility class for managing and computing missing partitions in data workflows.
- **Bug Fixes**
- Corrected dependency versions and improved error handling in various components.
- **Documentation**
- Expanded documentation on monitoring systems for ML and AI data, detailing drift computation.
- **Tests**
- Added comprehensive test suites for UDAFs, data summarization, and drift analysis functionalities.
- Enhanced test coverage for various scenarios, including error handling and logging improvements.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Piyush Narang <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: ken-zlai <[email protected]>
Co-authored-by: Chewy Shaw <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: chewy-zlai <[email protected]>
Co-authored-by: ezvz <[email protected]>
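The drift-on-summaries step above boils down to a statistical distance between a baseline and a current binned summary. The notebook itself is in Python, but as a hedged illustration of the idea (names and the epsilon floor are mine, not the PR's code), a PSI computation over matching histogram bins looks like this:
```scala
object DriftSketch {
  // Population Stability Index over two histograms that share the same bins.
  // A small epsilon floor avoids log(0) and division by zero on empty bins.
  def psi(baseline: Array[Double], current: Array[Double], eps: Double = 1e-6): Double = {
    require(baseline.length == current.length, "histograms must share the same bins")
    val (bTotal, cTotal) = (baseline.sum, current.sum)
    baseline.zip(current).map { case (b, c) =>
      val p = math.max(b / bTotal, eps) // baseline bin proportion
      val q = math.max(c / cTotal, eps) // current bin proportion
      (q - p) * math.log(q / p)
    }.sum
  }
}
```
Low-cardinality (categorical) columns can feed their category counts straight into such a distance, while high-cardinality numeric columns would first be summarized into histogram or percentile sketches, which is the high vs. low cardinality split the notebook describes.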
## Summary
The old version is old AF - and won't install anymore.

Shading thrift simply wasn't enough to prevent collisions with the Hive metastore path within Spark. I also had to remove the serving layer and retain purely the serialization layer. There is an argument to make this a Java package. That can be a TODO for later.

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
- Updated Thrift version to 0.21.0, enhancing compatibility and performance.
- Added support for additional development tools and libraries for improved functionality.
- Disabled Spark UI during local session execution for a cleaner testing environment.
- Introduced new utility classes and methods for better data handling and serialization.
- Enhanced error handling and logging in various components for improved robustness.
- **Bug Fixes**
- Improved file generation process with updated Thrift command to handle annotations correctly.
- **Documentation**
- Added comprehensive documentation for partial deserialization in the Thrift framework.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Tweak the mocked feature time series endpoints in hub to:
* Return current vs baseline when querying for the aggregate metrics
view in drift
* Include some example categorical feature datapoints with nullValues as
well
## Checklist
- [X] Added Unit Tests
- [X] Covered by existing CI
- [X] Integration tested
- [ ] Documentation update
Example APIs:
```
http://localhost:9000/api/v1/feature/my_groupby_1.my_feature_1/timeseries/slice/123?startTs=1725926400000&endTs=1726106400000&metricType=drift&metrics=null&offset=10h&algorithm=psi&granularity=percentile
<same as before, existing percentile endpoint response>
```
Categorical feature example:
```
http://localhost:9000/api/v1/feature/my_groupby_1.my_feature_1/timeseries/slice/123?startTs=1725926400000&endTs=1726106400000&metricType=drift&metrics=null&offset=10h&algorithm=psi&granularity=aggregates
{"feature":"my_groupby_1.my_feature_1","baseline":[{"value":487.0,"ts":1725926400000,"label":"A_0","nullValue":null},{"value":935.0,"ts":1725926400000,"label":"A_1","nullValue":null},{"value":676.0,"ts":1725926400000,"label":"A_2","nullValue":null},{"value":124.0,"ts":1725926400000,"label":"A_3","nullValue":null},{"value":792.0,"ts":1725926400000,"label":"A_4","nullValue":null},{"value":0.015927628269880256,"ts":1725926400000,"label":"A_{0 + 5}","nullValue":5}],"current":[{"value":487.0,"ts":1725926400000,"label":"A_0","nullValue":null},{"value":935.0,"ts":1725926400000,"label":"A_1","nullValue":null},{"value":676.0,"ts":1725926400000,"label":"A_2","nullValue":null},{"value":124.0,"ts":1725926400000,"label":"A_3","nullValue":null},{"value":792.0,"ts":1725926400000,"label":"A_4","nullValue":null},{"value":0.015927628269880256,"ts":1725926400000,"label":"A_{0 + 5}","nullValue":5},{"value":0.4864098780914311,"ts":1725926400000,"label":"A_{1 + 5}","nullValue":9}]}
```
Numeric feature example:
```
http://localhost:9000/api/v1/feature/my_groupby_1.my_feature_0/timeseries/slice/123?startTs=1725926400000&endTs=1726106400000&metricType=drift&metrics=null&offset=10h&algorithm=psi&granularity=aggregates
{"feature":"my_groupby_1.my_feature_0","baseline":[{"value":0.7101849056320707,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.574836350385667,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9464192094792073,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.039405954311386604,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4864098780914311,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4457367367074283,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6008140654988429,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.550376169584217,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6580583901495688,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9744965039734514,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6300783865329214,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.848943650191653,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.35625029673016806,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.13619253389673736,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6074814346030327,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9678613587724542,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9577601015152503,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9564654926139553,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8489499734859746,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.09680449026535276,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3709693322851644,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2620576723220621,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7840774822904888,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.009231772349260536,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8087736458178644,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.16559722400679533,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.09340002888404408,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5903294413910882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9471566138938478,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3525301310482255,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5697332656335455,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.987553287784142,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3066192974291233,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.27814078100067885,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.36546629298342326,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3359469223498933,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4133804877548868,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4538907613519919,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2730592802213989,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9723300179905568,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7982397436021529,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.10484486077360433,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9525944212075671,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.545613050676935,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.1732215015315487,"ts":1725926400000,"label":null,"null
Value":null},{"value":0.6137418293715882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5680799380469259,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.45491032084995053,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.18959532980866056,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3323071172856674,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.12330980698834149,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6027854904929627,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.02559357936621165,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.536693424891393,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8984321846068206,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5378800641051089,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6781250784576026,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5314741585722758,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.779982046151748,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6793372014523019,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2400088267353323,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.1654038845730098,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9499846312346308,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4648486801924734,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6327295636483228,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.31219889294401393,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.35383775762868575,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.944816482056446,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.23444257865170992,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.19088035186690788,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6020331949813815,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5570344284768556,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.593932205173588,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7527609831115419,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4773781346476882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5594593615808806,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9534549210159888,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7459459720931372,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.017229793426724038,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8730556437051517,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6168747291493683,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4832572289616397,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.21935776820463992,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8923607413147788,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.29422442690631245,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7857797778700744,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7804000643835558,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.15833336741295634,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.42980373846639464,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.05176758584871599,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9174804836
670649,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6562104511199066,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.17815120932599882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9094355771151895,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2464066057249833,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9005122826617049,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.13070389457118314,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.45220345286338093,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8939689483748245,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9000004655461863,"ts":1725926400000,"label":null,"nullValue":null}],"current":[{"value":0.7101849056320707,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.574836350385667,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9464192094792073,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.039405954311386604,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4864098780914311,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4457367367074283,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6008140654988429,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.550376169584217,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6580583901495688,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9744965039734514,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6300783865329214,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.848943650191653,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.35625029673016806,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.13619253389673736,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6074814346030327,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9678613587724542,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9577601015152503,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9564654926139553,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8489499734859746,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.09680449026535276,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3709693322851644,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2620576723220621,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7840774822904888,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.009231772349260536,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8087736458178644,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.16559722400679533,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.09340002888404408,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5903294413910882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9471566138938478,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3525301310482255,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5697332656335455,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.987553287784142,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3066192974291233,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.27814078100067885,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.36546629298342326,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3359469223498933,"ts":17259264000
00,"label":null,"nullValue":null},{"value":0.4133804877548868,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4538907613519919,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2730592802213989,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9723300179905568,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7982397436021529,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.10484486077360433,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9525944212075671,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.545613050676935,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.1732215015315487,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6137418293715882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5680799380469259,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.45491032084995053,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.18959532980866056,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.3323071172856674,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.12330980698834149,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6027854904929627,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.02559357936621165,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.536693424891393,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8984321846068206,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5378800641051089,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6781250784576026,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5314741585722758,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.779982046151748,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6793372014523019,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2400088267353323,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.1654038845730098,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9499846312346308,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4648486801924734,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6327295636483228,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.31219889294401393,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.35383775762868575,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.944816482056446,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.23444257865170992,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.19088035186690788,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6020331949813815,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5570344284768556,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.593932205173588,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7527609831115419,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.4773781346476882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.5594593615808806,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9534549210159888,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7459459720931372,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.017229793426724038,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8730556437051517,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6168747291493683,"ts":1725926400000,"label":null,"nullValue":null},{"val
ue":0.4832572289616397,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.21935776820463992,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8923607413147788,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.29422442690631245,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7857797778700744,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.7804000643835558,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.15833336741295634,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.42980373846639464,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.05176758584871599,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9174804836670649,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.6562104511199066,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.17815120932599882,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9094355771151895,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.2464066057249833,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9005122826617049,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.13070389457118314,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.45220345286338093,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.8939689483748245,"ts":1725926400000,"label":null,"nullValue":null},{"value":0.9000004655461863,"ts":1725926400000,"label":null,"nullValue":null}]}
```
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Enhanced handling of time series data for continuous and categorical
features.
- Added functionality to generate mock data for categorical features.
- **Bug Fixes**
- Improved accuracy in generating skew metrics using updated data
structures.
- **Documentation**
- Updated test cases for clarity and specificity regarding feature types
and expected responses.
- **Refactor**
- Renamed classes and updated method signatures for consistency and
clarity in handling time series data.
- Added support for representing null values in time series data.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
As of today our Spark tests CI action isn't running the right set of Spark tests. The testOnly option seems to only include and not exclude tests. To get around this, I've set up a [SuiteMixin](https://www.scalatest.org/scaladoc/3.0.6/org/scalatest/SuiteMixin.html) which we can use to run the tests in a suite if there is a tag the sbt tests have been invoked with. Else we skip them all. This allows us to:
* Trigger `sbt test` or `sbt spark/test` and run all the tests barring the ones that include this suite mixin.
* Selectively run these tests using an incantation like: `sbt "spark/testOnly -- -n jointest"`. This allows us to run really long-running tests like the Join / Fetcher / Mutations tests separately in different CI JVMs in parallel to keep our build times short.

There are a couple of other alternative options we can pursue to wire up our tests:
* Trigger all Spark tests at once using "sbt spark/test" (this will probably bring our test runtime to ~1 hour)
* Set up per-test [Tags](https://www.scalatest.org/scaladoc/3.0.6/org/scalatest/Tag.html) - we could do something like either set up individual tags for the JoinTests, MutationTests, FetcherTests OR just create a "Slow" test tag and mark the Join, Mutations and Fetcher tests with it. Seems like this requires the tags to be in Java but it's a viable option.

## Checklist
- [ ] Added Unit Tests
- [X] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update

Verified that our other Spark tests run a bunch now (and now our CI takes ~30-40 mins thanks to that :-) ):
```
[info] All tests passed.
[info] Passed: Total 127, Failed 0, Errors 0, Passed 127
[success] Total time: 2040 s (34:00), completed Oct 30, 2024, 11:27:39 PM
```
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
- **New Features**
- Introduced a new `TaggedFilterSuite` trait for selective test execution based on specified tags.
- Enhanced Spark test execution commands for better manageability.
- **Refactor**
- Transitioned multiple test classes from JUnit to ScalaTest, improving readability and consistency.
- Updated test methods to utilize ScalaTest's syntax and structure.
- **Bug Fixes**
- Improved test logic and assertions in the `FetcherTest`, `JoinTest`, and `MutationsTest` classes to ensure expected behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
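As a sketch of the mechanism described above (the actual `TaggedFilterSuite` in this PR may differ in names and details), a SuiteMixin of this shape runs the suite only when the matching tag is passed via `-n`, and reports success otherwise:
```scala
import org.scalatest.{Args, Status, Suite, SuiteMixin, SucceededStatus}

// Hedged sketch, not the exact trait from this PR.
trait TaggedFilterSuite extends SuiteMixin { this: Suite =>
  // Tag that must be supplied, e.g. `sbt "spark/testOnly -- -n jointest"`.
  def tagName: String

  abstract override def run(testName: Option[String], args: Args): Status = {
    val requested = args.filter.tagsToInclude.getOrElse(Set.empty)
    if (requested.contains(tagName)) super.run(testName, args)
    else SucceededStatus // plain `sbt test` / `sbt spark/test` skips this suite entirely
  }
}
```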
## Summary ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [x] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced functionality to create a "drift_statistics" table in DynamoDB via command-line interface. - Added checks for required JAR files in the startup script to ensure proper initialization. - **Bug Fixes** - Updated user permissions for the DynamoDB service in the Docker configuration to enhance access control. - **Documentation** - Improved clarity in the startup script with new variable definitions for JAR file paths. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Chewy Shaw <[email protected]>
…elinux8 (#58)
### Snyk has created this PR to fix 2 vulnerabilities in the dockerfile dependencies of this project.

Keeping your Docker base image up-to-date means you’ll benefit from security fixes in the latest version of your chosen image.

#### Snyk changed the following file(s):
- `quickstart/Dockerfile`

We recommend upgrading to `openjdk:24-ea-20-jdk-oraclelinux8`, as this image has only **15** known vulnerabilities. To do this, merge this pull request, then verify your application still works as expected.

#### Vulnerabilities that will be fixed with an upgrade:

| Issue | Score |
| :------------------------- | :------------------------- |
| Out-of-bounds Write <br/>[SNYK-DEBIAN11-GLIBC-5927133](https://snyk.io/vuln/SNYK-DEBIAN11-GLIBC-5927133) | **829** |
| Out-of-bounds Write <br/>[SNYK-DEBIAN11-GLIBC-5927133](https://snyk.io/vuln/SNYK-DEBIAN11-GLIBC-5927133) | **829** |
| CVE-2024-37371 <br/>[SNYK-DEBIAN11-KRB5-7411316](https://snyk.io/vuln/SNYK-DEBIAN11-KRB5-7411316) | **714** |
| CVE-2024-37371 <br/>[SNYK-DEBIAN11-KRB5-7411316](https://snyk.io/vuln/SNYK-DEBIAN11-KRB5-7411316) | **714** |
| CVE-2024-37371 <br/>[SNYK-DEBIAN11-KRB5-7411316](https://snyk.io/vuln/SNYK-DEBIAN11-KRB5-7411316) | **714** |

---

> [!IMPORTANT]
>
> - Check the changes in this PR to ensure they won't cause issues with your project.
> - Max score is 1000. Note that the real score may have changed since the PR was raised.
> - This PR was automatically created by Snyk using the credentials of a real user.
> - Snyk has automatically assigned this pull request, [set who gets assigned](https://app.snyk.io/org/varant-zlai/project/df0f9e0a-e8b6-49d2-ac92-89b462d24468?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration).

---

**Note:** _You are seeing this because you or someone else with access to this repository has authorized Snyk to open fix PRs._

For more information:
🧐 [View latest project report](https://app.snyk.io/org/varant-zlai/project/df0f9e0a-e8b6-49d2-ac92-89b462d24468?utm_source=github&utm_medium=referral&page=fix-pr)
👩‍💻 [Set who automatically gets assigned](https://app.snyk.io/org/varant-zlai/project/df0f9e0a-e8b6-49d2-ac92-89b462d24468?utm_source=github&utm_medium=referral&page=fix-pr/settings/integration)
📜 [Customise PR templates](https://docs.snyk.io/scan-using-snyk/pull-requests/snyk-fix-pull-or-merge-requests/customize-pr-templates)
🛠 [Adjust project settings](https://app.snyk.io/org/varant-zlai/project/df0f9e0a-e8b6-49d2-ac92-89b462d24468?utm_source=github&utm_medium=referral&page=fix-pr/settings)
📚 [Read about Snyk's upgrade logic](https://support.snyk.io/hc/en-us/articles/360003891078-Snyk-patches-to-fix-vulnerabilities)

---

**Learn how to fix vulnerabilities with free interactive lessons:**

🦉 [Learn about vulnerability in an interactive lesson of Snyk Learn.](https://learn.snyk.io/?loc=fix-pr)
--------- Co-authored-by: snyk-bot <[email protected]>
## Summary **Issue**: The current `.gitignore` patterns are not effectively ignoring `.DS_Store` files across the project. - `**/.DS_Store/`: The trailing slash treats `.DS_Store` as a directory, so it doesn’t match `.DS_Store` files as expected. - `/frontend/.DS_Store`: This only ignores `.DS_Store` files in the root of the `frontend` directory, not in its subdirectories. **Solution**: Using `**/.DS_Store` will correctly ignore `.DS_Store` files in all directories throughout the project, resolving both issues. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Chores** - Updated the `.gitignore` file to better reflect project structure by adding and removing entries for ignored files and directories. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Lots of changes here. Here is a [PR walkthrough video](https://www.loom.com/share/4f72f1a1a22742e2ad7135ea0aaeec9c) and a summary: - Custom chart zoom functionality (click and drag) - Sticky header w/ controls shows when you scroll past the model drift chart section - Use geist font throughout app (and monospace version for numbers) - Charts should fill full width on resize of window - Start to match UI/typography - Sidebar w/ charts using local sample data (percentiles, now vs baseline) - Date range selector w/ calendar picker ## Checklist - [x] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **New Features** - Introduced a customizable date range selection component. - Added new components: `DriftSkewToggle`, `DateRangeSelector`, `PageHeader`, and various calendar-related components. - Enhanced navigation with icons for improved user experience. - Implemented a reset zoom button for chart interactions. - **Improvements** - Updated typography with new font options and sizes. - Enhanced chart rendering logic for better data visualization. - Improved layout and structure for better usability in observability pages. - **Bug Fixes** - Adjusted padding and layout for table and separator components. - Fixed issues with date range handling in observability features. - **Documentation** - Updated internal documentation to reflect new components and features. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Piyush Narang <[email protected]>
## Summary Creates a Summary Uploader which uploads summary data to a KVStore. ## Checklist - [x] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced `SummaryUploader` for uploading summary statistics to a key-value store. - Added `MockKVStore` for testing key-value store operations. - Implemented `KVStoreSemaphore` for managing concurrent access to resources. - **Enhancements** - Increased data volume in tests to improve testing scenarios. - Integrated `SummaryUploader` in `DriftTest` for uploading summary data during tests. - Enhanced control over concurrent reads and writes to DynamoDB with updated `DynamoDBKVStoreImpl`. - Refined error handling and flow in `multiPut` operations for better robustness. - Updated Spark dependency from `3.5.0` to `3.5.1` for improved stability. - Added a new constant `DriftStatsTable` for drift statistics. - **Bug Fixes** - Improved error handling for upload failures in `SummaryUploader`. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: nikhil-zlai <[email protected]> Co-authored-by: Chewy Shaw <[email protected]> Co-authored-by: Piyush Narang <[email protected]> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: ken-zlai <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: ezvz <[email protected]>
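For illustration, a minimal Scala sketch of the kind of semaphore-bounded KV writes described above. The class and method names are hypothetical; the actual `KVStoreSemaphore` and `multiPut` wiring in this PR may look quite different.

```scala
import java.util.concurrent.Semaphore
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical sketch: cap the number of in-flight writes to the KV store.
class BoundedKvWriter(maxInFlight: Int)(implicit ec: ExecutionContext) {
  private val permits = new Semaphore(maxInFlight)

  def write(put: => Future[Boolean]): Future[Boolean] = {
    permits.acquire() // blocks once maxInFlight writes are outstanding
    val result = put
    result.onComplete(_ => permits.release())
    result
  }
}
```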
## Summary This completes the path of generating data, summarizing it, and uploading it to DynamoDB. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [x] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Summary by CodeRabbit - **New Features** - Enhanced data generation with updated schema for fraud data, including a new date string field. - Added functionality to load summary data into DynamoDB with improved error handling. - Introduced summarization and upload capabilities for data processing. - Updated Docker configuration to include new environment variables for Spark and AWS integration. - **Bug Fixes** - Improved error handling during summary data uploads to ensure better reliability. - **Refactor** - Updated variable handling in scripts for better clarity and maintainability. - Refactored test cases to utilize a mock API for uploading summary statistics. - **Documentation** - Comments and documentation have been updated to reflect changes in data structures and functionalities. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: nikhil-zlai <[email protected]> Co-authored-by: Chewy Shaw <[email protected]> Co-authored-by: Piyush Narang <[email protected]> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: ken-zlai <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: ezvz <[email protected]>
…nd_yarn group across 1 directory (#66) Bumps the npm_and_yarn group with 1 update in the /frontend directory: [@eslint/plugin-kit](https://github.com/eslint/rewrite). Updates `@eslint/plugin-kit` from 0.2.0 to 0.2.3 <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/eslint/rewrite/releases"><code>@eslint/plugin-kit</code>'s releases</a>.</em></p> <blockquote> <h2>plugin-kit: v0.2.3</h2> <h2><a href="https://github.com/eslint/rewrite/compare/plugin-kit-v0.2.2...plugin-kit-v0.2.3">0.2.3</a> (2024-11-14)</h2> <h3>Dependencies</h3> <ul> <li>The following workspace dependencies were updated <ul> <li>devDependencies <ul> <li><code>@eslint/core</code> bumped from ^0.8.0 to ^0.9.0</li> </ul> </li> </ul> </li> </ul> <h2>plugin-kit: v0.2.2</h2> <h2><a href="https://github.com/eslint/rewrite/compare/plugin-kit-v0.2.1...plugin-kit-v0.2.2">0.2.2</a> (2024-10-25)</h2> <h3>Dependencies</h3> <ul> <li>The following workspace dependencies were updated <ul> <li>devDependencies <ul> <li><code>@eslint/core</code> bumped from ^0.7.0 to ^0.8.0</li> </ul> </li> </ul> </li> </ul> <h2>plugin-kit: v0.2.1</h2> <h2><a href="https://github.com/eslint/rewrite/compare/plugin-kit-v0.2.0...plugin-kit-v0.2.1">0.2.1</a> (2024-10-18)</h2> <h3>Dependencies</h3> <ul> <li>The following workspace dependencies were updated <ul> <li>devDependencies <ul> <li><code>@eslint/core</code> bumped from ^0.6.0 to ^0.7.0</li> </ul> </li> </ul> </li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/eslint/rewrite/commit/a957ee351c27ac1bf22966768cf8aac8c12ce0d2"><code>a957ee3</code></a> chore: release main (<a href="https://github.com/eslint/rewrite/issues/130">#130</a>)</li> <li><a href="https://github.com/eslint/rewrite/commit/3591a7805a060cb130d40d61f200431b782431d8"><code>3591a78</code></a> feat: Add Language#normalizeLanguageOptions() (<a href="https://github.com/eslint/rewrite/issues/131">#131</a>)</li> <li><a href="https://github.com/eslint/rewrite/commit/2fa68b7150561c48821206272ba7d0440e7e7f15"><code>2fa68b7</code></a> chore: fix formatting error (<a href="https://github.com/eslint/rewrite/issues/133">#133</a>)</li> <li><a href="https://github.com/eslint/rewrite/commit/071be842f0bd58de4863cdf2ab86d60f49912abf"><code>071be84</code></a> Merge commit from fork</li> <li><a href="https://github.com/eslint/rewrite/commit/e73b1dc40fef68819969fdbe9060a47dcc4cae1b"><code>e73b1dc</code></a> docs: Update README sponsors</li> <li><a href="https://github.com/eslint/rewrite/commit/d0b2e705c49709cfb92a9110c65cd628c91aaa29"><code>d0b2e70</code></a> fix: non-optional properties in generic interfaces (<a href="https://github.com/eslint/rewrite/issues/132">#132</a>)</li> <li><a href="https://github.com/eslint/rewrite/commit/3a87bbb7f0b501c74507f32083c289304d6c03a6"><code>3a87bbb</code></a> fix: Support legacy <code>schema</code> properties (<a href="https://github.com/eslint/rewrite/issues/128">#128</a>)</li> <li><a 
href="https://github.com/eslint/rewrite/commit/c24083b7ef46958114c19cac669108fa3bd1646e"><code>c24083b</code></a> docs: Update README sponsors</li> <li><a href="https://github.com/eslint/rewrite/commit/0dc78d335a98ef680b579851026438473147750e"><code>0dc78d3</code></a> chore: release main (<a href="https://github.com/eslint/rewrite/issues/125">#125</a>)</li> <li><a href="https://github.com/eslint/rewrite/commit/ffa176f0c80c14c8ba088d2ba359af4b2805c4f5"><code>ffa176f</code></a> feat: Add rule types (<a href="https://github.com/eslint/rewrite/issues/110">#110</a>)</li> <li>Additional commits viewable in <a href="https://github.com/eslint/rewrite/compare/core-v0.2.0...plugin-kit-v0.2.3">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself) - `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself) - `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself) - `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency - `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/zipline-ai/chronon/network/alerts). </details> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Summary ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a method for converting objects to a pretty-printed JSON string format. - Added functionality for calculating drift metrics between `TileSummary` instances. - Enhanced drift analysis capabilities with new metrics and structures. - New endpoints for model prediction and model drift in the API. - Introduced utility functions for transforming and aggregating data related to `TileSummary` and `TileDrift`. - Enhanced metadata handling with new constants and improved dataset references. - Added a method for processing percentiles and breakpoints to generate interval assignments. - **Bug Fixes** - Improved error handling in various methods for better clarity and logging. - **Refactor** - Renamed variables and methods for clarity and consistency. - Updated method signatures to accommodate new features and improve usability. - Consolidated import statements for better organization. - Removed deprecated objects and methods to streamline functionality. - **Tests** - Added comprehensive unit tests for drift metrics and pivot functionality. - Enhanced test coverage for new and modified features. - Removed outdated tests and added new tests for handling key mappings in joins. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
… elastic search. (#65) ## Summary Adds a temporal service with temporal admin tools, a temporal ui, and elastic search to the Docker PoC setup. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [x] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced new services: MySQL, Temporal, Temporal Admin Tools, and Temporal UI. - Added a new network, Temporal Network, to enhance service communication. - **Changes** - Adjusted port mapping for the Spark service to improve accessibility. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary - https://app.asana.com/0/1208785567265389/1208812512114700 - This PR addresses some flaky unit test behavior that we've been observing in the zipline fork. See: https://zipline-2kh4520.slack.com/archives/C072LUA50KA/p1732043073171339?thread_ts=1732042778.209419&cid=C072LUA50KA - A previous [CI test](https://github.com/zipline-ai/chronon/actions/runs/11946764068/job/33301642119?pr=72 ) shows that `other_spark_tests` intermittently fails due to a couple reasons. This PR addresses the flakiness of [FeatureWithLabelJoinTest .testFinalViewsWithAggLabel]( https://github.com/zipline-ai/chronon/blob/6cb6273551e024d6eecb068f754b510ae0aac464/spark/src/test/scala/ai/chronon/spark/test/FeatureWithLabelJoinTest.scala#L118), where sometimes the test assertion fails with an unexpected result value. ### Synopsis Looks like during a rewrite/refactoring of the code, we did not preserve the functionality. The diff starts to happen at the time of computing label joins per partition range, in particular when we materialize the label join and [scan it back](https://github.com/zipline-ai/chronon/blob/b64f44d57c90367ccfcb5d5c96327a1ef820e2b3/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L200). In the OSS version, the [scan](https://github.com/airbnb/chronon/blob/6968c5c29b6e48867f8c08f2b9b8281f09d47c16/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L192-L193) applies a [partition filter](https://github.com/airbnb/chronon/blob/6968c5c29b6e48867f8c08f2b9b8281f09d47c16/spark/src/main/scala/ai/chronon/spark/DataRange.scala#L102-L104). We dropped these partition filters during the [refactoring](c6a377c#diff-57b1d6132977475fa0e87a71f017e66f4a7c94f466f911b33e9178598c6c058dL97-R102) on Zipline side. As such, the physical plans produced by these two scans are different: ``` // Zipline == Physical Plan == *(1) ColumnarToRow +- FileScan parquet spark_catalog.final_join.label_agg_table_listing_labels_agg[listing#53934L,is_active_max_5d#53935,label_ds#53936] Batched: true, DataFilters: [], Format: Parquet, Location: CatalogFileIndex(1 paths)[file:/tmp/chronon/spark-warehouse_6fcd3d/data/final_join.db/label_agg_t..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<listing:bigint,is_active_max_5d:int> ``` ``` // OSS == Physical Plan == Coalesce 1000 +- *(1) ColumnarToRow +- FileScan parquet final_join_xggqlu.label_agg_table_listing_labels_agg[listing#50981L,is_active_max_5d#50982,label_ds#50983] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/tmp/chronon/spark-warehouse_69002f/data/final_join_xggqlu.db/label_agg_ta..., PartitionFilters: [isnotnull(label_ds#50983), (label_ds#50983 >= 2022-10-07), (label_ds#50983 <= 2022-10-07)], PushedFilters: [], ReadSchema: struct<listing:bigint,is_active_max_5d:int> ``` Note that OSS has a non-empty partition filter: `PartitionFilters: [isnotnull(label_ds#50983), (label_ds#50983 >= 2022-10-07), (label_ds#50983 <= 2022-10-07)]` where Zipline does not. The fix is to add these partition filters back, as done in this PR. ~### Abandoned Investigation~ ~It looks like there is some non-determinism computing one of the intermittent dataframes when computing label joins. 
[`dropDuplicates`](https://github.com/zipline-ai/chronon/blob/6cb6273551e024d6eecb068f754b510ae0aac464/spark/src/main/scala/ai/chronon/spark/LabelJoin.scala#L215) seems to be operating on a row compound key `rowIdentifier`, which doesn't produce deterministic results. As such we sometimes lose the expected values. This [change](https://github.com/airbnb/chronon/pull/380/files#diff-2c74cac973e1af38b615f654fee5b0261594a2b0005ecfd5a8f0941b8e348eedR156) was introduced in OSS upstream almost 2 years ago. This [test](airbnb/chronon#435) was contributed a couple months after .~ ~See debugger local values comparison. The left side is test failure, and right side is test success.~ ~<img width="1074" alt="Screenshot 2024-11-21 at 9 26 04 AM" src="https://github.com/user-attachments/assets/0eba555c-43ab-48a6-bf61-bbb7b4fa2445">~ ~Removing the `dropDuplicates` call will allow the tests to pass. However, unclear if this will produce the semantically correct behavior, as the tests themselves seem~ ## Checklist - [x] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Reintroduced a testing method to validate label joins, ensuring accuracy in data processing. - **Improvements** - Enhanced data retrieval logic for label joins, emphasizing unique entries and clearer range specifications. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
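For illustration, a minimal Scala sketch of the restored partition filter, assuming a plain Spark `DataFrame` scan and the `label_ds` partition column shown in the physical plans above; the actual Chronon scan helpers are structured differently.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Sketch: re-apply the partition range filter when scanning the materialized
// label-join table so Spark can prune partitions, as in the OSS physical plan.
def scanLabelTable(spark: SparkSession, table: String, startDs: String, endDs: String): DataFrame =
  spark.table(table)
    .where(col("label_ds").isNotNull)
    .where(col("label_ds") >= startDs && col("label_ds") <= endDs)
```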
## Summary - Load local resources irrespective of where the tests are currently being run from. This allows us to run them from IntelliJ. ## Checklist - [x] Added Unit Tests - [x] Covered by existing CI - [x] Integration tested - [x] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved test robustness by replacing hardcoded file paths with dynamic resource URI retrieval for loading test data. - **Tests** - Enhanced flexibility in test cases for locating resources, ensuring consistent access regardless of the working directory. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
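A minimal Scala sketch of the resource-URI approach described above, using only standard classloader APIs; the actual test helper in the PR may be shaped differently.

```scala
object TestResources {
  // Resolve a test resource via the classpath rather than a hardcoded path,
  // so tests behave the same whether launched from sbt or from IntelliJ.
  def uriOf(name: String): java.net.URI = {
    val url = getClass.getClassLoader.getResource(name)
    require(url != null, s"resource not found on classpath: $name")
    url.toURI
  }
}
```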
## Summary Switches to create-summary-dataset, and provides conf-path to summarize-and-upload ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Updated command for creating summary datasets to improve clarity and functionality. - Enhanced configuration handling for summary data uploads. - **Bug Fixes** - Maintained consistent error handling to ensure reliability during execution. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary - Some of the `logger.info` invocations weren't happening. That's mainly because we don't have a logger implementation specified at least in the spark build. This PR adds a `logback` implementation and a basic configuration as such. - This will allow us to see log messages through the command line. I tested this: `sbt "testOnly ai.chronon.spark.test.FeatureWithLabelJoinTest" | grep "== Features DF =="` - I also verified that the dep tree shows the new logger deps were present: ``` sbt spark/dependencyTree [info] welcome to sbt 1.8.2 (Oracle Corporation Java 17.0.2) [info] loading settings for project chronon-build from plugins.sbt ... [info] loading project definition from /Users/thomaschow/zipline-ai/chronon/project [info] loading settings for project root from build.sbt,version.sbt ... [info] resolving key references (13698 settings) ... [info] set current project to chronon (in build file:/Users/thomaschow/zipline-ai/chronon/) [info] spark:spark_2.12:0.1.0-SNAPSHOT [S] [info] +-aggregator:aggregator_2.12:0.1.0-SNAPSHOT [S] [info] | +-api:api_2.12:0.1.0-SNAPSHOT [S] [info] | | +-org.scala-lang.modules:scala-collection-compat_2.12:2.11.0 [S] [info] | | +-org.scala-lang:scala-reflect:2.12.18 [S] [info] | | [info] | +-com.google.code.gson:gson:2.10.1 [info] | +-org.apache.datasketches:datasketches-java:6.1.0 [info] | +-org.apache.datasketches:datasketches-memory:3.0.1 [info] | [info] +-ch.qos.logback:logback-classic:1.2.11 [info] | +-ch.qos.logback:logback-core:1.2.11 [info] | +-org.slf4j:slf4j-api:1.7.36 [info] | [info] +-com.google.guava:guava:33.3.1-jre [info] | +-com.google.code.findbugs:jsr305:3.0.2 [info] | +-com.google.errorprone:error_prone_annotations:2.28.0 [info] | +-com.google.guava:failureaccess:1.0.2 [info] | +-com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-gu.. 
[info] | +-com.google.j2objc:j2objc-annotations:3.0.0 [info] | +-org.checkerframework:checker-qual:3.43.0 [info] | [info] +-jakarta.servlet:jakarta.servlet-api:4.0.3 [info] +-online:online_2.12:0.1.0-SNAPSHOT [S] [info] +-aggregator:aggregator_2.12:0.1.0-SNAPSHOT [S] [info] | +-api:api_2.12:0.1.0-SNAPSHOT [S] [info] | | +-org.scala-lang.modules:scala-collection-compat_2.12:2.11.0 [S] [info] | | +-org.scala-lang:scala-reflect:2.12.18 [S] [info] | | [info] | +-com.google.code.gson:gson:2.10.1 [info] | +-org.apache.datasketches:datasketches-java:6.1.0 [info] | +-org.apache.datasketches:datasketches-memory:3.0.1 [info] | [info] +-com.datadoghq:java-dogstatsd-client:4.4.1 [info] | +-com.github.jnr:jnr-unixsocket:0.36 [info] | +-com.github.jnr:jnr-constants:0.9.17 [info] | +-com.github.jnr:jnr-enxio:0.30 [info] | | +-com.github.jnr:jnr-constants:0.9.17 [info] | | +-com.github.jnr:jnr-ffi:2.1.16 [info] | | +-com.github.jnr:jffi:1.2.23 [info] | | +-com.github.jnr:jnr-a64asm:1.0.0 [info] | | +-com.github.jnr:jnr-x86asm:1.0.2 [info] | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm-commons:7.1 [info] | | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm-util:7.1 [info] | | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm:7.1 [info] | | [info] | +-com.github.jnr:jnr-ffi:2.1.16 [info] | | +-com.github.jnr:jffi:1.2.23 [info] | | +-com.github.jnr:jnr-a64asm:1.0.0 [info] | | +-com.github.jnr:jnr-x86asm:1.0.2 [info] | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm-commons:7.1 [info] | | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm-util:7.1 [info] | | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | | +-org.ow2.asm:asm:7.1 [info] | | | | [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm:7.1 [info] | | [info] | +-com.github.jnr:jnr-posix:3.0.61 [info] | +-com.github.jnr:jnr-constants:0.9.17 [info] | +-com.github.jnr:jnr-ffi:2.1.16 [info] | +-com.github.jnr:jffi:1.2.23 [info] | +-com.github.jnr:jnr-a64asm:1.0.0 [info] | +-com.github.jnr:jnr-x86asm:1.0.2 [info] | +-org.ow2.asm:asm-analysis:7.1 [info] | | +-org.ow2.asm:asm-tree:7.1 [info] | | +-org.ow2.asm:asm:7.1 [info] | | [info] | +-org.ow2.asm:asm-commons:7.1 [info] | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | 
+-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm:7.1 [info] | | [info] | +-org.ow2.asm:asm-tree:7.1 [info] | | +-org.ow2.asm:asm:7.1 [info] | | [info] | +-org.ow2.asm:asm-util:7.1 [info] | | +-org.ow2.asm:asm-analysis:7.1 [info] | | | +-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm-tree:7.1 [info] | | | +-org.ow2.asm:asm:7.1 [info] | | | [info] | | +-org.ow2.asm:asm:7.1 [info] | | [info] | +-org.ow2.asm:asm:7.1 [info] | [info] +-com.fasterxml.jackson.core:jackson-core:2.15.2 [info] +-com.fasterxml.jackson.core:jackson-databind:2.15.2 [info] | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2 [info] | +-com.fasterxml.jackson.core:jackson-core:2.15.2 [info] | [info] +-com.fasterxml.jackson.module:jackson-module-scala_2.12:2.15.2 [S] [info] | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2 [info] | +-com.fasterxml.jackson.core:jackson-core:2.15.2 [info] | +-com.fasterxml.jackson.core:jackson-databind:2.15.2 [info] | | +-com.fasterxml.jackson.core:jackson-annotations:2.15.2 [info] | | +-com.fasterxml.jackson.core:jackson-core:2.15.2 [info] | | [info] | +-com.thoughtworks.paranamer:paranamer:2.8 [info] | [info] +-com.github.ben-manes.caffeine:caffeine:3.1.8 [info] | +-com.google.errorprone:error_prone_annotations:2.21.1 (evicted by: 2.28.. [info] | +-com.google.errorprone:error_prone_annotations:2.28.0 [info] | +-org.checkerframework:checker-qual:3.37.0 (evicted by: 3.43.0) [info] | +-org.checkerframework:checker-qual:3.43.0 [info] | [info] +-net.jodah:typetools:0.6.3 [info] +-org.rogach:scallop_2.12:5.1.0 [S] [info] +-org.scala-lang.modules:scala-java8-compat_2.12:1.0.2 [S] ``` - Additional steps are required for Intellij to behave the same way. I needed to configure the classpath `-cp chronon.spark` in the run configuration: <img width="953" alt="Screenshot 2024-11-21 at 3 34 50 PM" src="https://github.com/user-attachments/assets/aebbc466-a207-43d0-9f6f-a9bfa811eb66"> and same for `ScalaTest` . I updated the local setup to reflect this: https://docs.google.com/document/d/1k9_aQ3tkW5wvzKyXSsWWPK6HZxX4t8zVPVThODpZqQs/edit?tab=t.0#heading=h.en6opahtqp7u ## Checklist - [x] Added Unit Tests - [x] Covered by existing CI - [x] Integration tested - [x] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a new logging configuration with a `logback.xml` file for enhanced logging capabilities. - Added support for overriding dependencies in the build configuration for improved dependency management. - **Bug Fixes** - Ensured consistent logging library versions across the project to avoid potential conflicts. - **Chores** - Streamlined dependency declarations for better organization within the build configuration. - Improved logging feedback during the build process. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
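For reference, the dependency side of this change boils down to giving the spark module a concrete SLF4J backend. A minimal sbt sketch follows (version taken from the dependency tree above); the `logback.xml` configuration the PR also adds is omitted here.

```scala
// build.sbt (sketch): logback-classic provides the SLF4J binding so that
// logger.info output actually shows up when running tests from the CLI.
libraryDependencies += "ch.qos.logback" % "logback-classic" % "1.2.11"
```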
## Summary Some updates to get our Flink jobs running on the Etsy side: * Configure schema registry via host/port/scheme instead of URL * Explicitly set the task slots per task manager * Configure checkpoint directory based on teams.json ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [X] Integration tested - Kicked off the job on the Etsy cluster and confirmed it's up and running - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced an enhanced configuration option for Flink job submissions by adding a state URI parameter for improved job state management. - Expanded schema registry configuration, enabling greater flexibility with host, port, and scheme settings. - **Chores** - Adjusted logging levels and refined error messaging to support better troubleshooting. - **Documentation** - Updated configuration guidance to aid in setting up schema registry integration. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
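As a small Scala sketch of what configuring the schema registry by parts amounts to; the case class and field names here are hypothetical, not the actual Flink job config keys.

```scala
// Hypothetical sketch: assemble the registry URL from host/port/scheme parts.
case class SchemaRegistryConf(host: String, port: Int, scheme: String = "https") {
  def url: String = s"$scheme://$host:$port"
}

// e.g. SchemaRegistryConf("schema-registry.internal", 8081).url
//      == "https://schema-registry.internal:8081"
```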
## Summary
- This migrates over our artifact upload + run.py to leverage
bazel-built jars. Only for the batch side for now, streaming will
follow.
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Chores**
- Updated default JAR file names for online and Dataproc submissions.
- Migrated build process from `sbt` to `bazel` for GCP artifact
generation.
- Added new `submitter` binary target for Dataproc submission.
- Added dependency for Scala-specific features of the Jackson library in
the online library.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: Thomas Chow <[email protected]>
## Summary
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Chores**
- Streamlined the build process with more consistent naming conventions
for cloud deployment artifacts.
- **New Features**
- Enhanced support for macOS environments by introducing
platform-specific handling during the build, ensuring improved
compatibility.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Co-authored-by: Thomas Chow <[email protected]>
## Summary Adds missing dependencies for the flink module coming from our recent changes, to keep it in sync with sbt. Tested locally by running Flink jobs from DataprocSubmitterTest. ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Expanded Kafka integration with enhanced authentication, client functionality, and Protobuf serialization support. - Improved JSON processing support for Scala-based operations. - Adjusted dependency versions to ensure better compatibility and stability with Kafka and cloud services. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Newer version with better ergonomics, and more maintainable codebase - smaller files, simpler logic - with little indirection etc with user facing simplification - [x] always compile the whole repo - [x] teams thrift and teams python - [x] compile context as its own object - [ ] integration test on sample - [ ] progress bar for compile - [ ] sync to remote w/ progress bar - [ ] display workflow - [ ] display workflow progress ## Checklist - [x] Added Unit Tests - [ ] Covered by existing CI - [x] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New CLI Commands** - Introduced streamlined commands for synchronizing, backfilling, and deploying operations for easier management. - **Enhanced Logging** - Improved colored, structured log outputs for clearer real-time monitoring and debugging. - **Configuration & Validation Upgrades** - Strengthened configuration management and validation processes to ensure reliable operations. - Added a comprehensive validation framework for Chronon API thrift objects. - **Build & Infrastructure Improvements** - Transitioned to a new container base and modernized testing/build systems for better performance and stability. - **Team & Utility Enhancements** - Expanded team configuration options and refined utility processes to streamline overall workflows. - Introduced new data classes and methods for improved configuration and compilation context management. - Enhanced templating functionality for dynamic replacements based on object properties. - Improved handling of Git operations and enhanced error logging for better traceability. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Ken Morton <[email protected]> Co-authored-by: Kumar Teja Chippala <[email protected]> Co-authored-by: Kumar Teja Chippala <[email protected]>
## Summary Update the Flink job code on the tiling path to use the TileKey. I haven't wired up the KV store side of things yet (can do the write and read side of the KV store collaboratively with Thomas as they need to go together to keep the tests happy). The tiling version of the Flink job isn't in use so these changes should be safe to go and keeps things incremental. ## Checklist - [ ] Added Unit Tests - [X] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Added a utility to determine the start timestamp for a defined time window. - **Refactor/Enhancements** - Streamlined time window handling by providing a default one-day resolution when none is specified. - Improved tiled data processing with consistent tiling window sizing and enriched metadata management. - **Tests** - Updated integration tests to validate the new tile processing and time window behavior. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
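A minimal Scala sketch of the window-start utility described above, assuming epoch-millisecond timestamps and a default one-day resolution; the real implementation works off the `TileKey` and window types in the repo.

```scala
// Sketch: round a timestamp down to the start of its containing tile window.
def windowStart(tsMillis: Long, windowMillis: Long = 24L * 60 * 60 * 1000): Long =
  tsMillis - (tsMillis % windowMillis)

// windowStart(1700000123456L) returns midnight (UTC) of the day containing the timestamp.
```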
## Summary Updated dev notes with instructions for Bazel setup and some useful commands. Also updated bazel target names so the intermediate/uber jar names don't conflict. ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Refactor** - Streamlined naming conventions and updated dependency references across multiple project modules for improved consistency. - **Documentation** - Expanded build documentation with a new, comprehensive Bazel Setup section detailing configuration, caching, testing, and deployment instructions. - **Build System Enhancements** - Introduced updated source-generation rules to support future multi-language integration and more robust build workflows. - **Workflow Updates** - Modified test target names in CI workflows to reflect updated naming conventions, enhancing clarity and consistency in test execution. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
… fix (#322) ## Summary Updated dev notes with Bazel installation instructions and java error fix ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Documentation** - Enhanced setup guidance for Bazel installation on both Mac and Linux. - Provided clear instructions for resolving Java-related issues on Mac. - Updated testing procedures by replacing previous instructions with streamlined Bazel commands. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Increased timeout and java heap size for spark tests to avoid flaky test failures ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Tests** - Extended test timeout settings to 900 seconds for enhanced testing robustness. - Updated job names and workflow references for better clarity and consistency in testing workflows. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Add jvm_binary targets for service and hub modules to build final assembly jars for deployment ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update
## Summary ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Chores** - Updated the dependency supporting Apache Spark functionality to boost backend data processing efficiency. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Remove the flink streaming scala dependency, as we no longer need it; otherwise we run into a runtime error saying the flink-shaded-guava package is not found. ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved dependency handling to reduce the risk of runtime errors such as class incompatibilities. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Solves sync failures ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Chores** - Streamlined the Java build configuration by removing legacy integration and testing support. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Hit some errors as our Spark deps pull in rocksdbjni 8.3.2 whereas we expect an older version in Flink (6.20.3-ververica-2.0). As we rely on user class first it seems like this newer version gets priority and when Flink is closing tiles we hit an error - ``` 2025-02-05 21:14:53,614 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Tiling for etsy.listing_canary.actions_v1 -> (Tiling Side Output Late Data for etsy.listing_canary.actions_v1, Avro conversion for etsy.listing_canary.actions_v1 -> async kvstore writes for etsy.listing_canary.actions_v1 -> Sink: Metrics Sink for etsy.listing_canary.actions_v1) (2/3) (a107444db4dad3eb79d9d02631d8696e_5627cd3c4e8c9c02fa4f114c4b3607f4_1_56) switched from RUNNING to FAILED on container_1738197659103_0039_01_000004 @ zipline-canary-cluster-w-1.us-central1-c.c.canary-443022.internal (dataPort=33465). java.lang.NoSuchMethodError: 'void org.rocksdb.WriteBatch.remove(org.rocksdb.ColumnFamilyHandle, byte[])' at org.apache.flink.contrib.streaming.state.RocksDBWriteBatchWrapper.remove(RocksDBWriteBatchWrapper.java:105) ``` Yanked it out from the two jars and confirmed that the Flink job seems to be running fine + crossing over across hours (and hence tile closures). ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [X] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Made internal adjustments to dependency management to improve compatibility between libraries and enhance overall application stability. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
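The PR itself strips the class from the two assembly jars. As a sketch of the equivalent dependency-level fix, assuming an sbt-style build (which may not match the actual Bazel setup), one could exclude the newer rocksdbjni so Flink's bundled 6.20.3-ververica-2.0 wins at runtime:

```scala
// build.sbt (sketch): keep Spark's rocksdbjni 8.x out of the Flink assembly.
excludeDependencies += ExclusionRule("org.rocksdb", "rocksdbjni")
```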
Reverts #330 It's breaking our build as we are using `java_test_suite` for service_commons module <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced configuration to support improved Java testing capabilities. - Expanded build system functionality with additional integrations for JVM-based test suites. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary Some of the benefits using LayerChart over echarts/uplot - Broad support of chart types (cartesian, polar, hierarchy, graph, force, geo) - Simplicity in setup and customization (composable chart components) - Responsive charts, both for viewport/container, and also theming (light/dark, etc) - Flexibility in design/stying (CSS variables, classes, color scales, etc) including transitions - Ability to opt into canvas or svg rendering context as the use case requires. LayerChart's canvas support also has CSS variable/styling support as well (which is unique as far as I'm aware). Html layers are also available, which are great for multiline text (with truncation). ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced new interactive charts, including `FeaturesLineChart` and `PercentileLineChart`, enhancing data visualization with detailed tooltips. - **Refactor** - Replaced legacy ECharts components with LayerChart components, streamlining chart interactions and state management. - **Chores** - Updated dependency configurations and Tailwind CSS settings for improved styling and performance. - Removed unused ECharts-related components and functions to simplify the codebase. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --- - To see the specific tasks where the Asana app for GitHub is being used, see below: - https://app.asana.com/0/0/1209163154826936 <!-- av pr metadata This information is embedded by the av CLI when creating PRs to track the status of stacks when using Aviator. Please do not delete or edit this section of the PR. ``` {"parent":"main","parentHead":"","trunk":"main"} ``` --> --------- Co-authored-by: Sean Lynch <[email protected]>
## Summary Allows connecting to `app` docker container on `localhost:5005` via IntelliJ. See [tutorial](https://www.jetbrains.com/help/idea/tutorial-remote-debug.html) for more details, but the summary is - Recreate the docker containers (`docker-init/build.sh --all`) - This will pass the `-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005` CLI arguments and expose `5005` on the host (i.e. your laptop) - You will see `Listening for transport dt_socket at address: 5005` in the container logs  - In IntelliJ, add a new `Remote JVM Debug` config with default settings  - Set a breakpoint and then trigger it (from the frontend or calling endpoints directly)  This has been useful to understand the [java.lang.NullPointerException](https://app.asana.com/0/home/1208932362205799/1209321714844239) error when fetching summaries for some columns/features (ex. `dim_merchant_account_type`, `dim_merchant_country`)  ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- av pr metadata This information is embedded by the av CLI when creating PRs to track the status of stacks when using Aviator. Please do not delete or edit this section of the PR. ``` {"parent":"main","parentHead":"","trunk":"main"} ``` --> Co-authored-by: Sean Lynch <[email protected]>
## Summary Added a new workflow to verify the bazel config setup. This essentially validates all our bazel config by pulling all the necessary dependencies for all targets without actually building them, which is similar to the `Sync project` option in IntelliJ. This should help us catch errors in our bazel config setup in CI. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Added a new automated CI workflow to run configuration tests with improved concurrency management. - Expanded CI triggers to include additional modules for broader testing coverage. - Tests - Temporarily disabled a dynamic class loading test in the cloud integrations to improve overall test stability. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary The codebase previously operated under the assumption that partition listing is a cheap operation. It is not the case for GCS Format - partition listing is expensive on GCS external tables. ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved error handling to provide clearer messaging when data retrieval issues occur. - **Refactor** - Streamlined internal processing for data partitions, resulting in more consistent behavior. - Enhanced logging during data scanning for easier troubleshooting. - Simplified logic for handling intersected ranges, ensuring consistent definitions. - Reduced the volume of test data for improved performance and resource utilization during tests. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary The default column reader batch size is 4096, which reads that many rows into the memory buffer at once. That causes OOMs on large columns; for Catalyst we only need to read one row at a time, and for interactive use we set the limit to 16. Tested on Etsy data. ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Enhanced data processing performance by adding an optimized configuration for reading Parquet files in Spark sessions. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
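For reference, a minimal Spark session sketch showing the knob in question, `spark.sql.parquet.columnarReaderBatchSize` (whose Spark default is 4096); where exactly Chronon sets it may differ.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: cap how many rows the vectorized Parquet reader buffers per batch,
// so very large columns don't blow up the heap during interactive reads.
val spark = SparkSession
  .builder()
  .appName("interactive-session")
  .master("local[*]")
  .config("spark.sql.parquet.columnarReaderBatchSize", "16")
  .getOrCreate()
```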
## Summary Regarding https://app.asana.com/0/1208277377735902/1209321714844239: nulls are no longer put into the histogram array; instead we use `Constants.magicNullLong`. We will have to filter this out of the charts as an empty value in the FE. ## Checklist - [ ] Added Unit Tests - [ ] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced an additional summary retrieval for merchant account types, providing enhanced insights. - **Bug Fixes** - Improved data processing reliability by substituting missing values with default placeholders, ensuring more consistent results. - Enhanced handling of null values in histogram data for more accurate representation. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
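A small Scala sketch of the substitution described above; `Constants.magicNullLong` is the repo's sentinel, so here it is passed in as a parameter rather than referenced directly.

```scala
// Sketch: replace nulls in histogram values with the magic-null sentinel so the
// serialized array contains no nulls; the frontend then filters the sentinel out.
def withSentinel(values: Seq[java.lang.Long], sentinel: Long): Seq[Long] =
  values.map(v => if (v == null) sentinel else v.longValue())
```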
## Summary
## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Documentation**
- Added a new section for connecting remotely to a Java process in
Docker using IntelliJ's remote debugging feature.
- Clarified commit message formatting instructions for better
consistency.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- av pr metadata
This information is embedded by the av CLI when creating PRs to track
the status of stacks when using Aviator. Please do not delete or edit
this section of the PR.
```
{"parent":"main","parentHead":"","trunk":"main"}
```
-->
---------
Co-authored-by: Sean Lynch <[email protected]>
Actionable comments posted: 0
♻️ Duplicate comments (2)
service_commons/src/main/java/ai/chronon/service/RouteHandlerWrapper.java (2)
240-262: 🛠️ Refactor suggestion: Add input validation and field limit.
Previous null check concern still applies.
Add field limit to prevent excessive processing:
     public static Map<String, String> convertPojoToMap(Object pojo) {
    +    if (pojo == null) {
    +        throw new IllegalArgumentException("Input POJO cannot be null");
    +    }
         Map<String, String> result = new HashMap<>();
         Class<?> pojoClass = pojo.getClass();
         // Get all getters
         Map<String, Method> getters = Arrays.stream(pojoClass.getMethods())
                 .filter(RouteHandlerWrapper::isGetter)
    +            .limit(100) // Prevent excessive field processing
                 .collect(Collectors.toMap(RouteHandlerWrapper::getFieldNameFromGetter, method -> method));
288-333: 🛠️ Refactor suggestion: Add size limits and handle nested structures.
Previous map size limit concern still applies.
Add list size limit and handle nested structures:
         // Handle Lists
         if (List.class.isAssignableFrom(valueClass)) {
    +        if (((List<?>) value).size() > 1000) {
    +            throw new IllegalArgumentException("List size exceeds maximum limit of 1000 entries");
    +        }
             return ((List<?>) value).stream()
                     .map(RouteHandlerWrapper::convertToString)
    +                .limit(1000)
                     .collect(Collectors.joining(","));
         }
🧹 Nitpick comments (1)
service_commons/src/main/java/ai/chronon/service/RouteHandlerWrapper.java (1)
273-286: Add method name validation. Add prefix validation to prevent processing invalid method names.
     private static String getFieldNameFromGetter(Method method) {
         String methodName = method.getName();
    +    if (!methodName.startsWith("get") && !methodName.startsWith("is")) {
    +        throw new IllegalArgumentException("Method name must start with 'get' or 'is': " + methodName);
    +    }
         String fieldName;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (2)
- `.bazelrc.local` (1 hunks)
- `service_commons/src/main/java/ai/chronon/service/RouteHandlerWrapper.java` (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- .bazelrc.local
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: non_spark_tests
- GitHub Check: scala_compile_fmt_fix
- GitHub Check: enforce_triggered_workflows
- GitHub Check: build-and-push
🔇 Additional comments (1)
service_commons/src/main/java/ai/chronon/service/RouteHandlerWrapper.java (1)
264-271: LGTM! Correctly implements Java Bean getter validation.
nikhil-zlai
left a comment
@kumar-zlai can you take this over - we just want the agent code pushed in - please kill everything else.
Summary
Hodgepodge of necessary changes - will clean up in a bit
Checklist
Summary by CodeRabbit