Python compile and API skeleton/docs #113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Walkthrough

This pull request enhances the Chronon CLI documentation and repository management functionality. It introduces comprehensive documentation for CLI commands.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (2)
cli_readme.md (2)
4-10: Enhance Plan command documentation clarity

The explanation of the `plan` command could be more precise and include:
- What specific types of changes are tracked (schema changes, transformations, etc.)
- A more structured example with before/after scenarios
- Clearer explanation of environment behavior
Consider expanding the explanation like this:
The `plan` command analyzes and reports:
- Schema changes (added/removed columns)
- Transformation logic changes
- Dependency changes
- Configuration updates

Example:
```bash
# Initial GroupBy configuration
zipline plan
> No changes detected

# After adding a new column 'revenue'
zipline plan
> Changes detected:
> - GroupBy 'daily_metrics': Added column 'revenue'
```
6-6: Fix grammar and style issues

Several grammar improvements are needed:
- Line 6: Separate "however" with semicolons or periods
- Line 8: Add comma after "By default"
- Line 21: Change "This produces" to "These produce"
- Line 39: Remove "of" for conciseness

Apply these corrections:

```diff
-plan, you would see no change to your `GroupBy` `backfill` plan, however, there would still be
+plan, you would see no change to your `GroupBy` `backfill` plan; however, there would still be
-By default it is the branch
+By default, it is the branch
-This produces computed values
+These produce computed values
-it will run for all of the `GroupBy`s
+it will run for all the `GroupBy`s
```

Also applies to: 8-8, 21-21, 39-39
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
cli_readme.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
cli_readme.md
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to your GroupBy backfill plan, however, there would still be a column addition ...
(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...
(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set to prod in which c...
(HOWEVER_SENTENCE)
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for a Join, it will run for all of the GroupBy`s included in that join, as we...
(ALL_OF_THE)
🪛 Markdownlint (0.35.0)
cli_readme.md
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
43-43: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
49-49: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
# Plan

Describes the logical changes in your current environment versus the last run for each of the changed entities.

For example, if you add a column to a `GroupBy` and call `plan`, you'll see that change reflected as a column addition. If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan, however, there would still be a column addition reflected in your `upload` plan.

`plan` is also always run against a particular environment. By default it is the branch which you are authoring on, however, it can also be set to `prod` in which case it will display the plan of your local changes against the prod environment, reflecting what would happen should you merge and deploy your change.

Note that `plan` does not fully describe any particular computation because it is agnostic of date ranges and other computation-specific arguments. To see a full compute plan for a given `backfill`, for example, you can call the `backfill` command with the `--plan` argument.

### Usage: `zipline plan [OPTIONS]`

Options:
```
--env    the environment against which to compare the local state. Can be set to `prod` or a different `branch_id`. Defaults to the current branch if one is set, else `prod`.
```

# Backfill

Runs a backfill for the specified entity and date range. This produces computed values for the `Entity`s defined transformation. Commonly used with `Join` to produce point-in-time correct historical data of feature values. For `GroupBy`s it produces snapshots of values as of some boundary (usually midnight on each day in the range), and for `StagingQuery` it simply runs the provided SQL to produce output in the range.

### Usage: `zipline backfill ENTITY_ID [OPTIONS]`

### Valid entity types: `GroupBy`, `Join`, `StagingQuery`

Options:

```
--env                the environment to backfill into. Defaults to the current branch if one is set, otherwise is a required argument. If set to `prod` then it will overwrite production tables - use with caution.
--start              start date for which you want data backfilled for this entity. Defaults to the configured start date.
--end                end date for which you want data backfilled for this entity. Defaults to today - 2 days.
--force-recompute    recomputes the backfill for the specified entity, even if the date range is already present. For Joins, also recomputes all join_part intermediate tables. Defaults to false.
--plan               visualizes the computation entailed in this backfill, but does not start compute. Useful to sanity check a job prior to running. If everything looks ok, then rerun the command but omit this flag to begin the job.
```

# Upload

Populates the online serving index with data for the specified `Entity`. If run for a `Join`, it will run for all of the `GroupBy`s included in that join, as well as run the `Join` metadata upload job, which is required for fetching data for the `Join`.

After calling `Upload` for any `Entity`, you can then call `fetch` to get values.

### Usage: `zipline upload ENTITY_ID [OPTIONS]`

### Valid entity types: `GroupBy`, `Join`

Options:

```
--ds        The date to use for the batch upload. Defaults to
--stream    Only applies to `GroupBy`s that use a streaming source. Runs the streaming job after the batch upload completes (or only runs the streaming job if the batch upload is already completed for the given `ds`).
--plan      visualizes the computation entailed in this upload, but does not start compute. Useful to sanity check a job prior to running. If everything looks ok, then rerun the command but omit this flag to begin the job.
```

# Fetch

Fetches data for the given `Entity` and keys. Useful for testing online serving.
### Usage: `zipline fetch ENTITY_ID [OPTIONS]`
### Valid entity types: `GroupBy`, `Join`

Options:

```
--keys    the keys to use in the fetch request, map of key name to value. Required argument.
```
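The `--keys` argument above is described as a map of key name to value, but the doc doesn't specify the CLI syntax. As a hypothetical sketch (the `name=value,name2=value2` format and `parse_keys` helper are invented for illustration, not Zipline's actual API), a client might parse it like this:

```python
def parse_keys(raw: str) -> dict:
    """Parse a "k1=v1,k2=v2" string into {"k1": "v1", "k2": "v2"}."""
    pairs = {}
    for item in raw.split(","):
        if "=" not in item:
            raise ValueError(f"expected key=value, got {item!r}")
        name, value = item.split("=", 1)  # split on first '=' only
        pairs[name.strip()] = value.strip()
    return pairs
```

A fetch invocation would then pass the resulting dict as the request keys, e.g. `parse_keys("user_id=123,region=us")`.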
🛠️ Refactor suggestion
Improve document structure and formatting
The document needs some structural improvements:
- Add a main title at the top (e.g., "Zipline CLI Documentation")
- Fix heading hierarchy - use consistent levels (h1 for main title, h2 for commands, h3 for subsections)
- Add language specifiers to code blocks (e.g., ```bash for command examples)
Here's how to fix the heading structure:

```diff
+# Zipline CLI Documentation
+
-# Plan
+## Plan
-### Usage: `zipline plan [OPTIONS]`
+#### Usage: `zipline plan [OPTIONS]`
-# Backfill
+## Backfill
-### Usage: `zipline backfill ENTITY_ID [OPTIONS]`
+#### Usage: `zipline backfill ENTITY_ID [OPTIONS]`
```

And for code blocks:

````diff
-```
+```bash
 --env    the environment against which to compare...
````
Add critical operational details to Backfill command
The backfill documentation is missing important operational details:
- Expected date formats for --start and --end
- Error handling and recovery procedures
- Concurrent execution behavior
- Resource requirements and execution time expectations
Please add these missing details. For example:

```
Options:
  --start    Start date in YYYY-MM-DD format
  --end      End date in YYYY-MM-DD format

Error Handling:
- If a backfill fails, it can be safely retried
- Use --force-recompute to restart a failed job from the beginning

Performance:
- Multiple backfills can run concurrently for different entities
- Large date ranges may take several hours to complete
```
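The date handling suggested in this comment could be sketched as follows. This is illustrative only, assuming the YYYY-MM-DD format proposed above and the "today minus 2 days" default for `--end` described in the readme; `resolve_range` is an invented name, not Zipline's API:

```python
from datetime import date, timedelta
from typing import Optional, Tuple

def resolve_range(start: str, end: Optional[str] = None) -> Tuple[date, date]:
    """Validate --start/--end strings and apply the default end date."""
    s = date.fromisoformat(start)  # raises ValueError on a malformed date
    e = date.fromisoformat(end) if end else date.today() - timedelta(days=2)
    if s > e:
        raise ValueError(f"start {s} is after end {e}")
    return s, e
```

Failing fast on a malformed date before launching a long batch job matches the spirit of the review: surface format errors at the CLI boundary rather than hours into compute.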
🪛 LanguageTool
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
🪛 Markdownlint (0.35.0)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix incomplete option description and add missing details
The --ds option description is cut off, and critical operational details are missing.
Fix the incomplete description and add missing details:
```diff
- --ds The date to use for the batch upload. Defaults to
+ --ds The date to use for the batch upload. Defaults to current date. Format: YYYY-MM-DD
+
+ Important Notes:
+ - Ensure backfill is complete before uploading
+ - Streaming jobs will process data after the specified --ds date
+ - Data consistency checks are performed before upload
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested change:

```
--ds        The date to use for the batch upload. Defaults to current date. Format: YYYY-MM-DD

Important Notes:
- Ensure backfill is complete before uploading
- Streaming jobs will process data after the specified --ds date
- Data consistency checks are performed before upload

--stream    Only applies to `GroupBy`s that use a streaming source. Runs the streaming job after the batch upload completes (or only runs the streaming job if the batch upload is already completed for the given `ds`).
--plan      visualizes the computation entailed in this upload, but does not start compute. Useful to sanity check a job prior to running. If everything looks ok, then rerun the command but omit this flag to begin the job.
```
### Usage: `zipline upload ENTITY_ID [OPTIONS]`

### Valid entity types: `GroupBy`, `Join`
Does the user need to specify an entity type if we're already asking for the entity_id? Or do we want to skip a lookup by asking them to specify?
Ah I was thinking they don't need to specify the type, but just documenting it here. If they try on a different entity we would just err.
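The behavior described here — the CLI resolves the entity by id and errs if its type isn't valid for the command — could look roughly like this sketch. `EntityKind`, `REGISTRY`, and `check_entity` are invented names for illustration, not actual Zipline internals:

```python
from enum import Enum

class EntityKind(Enum):
    GROUP_BY = "GroupBy"
    JOIN = "Join"
    STAGING_QUERY = "StagingQuery"

# Illustrative lookup table: entity id -> kind. In practice this would come
# from the compiled entity metadata.
REGISTRY = {"daily_metrics": EntityKind.GROUP_BY}

def check_entity(entity_id: str, allowed: set) -> EntityKind:
    """Resolve the entity's kind and err if it isn't valid for the command."""
    kind = REGISTRY.get(entity_id)
    if kind is None:
        raise KeyError(f"unknown entity {entity_id!r}")
    if kind not in allowed:
        raise TypeError(f"{entity_id!r} is a {kind.value}, not valid for this command")
    return kind
```

This matches the reviewer's point: the user never passes the type; the "Valid entity types" lines in the readme just document which kinds each command accepts.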
cli_readme.md (Outdated)
Options:
```
--env    the environment against which to compare the local state. Can be set to `prod` or a different `branch_id`. Defaults to the current branch if one is set, else `prod`.
```
It looks like branches and env are used interchangeably here - I think what we're really shooting for is env rather than branch. The distinction is that these environments are global (and probably lower cardinality than literal git "branches"), and multiple users would develop against the same environment, and their changes would be seen by others. Does that match your understanding?
Ok yeah that's a good simplification, we should update accordingly.
env is too general imo - users and our code will be referring to git branches anyway - and we probably also will map env to branch 1:1. I think branch is a simpler and easier term to use.
How do we capture legit environments like qa? The branch master in qa and master in prod in most companies tend to run slightly different jobs (qa master might be test GroupBys / Joins as the prod datasets aren't present in qa).
Actionable comments posted: 2
🧹 Nitpick comments (2)
cli_readme.md (2)
21-21: Fix grammar in description

Change "This produces" to "These produce" for better grammar as it refers to multiple computed values.

```diff
-This produces computed values for the `Entity`s defined transformation.
+These produce computed values for the `Entity`s defined transformation.
```
🪛 LanguageTool
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...(THIS_NNS_VB)
39-39: Improve conciseness

Remove unnecessary words while maintaining clarity.

```diff
-If run for a `Join`, it will run for all of the `GroupBy`s included in that join
+If run for a `Join`, it will run for all `GroupBy`s included in that join
```
🪛 LanguageTool
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for aJoin, it will run for all of theGroupBy`s included in that join, as we...(ALL_OF_THE)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
cli_readme.md (1 hunks)
🔇 Additional comments (2)
cli_readme.md (2)
1-68: Improve document structure and formatting
The document structure needs improvement:
- Add a main title (e.g., "Zipline CLI Documentation")
- Fix heading hierarchy (use h1 for title, h2 for commands, h3 for subsections)
- Add language specifiers to code blocks (e.g., ```bash)
27-35: Add critical operational details
The backfill documentation needs additional operational details:
- Date format specifications
- Error handling procedures
- Performance expectations
- Resource requirements
For example, if you add a column to a `GroupBy` and call `plan`, you'll see that change reflected as a column addition. If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan, however, there would still be a column addition reflected in your `upload` plan.

`plan` is also always run against a particular environment. By default it is the branch which you are authoring on, however, it can also be set to `prod` in which case it will display the plan of your local changes against the prod environment, reflecting what would happen should you merge and deploy your change.
🛠️ Refactor suggestion
Improve clarity and fix grammatical issues
The sentences using "however" need restructuring for better readability. Additionally, based on previous discussions, we should clarify the distinction between environments and branches.
```diff
-If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan, however, there would still be a column addition reflected in your `upload` plan.
+If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan. However, there would still be a column addition reflected in your `upload` plan.
-`plan` is also always run against a particular environment. By default it is the branch which you are authoring on, however, it can also be set to `prod` in which case it will display the plan of your local changes against the prod environment
+`plan` is always run against a particular environment. By default, it runs against your current development environment. However, it can also be set to `prod`, in which case it will display the plan of your local changes against the production environment
```
Suggested change:

For example, if you add a column to a `GroupBy` and call `plan`, you'll see that change reflected as a column addition. If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan. However, there would still be a column addition reflected in your `upload` plan.

`plan` is always run against a particular environment. By default, it runs against your current development environment. However, it can also be set to `prod`, in which case it will display the plan of your local changes against the production environment, reflecting what would happen should you merge and deploy your change.
force-pushed from cda621b to 88149e1 (Compare)
Actionable comments posted: 0
🧹 Nitpick comments (7)
api/py/ai/chronon/api_helpers.py (3)
5-6: Consider passing OUTPUT_ROOT from environment config. Hardcoding "production" may not suit all environments.
8-22: Make target directories configurable or documented. It might be safer to externalize or document the set of target dirs.
72-75: Rename or remove the test function. It only prints the caller file. If it's for debugging, clarify its purpose.
cli_readme.md (4)
6-6: Rephrase “however” for clarity. Example: use a period and start a new sentence instead, per style guidelines.
-..., however, there would ...
+... . However, there would ...
🧰 Tools
🪛 LanguageTool
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to yourGroupBybackfillplan, however, there would still be a column addition ...(HOWEVER_SENTENCE)
8-8: Add comma after “By default”. Enhances sentence readability.
-By default it is
+By default, it is
🧰 Tools
🪛 LanguageTool
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set toprodin which c...(HOWEVER_SENTENCE)
12-12: Adopt consistent heading levels. Use H2 for major sections, H3 for sub-sections. Current usage triggers lint warnings.
Also applies to: 23-23, 43-43, 60-60
🧰 Tools
🪛 Markdownlint (0.37.0)
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time(MD001, heading-increment)
15-15: Specify language for fenced code blocks. Markdownlint suggests adding a language type (e.g., ```bash).
Also applies to: 29-29, 49-49, 66-66
🧰 Tools
🪛 Markdownlint (0.37.0)
15-15: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (3)
api/py/ai/chronon/api_helpers.py (1 hunks)
api/py/ai/chronon/group_by.py (2 hunks)
cli_readme.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
cli_readme.md
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to your GroupBy backfill plan, however, there would still be a column addition ...
(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...
(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set to prod in which c...
(HOWEVER_SENTENCE)
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for a Join, it will run for all of the GroupBy`s included in that join, as we...
(ALL_OF_THE)
🪛 Markdownlint (0.37.0)
cli_readme.md
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
43-43: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
49-49: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (6)
api/py/ai/chronon/api_helpers.py (3)
24-45: Handle errors from extract_and_convert.callback.
Add try-catch or a validation step to handle any failures gracefully.
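A minimal sketch of the suggested guard, assuming a generic callback (the helper name `run_callback_safely` is illustrative, not part of the Chronon API):

```python
import logging

def run_callback_safely(callback, *args, **kwargs):
    """Invoke a compile callback, logging failures instead of raising.

    `callback` is any callable; on error we log the traceback and return
    None so the surrounding compile loop can continue with other objects.
    """
    try:
        return callback(*args, **kwargs)
    except Exception:
        logging.exception("callback failed; skipping this object")
        return None
```

A failing callback is then swallowed and logged rather than aborting the whole compile pass.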
47-65: Ensure variable_name is always defined.
If it's missing, the returned path might be invalid.
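One way to guarantee a defined name, sketched with hypothetical names (`resolve_output_path` and the fallback value are assumptions, not the repo's actual helper):

```python
def resolve_output_path(base_dir, variable_name=None):
    """Build an output path, falling back to a safe default when the
    variable name could not be determined."""
    name = variable_name if variable_name else "unnamed_object"
    return f"{base_dir}/{name}"
```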
67-70: Unused parameters in backfill.
They're never referenced in the function body. Confirm usage or remove them.
api/py/ai/chronon/group_by.py (2)
19-19: Import looks fine.
Ensures backfill is directly accessible. No issues found.
589-590: Binding backfill to group_by is correct.
Enables in-place usage of the method. Looks good.
cli_readme.md (1)
60-60:
Correct usage label.
It should say “fetch” rather than “upload” under the Fetch heading.
-### Usage: `zipline upload ENTITY_ID [OPTIONS]`
+### Usage: `zipline fetch ENTITY_ID [OPTIONS]`
Likely invalid or redundant comment.
🧰 Tools
🪛 Markdownlint (0.37.0)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
Actionable comments posted: 1
🧹 Nitpick comments (4)
api/py/ai/chronon/repo/compilev2.py (2)
52-55: Refactor to a ternary operator.
Shorten the log level assignment:
- if debug:
-     log_level = logging.DEBUG
- else:
-     log_level = logging.INFO
+ log_level = logging.DEBUG if debug else logging.INFO
🧰 Tools
🪛 Ruff (0.8.2)
52-55: Use ternary operator
log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
62-62: Remove unnecessary .keys(). Use if d in FOLDER_NAME_TO_CLASS: for clarity.
- obj_folder_names = [d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS.keys()]
+ obj_folder_names = [d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS]
🧰 Tools
🪛 Ruff (0.8.2)
62-62: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
api/py/ai/chronon/repo/compile.py (1)
94-94: Remove .keys() usage.
- obj_folder_names = [d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS.keys()]
+ obj_folder_names = [d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS]
🧰 Tools
🪛 Ruff (0.8.2)
94-94: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
api/py/ai/chronon/group_by.py (1)
23-23: Remove unused import.
types isn't referenced in the final implementation.
- import types
🧰 Tools
🪛 Ruff (0.8.2)
23-23: types imported but unused
Remove unused import: types
(F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (5)
api/py/ai/chronon/api_helpers.py (1 hunks)
api/py/ai/chronon/group_by.py (2 hunks)
api/py/ai/chronon/repo/compile.py (1 hunks)
api/py/ai/chronon/repo/compilev2.py (1 hunks)
api/py/ai/chronon/utils.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/compile.py
94-94: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
api/py/ai/chronon/repo/compilev2.py
52-55: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
62-62: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
81-81: Undefined name input_path
(F821)
api/py/ai/chronon/group_by.py
23-23: types imported but unused
Remove unused import: types
(F401)
🔇 Additional comments (3)
api/py/ai/chronon/utils.py (1)
187-200: Validate dynamic imports and performance.
Ensure error handling around importlib.import_module(mod_name) in case the module is missing or invalid. Also verify that frequent gc.collect() calls won't impair performance.
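A hedged sketch of the suggested guard around the dynamic import (the wrapper name `safe_import` is illustrative):

```python
import importlib
import logging

def safe_import(mod_name):
    """Import a module by name; return None (and log a warning) if it is
    missing or fails at import time, instead of crashing the caller."""
    try:
        return importlib.import_module(mod_name)
    except ImportError as err:  # ModuleNotFoundError is a subclass
        logging.warning("Could not import %s: %s", mod_name, err)
        return None
```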
api/py/ai/chronon/api_helpers.py (1)
1-39: General checks
This addition looks consistent. No immediate issues found.
api/py/ai/chronon/group_by.py (1)
589-589: Consider enabling the backfill binding.
It is commented out. If it’s intentional, remove the unused code. Otherwise, uncomment it for functionality.
api/py/ai/chronon/repo/compilev2.py
Outdated
elif os.path.isfile(full_input_path):
    assert full_input_path.endswith(
        ".py"
    ), f"Input Path: {input_path} isn't a python file"
Undefined variable input_path.
Replace input_path with the correct variable or define it in scope.
- assert full_input_path.endswith(".py"), f"Input Path: {input_path} isn't a python file"
+ assert full_input_path.endswith(".py"), f"Input Path: {full_input_path} isn't a python file"Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
81-81: Undefined name input_path
(F821)
Actionable comments posted: 0
🧹 Nitpick comments (12)
api/py/ai/chronon/repo/runner.py (2)
7-7: Make OUTPUT_ROOT configurable
- OUTPUT_ROOT = "production"
+ OUTPUT_ROOT = os.getenv("OUTPUT_ROOT", "production")
63-67: Backfill not implemented. Let me know if you'd like a stub implementation or usage example.
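A stub along those lines could look like this (all parameter names are illustrative assumptions, not the project's signature):

```python
def backfill(entity_id, start_date=None, end_date=None):
    """Placeholder backfill stub.

    A real implementation would submit a backfill job for the compiled
    entity over the given date range; for now, fail loudly instead of
    silently doing nothing.
    """
    raise NotImplementedError(
        f"backfill not yet implemented for {entity_id} "
        f"({start_date}..{end_date})"
    )
```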
api/py/ai/chronon/repo/extract_objects.py (2)
Line range hint 25-37: Redundant folder processing
from_folderV2 partly duplicates from_folder; consider merging functionality.
43-66: Error capture. If multiple files fail, consider collating failures into a single error log for better visibility.
api/py/ai/chronon/repo/compile.py (2)
92-92: Drop .keys()
- d in FOLDER_NAME_TO_CLASS.keys()
+ d in FOLDER_NAME_TO_CLASS
123-125: Raise specialized exception. Use a more specific exception type for clarity.
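As an illustration of the "specialized exception" suggestion (the class and function names here are hypothetical, not the repo's actual API):

```python
class ChrononCompileError(RuntimeError):
    """Raised when a config file fails to compile, so callers can catch
    compile failures specifically instead of a bare Exception."""

def fail_compile(path):
    # Raise the narrow type with the offending file in the message.
    raise ChrononCompileError(f"failed to compile {path}")
```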
api/py/ai/chronon/repo/compilev2.py (6)
55-58: Use ternary
- if debug:
-     log_level = logging.DEBUG
- else:
-     log_level = logging.INFO
+ log_level = logging.DEBUG if debug else logging.INFO
🧰 Tools
🪛 Ruff (0.8.2)
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
65-65: Simplify membership check
- if d in FOLDER_NAME_TO_CLASS.keys():
+ if d in FOLDER_NAME_TO_CLASS:
🧰 Tools
🪛 Ruff (0.8.2)
65-65: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
109-109: Unused variable
Remove target_object_written if not used.
🧰 Tools
🪛 Ruff (0.8.2)
109-109: Local variable target_object_written is assigned to but never used
Remove assignment to unused variable target_object_written
(F841)
152-152: Unused variable
Remove e if it's not used.
🧰 Tools
🪛 Ruff (0.8.2)
152-152: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
189-189: Simplify membership check
- for file in compile_errors.keys()
+ for file in compile_errors
🧰 Tools
🪛 Ruff (0.8.2)
189-189: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
273-273: Unused variable
class_name is never used. Remove it.
🧰 Tools
🪛 Ruff (0.8.2)
273-273: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (4)
api/py/ai/chronon/repo/compile.py (1 hunks)
api/py/ai/chronon/repo/compilev2.py (1 hunks)
api/py/ai/chronon/repo/extract_objects.py (3 hunks)
api/py/ai/chronon/repo/runner.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/compile.py
94-94: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
api/py/ai/chronon/repo/compilev2.py
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
65-65: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
109-109: Local variable target_object_written is assigned to but never used
Remove assignment to unused variable target_object_written
(F841)
152-152: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
189-189: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
273-273: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
🔇 Additional comments (3)
api/py/ai/chronon/repo/runner.py (1)
31-31: Remove hardcoded "group_bys" if other types are expected
Ensure this reflects all relevant object types.
api/py/ai/chronon/repo/extract_objects.py (1)
83-83: from_file usage
No issues spotted here.
api/py/ai/chronon/repo/compile.py (1)
98-100: Index safety
path_split[0] could fail if path_split is empty. Verify input handling.
Actionable comments posted: 1
🧹 Nitpick comments (4)
api/py/ai/chronon/repo/compilev2.py (4)
105-106: Remove unused variable. The variable target_object_written is assigned but never used.
- if obj == target_object:
-     target_object_written = True
🧰 Tools
🪛 Ruff (0.8.2)
106-106: Local variable target_object_written is assigned to but never used
Remove assignment to unused variable target_object_written
(F841)
149-150: Use underscore for unused exception. Replace unused exception variable with underscore.
- except Exception as e:
+ except Exception as _:
🧰 Tools
🪛 Ruff (0.8.2)
149-149: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
186-186: Simplify dictionary iteration. Remove redundant .keys() call.
- "\n".join([f"- {os.path.relpath(file, chronon_root_path)}" for file in compile_errors.keys()]) +
+ "\n".join([f"- {os.path.relpath(file, chronon_root_path)}" for file in compile_errors]) +
🧰 Tools
🪛 Ruff (0.8.2)
186-186: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
259-259: Remove unused variable. The variable class_name is assigned but never used.
- class_name = obj_class.__name__
🧰 Tools
🪛 Ruff (0.8.2)
259-259: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
api/py/ai/chronon/repo/compilev2.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/compilev2.py
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
65-65: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
106-106: Local variable target_object_written is assigned to but never used
Remove assignment to unused variable target_object_written
(F841)
149-149: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
186-186: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
259-259: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
🔇 Additional comments (2)
api/py/ai/chronon/repo/compilev2.py (2)
190-202: LGTM!
Clean implementation with proper type hints and nested metadata handling.
233-256: LGTM!
Robust implementation with proper validation and error handling.
Actionable comments posted: 3
🧹 Nitpick comments (4)
api/py/ai/chronon/repo/runner.py (3)
13-23: Consider adding error handling for directory traversal. The while loop could potentially reach the root directory without finding target directories.
  while current_path != os.path.dirname(current_path):  # Stop at root directory
+     if current_path == '/':  # Additional safety check
+         break
      dir_name = os.path.basename(current_path)
31-37: Add type hints for better code maintainability. The function parameters and return values lack type annotations.
-def get_target_compiled_file(zipline_obj):
+def get_target_compiled_file(zipline_obj: Any) -> Tuple[str, str, str]:
102-103: Add logging for better observability. The placeholder implementations should log their status.
- print("\n\n TODO -- Implement \n\n")
+ logging.info("TODO: Implement upload functionality")
api/py/ai/chronon/repo/compilev2.py (1)
55-58: Simplify log level assignment. Use ternary operator for cleaner code.
- if debug:
-     log_level = logging.DEBUG
- else:
-     log_level = logging.INFO
+ log_level = logging.DEBUG if debug else logging.INFO
🧰 Tools
🪛 Ruff (0.8.2)
55-58: Use ternary operator
log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (2)
api/py/ai/chronon/repo/compilev2.py (1 hunks)
api/py/ai/chronon/repo/runner.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/compilev2.py
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
65-65: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
106-106: Local variable target_object_written is assigned to but never used
Remove assignment to unused variable target_object_written
(F841)
149-149: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
186-186: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
259-259: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
🔇 Additional comments (1)
api/py/ai/chronon/repo/compilev2.py (1)
291-292:
Add required arguments to main function.
The function call is missing required arguments.
if __name__ == "__main__":
- extract_and_convert()
+ import argparse
+ parser = argparse.ArgumentParser()
+ parser.add_argument("--chronon-root", required=True)
+ parser.add_argument("--output-root", required=True)
+ parser.add_argument("--target-object", required=True)
+ parser.add_argument("--target-object-file", required=True)
+ args = parser.parse_args()
+ extract_and_convert(
+ args.chronon_root,
+ args.output_root,
+ args.target_object,
+ args.target_object_file
+ )Likely invalid or redundant comment.
api/py/ai/chronon/repo/runner.py
Outdated
module_file, compiled_file, chronon_root = get_target_compiled_file(zipline_obj)
extract_and_convert(chronon_root, chronon_root, zipline_obj, module_file)
# TODO: actually kick off backfill for compiled_file
print("\n\n TODO -- Implement \n\n")
💡 Codebase verification
Add compilation status check before backfill
The extract_and_convert function in compilev2.py already handles compilation failures for the target object by exiting with a failure message. However, the runner code doesn't verify this status.
- Add error handling around the `extract_and_convert` call in `runner.py` to verify compilation success
- Check if `compiled_file` exists before proceeding with backfill
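The compiled-file check could be sketched like this (the helper name is an assumption, not the repo's actual code):

```python
import os

def ensure_compiled_file(compiled_file):
    """Guard before backfill: verify that compilation actually produced
    output on disk, and fail fast with a clear error otherwise."""
    if not os.path.isfile(compiled_file):
        raise FileNotFoundError(f"compiled config missing: {compiled_file}")
    return compiled_file
```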
🔗 Analysis chain
Verify compilation success before proceeding.
The backfill function should verify if compilation was successful.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if compiled file exists
fd -t f "$(basename $compiled_file)" "$(dirname $compiled_file)"
Length of output: 282
Script:
#!/bin/bash
# Find extract_and_convert function implementation
ast-grep --pattern 'def extract_and_convert($$$)'
# Find get_target_compiled_file function implementation
ast-grep --pattern 'def get_target_compiled_file($$$)'
Length of output: 108
Script:
#!/bin/bash
# Search for function implementations
rg "def (extract_and_convert|get_target_compiled_file)" -A 10
Length of output: 2564
nikhil-zlai
left a comment
nits and some refactoring.
I also think coderabbit suggestions on readme are reasonable
api/py/ai/chronon/group_by.py
Outdated
import ai.chronon.api.common.ttypes as common
import ai.chronon.utils as utils
from copy import deepcopy
#from ai.chronon.api_helpers import backfill
cleanup
api/py/ai/chronon/group_by.py
Outdated
import logging
import inspect
import json
import types
cleanup
api/py/ai/chronon/group_by.py
Outdated
derivations=derivations,
)
validate_group_by(group_by)
#group_by.backfill = types.MethodType(backfill, group_by)
cleanup
api/py/ai/chronon/repo/runner.py
Outdated
# Get the file where zipline_obj is defined
# zipline_obj_file = inspect.getmodule(zipline_obj).__file__
# if not zipline_obj_file:
#     raise ValueError("Could not determine file location for zipline_obj")
delete
api/py/ai/chronon/repo/runner.py
Outdated
zipline_obj_module, zipline_var_name = get_mod_and_var_name_from_gc(zipline_obj, zipline_obj_type)

if zipline_obj_module is None or zipline_var_name is None:
    raise ValueError("Could not determine module and variable name for object")

import importlib
module = importlib.import_module(zipline_obj_module)

# find the file of the module
module_file = inspect.getmodule(module).__file__
if not module_file:
    raise ValueError(f"Could not determine file location for module {zipline_obj_module}")
this whole bit is better encapsulated as a dataclass
@dataclass
class ConfigDetails:
    module: str
    variable: str
    file: str
    compiled_file: str

    def __init__(self, obj, cls):
        self.module = ...
        self.variable = ...
        self.file = ...
        self.compiled_file = ...
cli_readme.md
Outdated
Populates the online serving index with data for the specified `Entity`. If run for a `Join`, it will run for all of the `GroupBy`s included in that join, as well as run the `Join` metadata upload job, which is required for fetching data for the `Join`.

After calling `Upload` for any `Entity`, you can then call `fetch` to get values.
we need the streaming job deployed too in some scenarios so we'll need upload + stream (or deploy) prior to fetching.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
cli_readme.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
cli_readme.md
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to your GroupBy backfill plan, however, there would still be a column addition ...
(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...
(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set to prod in which c...
(HOWEVER_SENTENCE)
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for a Join, it will run for all of the GroupBy`s included in that join, as we...
(ALL_OF_THE)
🪛 Markdownlint (0.37.0)
cli_readme.md
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
43-43: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
49-49: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (4)
cli_readme.md (4)
1-1: Add main title and improve document structure. Add a main title (e.g., "Zipline CLI Documentation") at the top.
29-35: Add missing operational details for backfill. Add date formats, error handling, and performance expectations.
🧰 Tools
🪛 Markdownlint (0.37.0)
29-29: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
6-8: 🛠️ Refactor suggestion: Fix grammar and clarify environment terminology
Previous discussions indicate confusion about branches vs environments. Let's clarify this and fix grammar.
Previous discussions indicate confusion about branches vs environments. Let's clarify this and fix grammar.
-For example, if you add a column to a `GroupBy` and call `plan`, you'll see that change reflected as a column addition. If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan, however, there would still be a column addition reflected in your `upload` plan. +For example, if you add a column to a `GroupBy` and call `plan`, you'll see that change reflected as a column addition. If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan. However, there would still be a column addition reflected in your `upload` plan. -`plan` is also always run against a particular environment. By default it is the branch which you are authoring on, however, it can also be set to `prod` in which case it will display the plan of your local changes against the prod environment +`plan` always runs against a specific environment (e.g., development, qa, prod). By default, it uses your current environment. However, when set to `prod`, it will display how your local changes would affect the production environmentLikely invalid or redundant comment.
🧰 Tools
🪛 LanguageTool
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to yourGroupBybackfillplan, however, there would still be a column addition ...(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set toprodin which c...(HOWEVER_SENTENCE)
50-50: ⚠️ Potential issue: Complete the --ds option description
Based on previous discussion, complete the description.
--ds The date to use for the batch upload. Defaults to two days prior to today (UTC)
Likely invalid or redundant comment.
### Usage: `zipline upload ENTITY_ID [OPTIONS]`

### Valid entity types: `GroupBy`, `Join`

Options:

```
--keys the keys to use in the fetch request, map of key name to value. Required argument.
```
Fix command usage and add examples
Command shows incorrect usage and lacks examples.
-### Usage: `zipline upload ENTITY_ID [OPTIONS]`
+### Usage: `zipline fetch ENTITY_ID [OPTIONS]`
+Example:
+```bash
+zipline fetch user_metrics --keys '{"user_id": "12345"}'
+```🧰 Tools
🪛 Markdownlint (0.37.0)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
66-66: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
cli_readme.md
Outdated
Populates the online serving index with data for the specified `Entity`. If run for a `Join`, it will run for all of the `GroupBy`s included in that join, as well as run the `Join` metadata upload job, which is required for fetching data for the `Join`.

After calling `Upload` for any `Entity`, you can then call `fetch` to get values.
🛠️ Refactor suggestion
Clarify streaming job requirements
Add note about streaming job deployment requirements.
-After calling `Upload` for any `Entity`, you can then call `fetch` to get values.
+After calling `Upload` for any `Entity` and ensuring any required streaming jobs are deployed, you can then call `fetch` to get values.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-After calling `Upload` for any `Entity`, you can then call `fetch` to get values.
+After calling `Upload` for any `Entity` and ensuring any required streaming jobs are deployed, you can then call `fetch` to get values.
Actionable comments posted: 0
🧹 Nitpick comments (2)
api/py/ai/chronon/repo/compilev2.py (2)
116-129: Extract online join validation logic. Move validation logic to a separate function for better maintainability.
+def _validate_online_join(obj: Join):
+    offline_gbs = [jp.groupBy.metaData.name for jp in obj.joinParts if not jp.groupBy.metaData.online]
+    assert not offline_gbs, (
+        "You must make all dependent GroupBys `online` if you want to make your join `online`. "
+        "You can do this by passing the `online=True` argument to the GroupBy constructor. "
+        "Fix the following: {}".format(offline_gbs)
+    )
+
146-155: Track ZiplineHub integration TODO. The TODO comment indicates missing ZiplineHub integration for lineage tracking.
Would you like me to create a GitHub issue to track the ZiplineHub integration task?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
api/py/ai/chronon/repo/compilev2.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/compilev2.py
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
67-67: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
113-113: Local variable target_object_written is assigned to but never used
Remove assignment to unused variable target_object_written
(F841)
156-156: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
193-193: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
266-266: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
🔇 Additional comments (3)
api/py/ai/chronon/repo/compilev2.py (3)
156-159: 🛠️ Refactor suggestion: Include error details in warning message.
Add exception details to help with debugging.
```diff
-    except Exception as e:
-
-        _print_warning("Failed to connect to ZiplineHub (check connection and VPN settings). \n\n " +
+    except Exception as e:
+        _print_warning(f"Failed to connect to ZiplineHub: {str(e)}\nCheck connection and VPN settings.\n\n" +
```

Likely invalid or redundant comment.
🧰 Tools
🪛 Ruff (0.8.2)
156-156: Local variable `e` is assigned to but never used. Remove assignment to unused variable `e` (F841)
112-113: 🛠️ Refactor suggestion: Remove unused variable.
Replace with logging statement.
```diff
-    if obj == target_object:
-        target_object_written = True
+    if obj == target_object:
+        logging.info(f"Successfully wrote target object: {target_object}")
```

Likely invalid or redundant comment.
🧰 Tools
🪛 Ruff (0.8.2)
113-113: Local variable `target_object_written` is assigned to but never used. Remove assignment to unused variable `target_object_written` (F841)
298-299: ⚠️ Potential issue: Add required arguments to main function.
The function requires mandatory arguments that are missing.
```diff
 if __name__ == "__main__":
-    extract_and_convert()
+    import argparse
+    parser = argparse.ArgumentParser(description='Compile Chronon repository')
+    parser.add_argument('--chronon-root', required=True, help='Path to chronon root')
+    parser.add_argument('--output-root', required=True, help='Path to output root')
+    parser.add_argument('--target-object', required=True, help='Target object name')
+    parser.add_argument('--target-object-file', required=True, help='Target object file')
+    parser.add_argument('--debug', action='store_true', help='Enable debug logging')
+    args = parser.parse_args()
+
+    extract_and_convert(
+        chronon_root=args.chronon_root,
+        output_root=args.output_root,
+        target_object=args.target_object,
+        target_object_file=args.target_object_file,
+        debug=args.debug
+    )
```

Likely invalid or redundant comment.
Force-pushed from `e70963a` to `03443f8`
Actionable comments posted: 0
🧹 Nitpick comments (7)
api/py/ai/chronon/repo/extract_objects.py (1)
43-64: Consider adding docstring parameters. The docstring should document the `target_file` parameter and return values.

```diff
 def from_folderV2(full_path: str, target_file: str, cls: type):
     """
     Recursively consumes a folder, and constructs a map of
     object qualifier to StagingQuery, GroupBy, or Join
+
+    Args:
+        full_path: Path to the folder to process
+        target_file: Specific file to track errors for
+        cls: Type of object to extract
+
+    Returns:
+        Tuple containing:
+        - Dict mapping object qualifiers to (object, source_file) tuples
+        - Dict mapping files to their errors
+        - Error specific to target_file if any
     """
```

api/py/ai/chronon/repo/runner.py (2)
7-10: Remove unused imports. These folder name constants are imported but never used.

```diff
 from ai.chronon.repo import (
     FOLDER_NAME_TO_CLASS,
-    JOIN_FOLDER_NAME,
-    GROUP_BY_FOLDER_NAME,
-    STAGING_QUERY_FOLDER_NAME,
-    MODEL_FOLDER_NAME,
     OUTPUT_ROOT
 )
```

🧰 Tools
🪛 Ruff (0.8.2)
7-7: `ai.chronon.repo.JOIN_FOLDER_NAME` imported but unused. Remove unused import (F401)
8-8: `ai.chronon.repo.GROUP_BY_FOLDER_NAME` imported but unused. Remove unused import (F401)
9-9: `ai.chronon.repo.STAGING_QUERY_FOLDER_NAME` imported but unused. Remove unused import (F401)
10-10: `ai.chronon.repo.MODEL_FOLDER_NAME` imported but unused. Remove unused import (F401)
62-66: Remove commented code. Delete the unused code block for finding zipline_obj file.

```diff
-    # Get the file where zipline_obj is defined
-    # zipline_obj_file = inspect.getmodule(zipline_obj).__file__
-    # if not zipline_obj_file:
-    #     raise ValueError("Could not determine file location for zipline_obj")
```

api/py/ai/chronon/repo/compilev2.py (4)
55-58: Simplify log level assignment. Use a ternary operator for cleaner code.

```diff
-    if debug:
-        log_level = logging.DEBUG
-    else:
-        log_level = logging.INFO
+    log_level = logging.DEBUG if debug else logging.INFO
```

🧰 Tools
🪛 Ruff (0.8.2)
55-58: Use ternary operator `log_level = logging.DEBUG if debug else logging.INFO` instead of if-else-block. Replace if-else-block with `log_level = logging.DEBUG if debug else logging.INFO` (SIM108)
67-67: Optimize dictionary key check. Use direct key membership test.

```diff
-    obj_folder_names = [d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS.keys()]
+    obj_folder_names = [d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS]
```

🧰 Tools
🪛 Ruff (0.8.2)
67-67: Use `key in dict` instead of `key in dict.keys()`. Remove `.keys()` (SIM118)
133-143: Add TODO tracking for ZiplineHub integration. The placeholder implementation needs tracking.
Would you like me to create a GitHub issue to track the ZiplineHub integration task?
253-264: Remove unused variable. The `class_name` variable is assigned but never used.

```diff
 def _write_obj_as_json(name: str, obj: object, output_file: str, obj_class: type):
-    class_name = obj_class.__name__
     output_folder = os.path.dirname(output_file)
```

🧰 Tools
🪛 Ruff (0.8.2)
254-254: Local variable `class_name` is assigned to but never used. Remove assignment to unused variable `class_name` (F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (4)
api/py/ai/chronon/repo/__init__.py(1 hunks)api/py/ai/chronon/repo/compilev2.py(1 hunks)api/py/ai/chronon/repo/extract_objects.py(3 hunks)api/py/ai/chronon/repo/runner.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- api/py/ai/chronon/repo/init.py
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/runner.py
7-7: ai.chronon.repo.JOIN_FOLDER_NAME imported but unused
Remove unused import
(F401)
8-8: ai.chronon.repo.GROUP_BY_FOLDER_NAME imported but unused
Remove unused import
(F401)
9-9: ai.chronon.repo.STAGING_QUERY_FOLDER_NAME imported but unused
Remove unused import
(F401)
10-10: ai.chronon.repo.MODEL_FOLDER_NAME imported but unused
Remove unused import
(F401)
api/py/ai/chronon/repo/compilev2.py
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
67-67: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
181-181: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
254-254: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
🔇 Additional comments (3)
api/py/ai/chronon/repo/extract_objects.py (1)
Line range hint
25-41: LGTM! Clean implementation of folder traversal. The function correctly processes Python files recursively and handles exceptions appropriately.
api/py/ai/chronon/repo/runner.py (2)
14-28: LGTM! Robust root directory detection. The function correctly traverses up the directory tree to find the Chronon root.
90-93: Add compilation status check before backfill. Verify compilation success before proceeding with backfill.
Actionable comments posted: 3
🧹 Nitpick comments (9)
cli_readme.md (3)
1-2: Add title and introduction. Add a main title and brief introduction to set context.

```diff
+# Zipline CLI Documentation
+
+Command-line interface for managing Zipline data pipelines.
+
 # Plan
```
6-8: Fix sentence structure. Improve readability by properly structuring sentences with "however".

```diff
-If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan, however, there would still be a column addition reflected in your `upload` plan.
+If you run `backfill` for the `GroupBy` and then call `plan` again, you would see no change to your `GroupBy` `backfill` plan. However, the column addition would still be reflected in your `upload` plan.
-`plan` is also always run against a particular environment. By default it is the branch which you are authoring on, however, it can also be set to `prod` in which case it will display the plan of your local changes against the prod environment
+`plan` always runs against a particular environment. By default, it runs against your development environment. However, it can also be set to `prod` to display the plan of your local changes against the production environment.
```

Also applies to: 8-8
🧰 Tools
🪛 LanguageTool
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to yourGroupBybackfillplan, however, there would still be a column addition ...(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set toprodin which c...(HOWEVER_SENTENCE)
15-17: Add language specifiers to code blocks. Add bash language specifier to CLI command examples.
````diff
-```
+```bash
 --branch    the branch against which to compare...
````

Also applies to: 29-35, 49-53, 66-68

🧰 Tools
🪛 Markdownlint (0.37.0)
15-15: Fenced code blocks should have a language specified (MD040, fenced-code-language)

api/py/ai/chronon/repo/extract_objects.py (1)
Line range hint `25-41`: **Improve error handling.** Consider collecting errors in a dictionary and returning them along with results, similar to `from_folderV2`.

```diff
 def from_folder(full_path: str, cls: type, log_level=logging.INFO):
     if full_path.endswith("/"):
         full_path = full_path[:-1]
     python_files = glob.glob(os.path.join(full_path, "**/*.py"), recursive=True)
     result = {}
+    errors = {}
     for f in python_files:
         try:
             result.update(from_file(f, cls, log_level))
         except Exception as e:
-            logging.error(f"Failed to extract: {f}")
-            logging.exception(e)
+            errors[f] = e
-    return result
+    return result, errors
```

api/py/ai/chronon/repo/compilev2.py (2)
128-160: Implement ZiplineHub integration. The function is a placeholder with good error handling but needs implementation.
Would you like me to help implement the ZiplineHub integration for lineage tracking?
253-264: Remove unused variable. The `class_name` variable is assigned but never used.

```diff
 def _write_obj_as_json(name: str, obj: object, output_file: str, obj_class: type):
-    class_name = obj_class.__name__
     output_folder = os.path.dirname(output_file)
```

🧰 Tools
🪛 Ruff (0.8.2)
254-254: Local variable `class_name` is assigned to but never used. Remove assignment to unused variable `class_name` (F841)
api/py/ai/chronon/repo/runner.py (2)
62-66: Remove commented code block. The commented code block should be removed if no longer needed.

```diff
-    # Get the file where zipline_obj is defined
-    # zipline_obj_file = inspect.getmodule(zipline_obj).__file__
-    # if not zipline_obj_file:
-    #     raise ValueError("Could not determine file location for zipline_obj")
```
5-12: Remove unused imports.The following imports are not used:
- JOIN_FOLDER_NAME
- GROUP_BY_FOLDER_NAME
- STAGING_QUERY_FOLDER_NAME
- MODEL_FOLDER_NAME
🧰 Tools
🪛 Ruff (0.8.2)
7-7:
ai.chronon.repo.JOIN_FOLDER_NAMEimported but unusedRemove unused import
(F401)
8-8:
ai.chronon.repo.GROUP_BY_FOLDER_NAMEimported but unusedRemove unused import
(F401)
9-9:
ai.chronon.repo.STAGING_QUERY_FOLDER_NAMEimported but unusedRemove unused import
(F401)
10-10:
ai.chronon.repo.MODEL_FOLDER_NAMEimported but unusedRemove unused import
(F401)
api/py/ai/chronon/utils.py (1)
195-197: Consider performance optimization. The iteration through all module variables could be inefficient for large modules. Consider caching results or using more efficient lookup methods if this becomes a bottleneck.
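One way to act on this caching suggestion, sketched in isolation (illustrative only — `_names_by_id` and `find_var_name_cached` are hypothetical names, not part of this PR; note the cache goes stale if the module namespace changes after the first lookup):

```python
import functools
import types

@functools.lru_cache(maxsize=None)
def _names_by_id(module):
    # Snapshot the module namespace once: map id(value) -> attribute name.
    return {id(value): name for name, value in vars(module).items()}

def find_var_name_cached(module, obj):
    """Return the attribute name bound to obj in module, using the cached snapshot."""
    return _names_by_id(module).get(id(obj))

# Demonstrate with a throwaway module object rather than a real import.
mod = types.ModuleType("demo")
sentinel = object()
mod.answer = sentinel
print(find_var_name_cached(mod, sentinel))  # answer
```

The `id()`-keyed dict gives O(1) lookups after the first scan, at the cost of correctness if the module is mutated later.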
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (6)
api/py/ai/chronon/repo/__init__.py(1 hunks)api/py/ai/chronon/repo/compilev2.py(1 hunks)api/py/ai/chronon/repo/extract_objects.py(3 hunks)api/py/ai/chronon/repo/runner.py(1 hunks)api/py/ai/chronon/utils.py(1 hunks)cli_readme.md(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- api/py/ai/chronon/repo/init.py
🧰 Additional context used
🪛 LanguageTool
cli_readme.md
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to your GroupBy backfill plan, however, there would still be a column addition ...
(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...
(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set to prod in which c...
(HOWEVER_SENTENCE)
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for a Join, it will run for all of the GroupBy`s included in that join, as we...
(ALL_OF_THE)
🪛 Markdownlint (0.37.0)
cli_readme.md
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
43-43: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
49-49: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/runner.py
7-7: ai.chronon.repo.JOIN_FOLDER_NAME imported but unused
Remove unused import
(F401)
8-8: ai.chronon.repo.GROUP_BY_FOLDER_NAME imported but unused
Remove unused import
(F401)
9-9: ai.chronon.repo.STAGING_QUERY_FOLDER_NAME imported but unused
Remove unused import
(F401)
10-10: ai.chronon.repo.MODEL_FOLDER_NAME imported but unused
Remove unused import
(F401)
api/py/ai/chronon/repo/compilev2.py
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
67-67: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
181-181: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
254-254: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
🔇 Additional comments (6)
cli_readme.md (2)
60-60:⚠️ Potential issueFix incorrect command in usage section
The fetch command usage shows incorrect command.
-### Usage: `zipline upload ENTITY_ID [OPTIONS]` +### Usage: `zipline fetch ENTITY_ID [OPTIONS]`Likely invalid or redundant comment.
🧰 Tools
🪛 Markdownlint (0.37.0)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time(MD001, heading-increment)
50-50: ⚠️ Potential issue: Complete the --ds option description
The --ds option description is incomplete.
```diff
---ds       The date to use for the batch upload. Defaults to
+--ds       The date to use for the batch upload. Defaults to two days prior to today (UTC)
```

Likely invalid or redundant comment.
api/py/ai/chronon/repo/extract_objects.py (2)
43-64: LGTM! Well-structured error handling. Good separation of target file errors and tracking of file origins in results.
82-82: LGTM! Clean and focused implementation.
api/py/ai/chronon/repo/runner.py (2)
14-28: LGTM! Clean directory traversal implementation.
90-93: ⚠️ Potential issue: Add error handling for extract_and_convert.
The function should handle potential errors from extract_and_convert.
```diff
     module_file, compiled_file, chronon_root = get_target_compiled_file(zipline_obj)
-    extract_and_convert(chronon_root, chronon_root, zipline_obj, module_file)
+    try:
+        extract_and_convert(chronon_root, chronon_root, zipline_obj, module_file)
+    except Exception as e:
+        raise ValueError(f"Failed to compile {module_file}: {str(e)}")
     # TODO: actually kick off backfill for compiled_file
```

Likely invalid or redundant comment.
```python
def extract_and_convert(
    chronon_root, output_root, target_object, target_object_file, debug=False
):
```
Add force_overwrite parameter.
The function uses force_overwrite in _write_obj but it's not in the signature.
```diff
 def extract_and_convert(
-    chronon_root, output_root, target_object, target_object_file, debug=False
+    chronon_root, output_root, target_object, target_object_file, debug=False, force_overwrite=False
 ):
```
```python
    module = importlib.import_module(mod_name)
    for var_name, value in vars(module).items():
        if value is obj:
            return mod_name, var_name
```
Add error handling for module import.
```diff
-    module = importlib.import_module(mod_name)
-    for var_name, value in vars(module).items():
-        if value is obj:
-            return mod_name, var_name
+    try:
+        module = importlib.import_module(mod_name)
+        for var_name, value in vars(module).items():
+            if value is obj:
+                return mod_name, var_name
+    except ImportError:
+        return mod_name, None
```
```python
def get_mod_and_var_name_from_gc(obj, mod_prefix):
    # Find the variable name within the module
    mod_name = get_mod_name_from_gc(obj, mod_prefix)
    """Get the variable name that points to the obj in the module"""
    if not mod_name:
        return None

    module = importlib.import_module(mod_name)
    for var_name, value in vars(module).items():
        if value is obj:
            return mod_name, var_name

    return mod_name, None
```
🛠️ Refactor suggestion
Add type hints and proper docstring.
```diff
-def get_mod_and_var_name_from_gc(obj, mod_prefix):
-    # Find the variable name within the module
-    mod_name = get_mod_name_from_gc(obj, mod_prefix)
-    """Get the variable name that points to the obj in the module"""
+def get_mod_and_var_name_from_gc(obj: object, mod_prefix: str) -> tuple[str | None, str | None]:
+    """Get the module and variable name that points to the object.
+
+    Args:
+        obj: Object to find in garbage collector
+        mod_prefix: Module prefix to filter by
+
+    Returns:
+        Tuple of (module_name, variable_name) or (module_name, None) if variable not found
+    """
+    mod_name = get_mod_name_from_gc(obj, mod_prefix)
```
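The namespace-scan pattern under review — walking `vars(module)` to find the name bound to an object by identity — can be demonstrated in isolation (a minimal sketch; `find_var_name` and `demo_mod` are illustrative names, not part of this PR):

```python
import types

def find_var_name(module, obj):
    """Return the first attribute name in module bound to obj (identity match), or None."""
    for var_name, value in vars(module).items():
        if value is obj:
            return var_name
    return None

# Build a throwaway module object rather than importing a real one.
mod = types.ModuleType("demo_mod")
target = object()
mod.my_join = target

print(find_var_name(mod, target))    # my_join
print(find_var_name(mod, object()))  # None
```

Because the match is by identity (`is`), two equal-but-distinct objects never alias each other, which is the behavior the reviewed helper relies on.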
Force-pushed from `03443f8` to `0ddaabc`
Actionable comments posted: 1
🧹 Nitpick comments (9)
cli_readme.md (1)
6-8: Improve readability and clarify environment terminology

```diff
-If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan, however, there would still be a column addition reflected in your `upload` plan.
+If you run `backfill` for the `GroupBy` and then call `plan` again, you would see no change to your `GroupBy` `backfill` plan. However, the column addition would still be reflected in your `upload` plan.
-`plan` is also always run against a particular environment. By default it is the branch which you are authoring on, however, it can also be set to `prod` in which case it will display the plan of your local changes against the prod environment
+`plan` always runs against a particular environment. By default, it runs against your development environment. However, it can also be set to `prod` to display the plan of your local changes against the production environment
```

🧰 Tools
🪛 LanguageTool
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to yourGroupBybackfillplan, however, there would still be a column addition ...(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set toprodin which c...(HOWEVER_SENTENCE)
api/py/ai/chronon/repo/extract_objects.py (1)
43-47: Complete the docstring

```diff
 """
-    Recursively consumes a folder, and constructs a map of
-    object qualifier to StagingQuery, GroupBy, or Join
+    Recursively consumes a folder and constructs a map of object qualifiers to their implementations.
+
+    Args:
+        full_path: Path to the folder to process
+        target_file: Specific file to track errors for
+        cls: Type of objects to extract (StagingQuery, GroupBy, or Join)
+
+    Returns:
+        Tuple of (results dict, errors dict, target file error)
 """
```

api/py/ai/chronon/repo/runner.py (3)
27-28: Chain exceptions for better debugging

```diff
-    raise ValueError(f"Can only run one of {valid_types}, got {type(obj).__name__}")
+    raise ValueError(f"Can only run one of {valid_types}, got {type(obj).__name__}") from None
```

🧰 Tools
🪛 Ruff (0.8.2)
28-28: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
94-95: Implement backfill functionality. The TODO comment indicates missing implementation.
Would you like me to help implement the backfill functionality or create a GitHub issue to track this task?
98-146: Well-documented placeholder functions. Docstrings are comprehensive; functions need implementation.
Would you like me to help implement these functions or create GitHub issues to track their implementation?
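To illustrate the B904 guidance flagged above: `raise ... from err` records the original exception on `__cause__`, so tracebacks show both errors, while `from None` suppresses the chain. A small standalone sketch (not code from this PR):

```python
def lookup(d, key):
    try:
        return d[key]
    except KeyError as err:
        # Chain the original KeyError so tracebacks show both exceptions.
        raise ValueError(f"bad key: {key!r}") from err

try:
    lookup({}, "missing")
except ValueError as e:
    print(type(e.__cause__).__name__)  # KeyError
```

In the reviewed `runner.py` code, `from None` is the suggested form because the original `TypeError`/`KeyError` adds no information for the caller.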
api/py/ai/chronon/repo/compilev2.py (4)
38-38: Remove unused constant. The `DEFAULT_TEAM_NAME` constant is not used anywhere in the code.

```diff
-DEFAULT_TEAM_NAME = "default"
```
55-58: Use ternary operator for brevity.

```diff
-    if debug:
-        log_level = logging.DEBUG
-    else:
-        log_level = logging.INFO
+    log_level = logging.DEBUG if debug else logging.INFO
```

🧰 Tools
🪛 Ruff (0.8.2)
55-58: Use ternary operator
log_level = logging.DEBUG if debug else logging.INFOinstead ofif-else-blockReplace
if-else-block withlog_level = logging.DEBUG if debug else logging.INFO(SIM108)
137-146: Track TODO for ZiplineHub integration. The TODO comment indicates missing ZiplineHub integration for lineage and schema display.
Would you like me to create a GitHub issue to track this task?
257-257: Remove unused variable. The `class_name` variable is assigned but never used.

```diff
-    class_name = obj_class.__name__
```

🧰 Tools
🪛 Ruff (0.8.2)
257-257: Local variable `class_name` is assigned to but never used. Remove assignment to unused variable `class_name` (F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (6)
api/py/ai/chronon/repo/__init__.py(1 hunks)api/py/ai/chronon/repo/compilev2.py(1 hunks)api/py/ai/chronon/repo/extract_objects.py(3 hunks)api/py/ai/chronon/repo/runner.py(1 hunks)api/py/ai/chronon/utils.py(1 hunks)cli_readme.md(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- api/py/ai/chronon/repo/init.py
- api/py/ai/chronon/utils.py
🧰 Additional context used
🪛 LanguageTool
cli_readme.md
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to your GroupBy backfill plan, however, there would still be a column addition ...
(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...
(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set to prod in which c...
(HOWEVER_SENTENCE)
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for a Join, it will run for all of the GroupBy`s included in that join, as we...
(ALL_OF_THE)
🪛 Markdownlint (0.37.0)
cli_readme.md
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
43-43: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
49-49: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/runner.py
28-28: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
api/py/ai/chronon/repo/compilev2.py
55-58: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
70-70: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
184-184: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
257-257: Local variable class_name is assigned to but never used
Remove assignment to unused variable class_name
(F841)
🔇 Additional comments (6)
cli_readme.md (3)
41-41: 🛠️ Refactor suggestion: Add streaming job deployment requirement
```diff
-After calling `Upload` for any `Entity`, you can then call `fetch` to get values.
+After calling `Upload` for any `Entity` and ensuring any required streaming jobs are deployed, you can then call `fetch` to get values.
```

Likely invalid or redundant comment.
60-60: ⚠️ Potential issue: Fix incorrect command in usage section
```diff
-### Usage: `zipline upload ENTITY_ID [OPTIONS]`
+### Usage: `zipline fetch ENTITY_ID [OPTIONS]`
```

Likely invalid or redundant comment.
🧰 Tools
🪛 Markdownlint (0.37.0)
60-60: Expected: h2; Actual: h3. Heading levels should only increment by one level at a time (MD001, heading-increment)
50-50: ⚠️ Potential issue: Complete the --ds option description
```diff
---ds       The date to use for the batch upload. Defaults to
+--ds       The date to use for the batch upload. Defaults to two days prior to today (UTC)
```

Likely invalid or redundant comment.
api/py/ai/chronon/repo/extract_objects.py (2)
Line range hint `25-41`: LGTM! Function simplification improves clarity.
Line range hint
82-101: LGTM!Function simplification improves clarity.
api/py/ai/chronon/repo/compilev2.py (1)
45-47: ⚠️ Potential issue
Add force_overwrite parameter.
The function uses force_overwrite in _write_obj but it's not in the signature.
```diff
-def extract_and_convert(
-    chronon_root, target_object, target_object_file, debug=False, output_root=None
-):
+def extract_and_convert(
+    chronon_root, target_object, target_object_file, debug=False, output_root=None, force_overwrite=False
+):
```
Likely invalid or redundant comment.
| --start start date for which you want data backfilled for this entity. Defaults to the configured start date. | ||
| --end end date for which you want data backfilled for this entity. Defaults to today - 2 days. |
🛠️ Refactor suggestion
Add date format specifications
```diff
---start start date for which you want data backfilled for this entity. Defaults to the configured start date.
---end end date for which you want data backfilled for this entity. Defaults to today - 2 days.
+--start start date (YYYY-MM-DD format) for which you want data backfilled. Defaults to the configured start date.
+--end end date (YYYY-MM-DD format) for which you want data backfilled. Defaults to today - 2 days.
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| --start start date for which you want data backfilled for this entity. Defaults to the configured start date. | |
| --end end date for which you want data backfilled for this entity. Defaults to today - 2 days. | |
| --start start date (YYYY-MM-DD format) for which you want data backfilled. Defaults to the configured start date. | |
| --end end date (YYYY-MM-DD format) for which you want data backfilled. Defaults to today - 2 days. |
Actionable comments posted: 1
🧹 Nitpick comments (2)
api/py/ai/chronon/repo/runner.py (2)
64-81: Consider normalizing paths
Add `os.path.normpath` for consistent path handling across platforms.
```diff
- current_path = os.path.dirname(os.path.abspath(filepath))
+ current_path = os.path.dirname(os.path.normpath(os.path.abspath(filepath)))
```
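As a quick standalone illustration of what `os.path.normpath` adds here (POSIX paths assumed; the path itself is made up):

```python
import os.path

# normpath collapses "." segments, ".." segments, and doubled separators,
# so two spellings of the same location compare equal.
raw = "/repo/./group_bys/../group_bys//team"
clean = os.path.normpath(raw)
assert clean == "/repo/group_bys/team"
```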
104-104: Track TODOs systematically
Consider creating GitHub issues for tracking these TODOs.
Would you like me to create GitHub issues for implementing these functions?
Also applies to: 124-124, 141-141, 160-160
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
api/py/ai/chronon/repo/runner.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/runner.py
27-29: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: no_spark_scala_tests
🔇 Additional comments (2)
api/py/ai/chronon/repo/runner.py (2)
10-62: LGTM! Clean dataclass implementation
Well-structured configuration encapsulation with thorough validation.
🧰 Tools
🪛 Ruff (0.8.2)
27-29: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
101-102: Add error handling for compilation
Verify compilation success before proceeding.
```diff
- extract_and_convert(config_details.chronon_root, zipline_obj, config_details.module)
+ try:
+     extract_and_convert(config_details.chronon_root, zipline_obj, config_details.module)
+ except Exception as e:
+     raise ValueError(f"Failed to compile object: {e}") from e
+
+ if not os.path.exists(config_details.compiled_file):
+     raise ValueError("Compilation failed: compiled file not found")
```
| except StopIteration: | ||
| valid_types = [cls.__name__ for cls in FOLDER_NAME_TO_CLASS.values()] | ||
| raise ValueError( | ||
| f"Can only run one of {valid_types}, got {type(obj).__name__}" | ||
| ) |
🛠️ Refactor suggestion
Add exception chaining
Add from err to preserve traceback:
```diff
- except StopIteration:
+ except StopIteration as err:
      valid_types = [cls.__name__ for cls in FOLDER_NAME_TO_CLASS.values()]
-     raise ValueError(
+     raise ValueError(
          f"Can only run one of {valid_types}, got {type(obj).__name__}"
-     )
+     ) from err
```
📝 Committable suggestion
| except StopIteration: | |
| valid_types = [cls.__name__ for cls in FOLDER_NAME_TO_CLASS.values()] | |
| raise ValueError( | |
| f"Can only run one of {valid_types}, got {type(obj).__name__}" | |
| ) | |
| except StopIteration as err: | |
| valid_types = [cls.__name__ for cls in FOLDER_NAME_TO_CLASS.values()] | |
| raise ValueError( | |
| f"Can only run one of {valid_types}, got {type(obj).__name__}" | |
| ) from err |
🧰 Tools
🪛 Ruff (0.8.2)
27-29: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
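For readers unfamiliar with Ruff's B904 rule, here is a minimal self-contained sketch of the chaining pattern (the `lookup` function is invented for illustration; it only mimics the `FOLDER_NAME_TO_CLASS` lookup in the PR):

```python
def lookup(kind: str) -> str:
    # Toy stand-in for the folder-name-to-class lookup in the PR.
    mapping = {"group_bys": "GroupBy"}
    try:
        return mapping[kind]
    except KeyError as err:
        # `from err` stores the KeyError in __cause__, so the traceback
        # shows both the original error and the friendlier one.
        raise ValueError(f"unsupported kind: {kind!r}") from err

try:
    lookup("joins")
except ValueError as exc:
    assert isinstance(exc.__cause__, KeyError)  # original cause preserved
```

Without `from err`, Python still records the original exception in `__context__`, but the explicit `from` marks the relationship as intentional and satisfies B904.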
Actionable comments posted: 2
🧹 Nitpick comments (1)
api/py/ai/chronon/repo/runner.py (1)
107-160: Add type hints to placeholder functionsAdd return type annotations to improve code clarity.
-def upload(zipline_obj, date, force_recompute=False, plan=False): +def upload(zipline_obj, date, force_recompute=False, plan=False) -> None: -def stream(zipline_obj): +def stream(zipline_obj) -> None: -def info(zipline_obj, branch=None): +def info(zipline_obj, branch=None) -> None:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (2)
api/py/ai/chronon/repo/runner.py(1 hunks)api/py/ai/chronon/utils.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- api/py/ai/chronon/utils.py
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/runner.py
27-29: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: no_spark_scala_tests
🔇 Additional comments (1)
api/py/ai/chronon/repo/runner.py (1)
64-81: LGTM!Well-structured helper function with clear error handling.
| @dataclass | ||
| class ConfigDetails: | ||
| module: str | ||
| variable: str | ||
| file: str | ||
| compiled_file: str | ||
| chronon_root: str | ||
|
|
🛠️ Refactor suggestion
Simplify ConfigDetails implementation
- Remove manual
__init__since@dataclassauto-generates it - Add exception chaining
- Simplify error messages
@dataclass
class ConfigDetails:
module: str
variable: str
file: str
compiled_file: str
chronon_root: str
- def __init__(self, obj):
+ @classmethod
+ def from_obj(cls, obj):
try:
obj_type = next(
k for k, v in FOLDER_NAME_TO_CLASS.items() if isinstance(obj, v)
)
- except StopIteration:
+ except StopIteration as err:
valid_types = [cls.__name__ for cls in FOLDER_NAME_TO_CLASS.values()]
raise ValueError(
f"Can only run one of {valid_types}, got {type(obj).__name__}"
- )
+ ) from err
module, variable = get_mod_and_var_name_from_gc(obj, obj_type)
if module is None or variable is None:
raise ValueError("Could not determine module and variable name for object")
module_obj = importlib.import_module(module)
file = inspect.getmodule(module_obj).__file__
if not file:
- raise ValueError(
- f"""
- Could not determine file location for module {module}, {variable}.\n
- Runner currently only supports working on files saved within a valid Chronon
- root directory.\n Make sure you have your Zipline python files within the
- right directory, and then you can import them to your desired runtime.
- """
- )
+ raise ValueError(f"Could not determine file location for module {module}, {variable}. Ensure file is within Chronon root directory.")
path_parts = module.split(".")
if path_parts[0] != obj_type:
raise ValueError(f"Expected module path to start with '{obj_type}', got {module}")
chronon_root = get_chronon_root(file)
team = path_parts[1]
path = ".".join(path_parts[2:])
compiled_file = f"{chronon_root}/{OUTPUT_ROOT}/{obj_type}/{team}/{path}.{variable}"
+ return cls(module=module, variable=variable, file=file,
+ compiled_file=compiled_file, chronon_root=chronon_root)Also applies to: 18-62
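To make the suggested pattern concrete, here is a generic, self-contained sketch of the `@classmethod` factory (the class and field names are illustrative, not the real `ConfigDetails` fields):

```python
from dataclasses import dataclass

@dataclass
class Details:
    module: str
    variable: str

    @classmethod
    def from_obj(cls, obj) -> "Details":
        # Validation and derivation live here; the auto-generated
        # __init__ stays a plain field assignment.
        module = type(obj).__module__
        variable = type(obj).__name__.lower()
        if not module:
            raise ValueError("could not determine module for object")
        return cls(module=module, variable=variable)

class Demo:
    pass

d = Details.from_obj(Demo())
assert d.variable == "demo"
```

Keeping `__init__` auto-generated means the dataclass stays trivially constructible in tests, while all object-inspection logic is isolated in the factory.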
api/py/ai/chronon/repo/runner.py
Outdated
| def backfill(zipline_obj, start_date, end_date, force_recompute=False, plan=False): | ||
| """ | ||
| Backfills a Chronon object for a specified date range. | ||
| Args: | ||
| zipline_obj: The Chronon object (GroupBy, Join, StagingQuery) to backfill | ||
| start_date: Start date for the backfill period | ||
| end_date: End date for the backfill period | ||
| force_recompute: If True, recomputes data even if it already exists (default: False) | ||
| plan: If True, only shows execution plan without running backfill (default: False) | ||
| Returns: | ||
| None | ||
| Raises: | ||
| ValueError: If the object cannot be compiled or backfilled | ||
| """ | ||
| config_details = ConfigDetails(zipline_obj) | ||
| # module_file, compiled_file, chronon_root = get_target_compiled_file(zipline_obj) | ||
| extract_and_convert(config_details.chronon_root, zipline_obj, config_details.module) | ||
| # TODO: actually kick off backfill for compiled_file | ||
| print("\n\n TODO -- Implement \n\n") | ||
|
|
Add compilation verification and update ConfigDetails usage
- Verify compilation success
- Update to use new factory method
def backfill(zipline_obj, start_date, end_date, force_recompute=False, plan=False):
- config_details = ConfigDetails(zipline_obj)
+ config_details = ConfigDetails.from_obj(zipline_obj)
- extract_and_convert(config_details.chronon_root, zipline_obj, config_details.module)
+ if not extract_and_convert(config_details.chronon_root, zipline_obj, config_details.module):
+ raise ValueError("Compilation failed")
+ if not os.path.exists(config_details.compiled_file):
+ raise ValueError("Compiled file not found")Committable suggestion skipped: line range outside the PR's diff.
bb6c942 to
a88e5fc
Compare
Co-authored-by: Nikhil Simha <[email protected]>
fc7909f to
aa9830b
Compare
Actionable comments posted: 0
♻️ Duplicate comments (3)
api/py/ai/chronon/repo/runner.py (3)
28-32: ⚠️ Potential issue
Add exception chaining to preserve traceback
Include the original exception when raising a new one.
Apply this diff:
```diff
- except StopIteration:
+ except StopIteration as err:
      valid_types = [cls.__name__ for cls in FOLDER_NAME_TO_CLASS.values()]
-     raise ValueError(
+     raise ValueError(
          f"Can only run one of {valid_types}, got {type(obj).__name__}"
-     )
+     ) from err
```
🧰 Tools
🪛 Ruff (0.8.2)
30-32: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
21-66: 🛠️ Refactor suggestion
Refactor `ConfigDetails` using a factory method
Convert `__init__` to a `@classmethod` for clarity. Apply this diff:
@dataclass class ConfigDetails: module: str variable: str file: str compiled_file: str chronon_root: str - def __init__(self, obj): + @classmethod + def from_obj(cls, obj): try: obj_type = next( k for k, v in FOLDER_NAME_TO_CLASS.items() if isinstance(obj, v) ) except StopIteration as err: valid_types = [cls.__name__ for cls in FOLDER_NAME_TO_CLASS.values()] raise ValueError( f"Can only run one of {valid_types}, got {type(obj).__name__}" ) from err # Rest of the code... + return cls( + module=module, + variable=variable, + file=file, + compiled_file=compiled_file, + chronon_root=chronon_root + )🧰 Tools
🪛 Ruff (0.8.2)
30-32: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
144-160: ⚠️ Potential issue
Update docstring to match parameters
The `backfill` method's docstring references `zipline_obj`, which isn't a parameter.
🧹 Nitpick comments (8)
api/py/ai/chronon/repo/extract_objects.py (3)
Line range hint 25-41: Fix incomplete docstring. Docstring ends mid-sentence and lacks return type.
def from_folder(full_path: str, cls: type, log_level=logging.INFO): """ - Recursively consumes a folder, and constructs a map - Creates a map of object qualifier to + Recursively processes Python files in a folder. + + Args: + full_path: Directory path to process + cls: Type of objects to extract + log_level: Logging level + + Returns: + dict: Map of object qualifiers to extracted objects """
44-66: Add return type hints. Missing return type annotation for clarity.
-def from_folderV2(full_path: str, target_file: str, cls: type): +def from_folderV2(full_path: str, target_file: str, cls: type) -> tuple[dict, dict[str, Exception], Exception | None]:
83-86: Use f-strings for better readability. Replace `.format()` with an f-string.
- logger.debug("Loading objects of type {cls} from {file_path}".format(**locals())) + logger.debug(f"Loading objects of type {cls} from {file_path}")cli_readme.md (3)
6-8: Fix usage of 'however' and punctuation
Separate 'however' into new sentences and add commas.
Apply this diff:
-...plan, however, there would still... +...plan. However, there would still... -...on, however, it can also be... +...on. However, it can also be...🧰 Tools
🪛 LanguageTool
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to yourGroupBybackfillplan, however, there would still be a column addition ...(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set toprodin which c...(HOWEVER_SENTENCE)
12-12: Correct heading level
Change heading from H3 to H2 for consistency.
Apply this diff:
-### Usage: `zipline plan [OPTIONS]` +## Usage: `zipline plan [OPTIONS]`🧰 Tools
🪛 Markdownlint (0.37.0)
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time(MD001, heading-increment)
15-15: Specify language in code block
Add language specifier for syntax highlighting.
Apply this diff:
-``` +```bash🧰 Tools
🪛 Markdownlint (0.37.0)
15-15: null
Fenced code blocks should have a language specified(MD040, fenced-code-language)
api/py/ai/chronon/repo/compilev2.py (2)
56-59: Simplify log level assignment
Use a ternary operator for brevity. Apply this diff:
```diff
- if debug:
-     log_level = logging.DEBUG
- else:
-     log_level = logging.INFO
+ log_level = logging.DEBUG if debug else logging.INFO
```
🧰 Tools
🪛 Ruff (0.8.2)
56-59: Use ternary operator `log_level = logging.DEBUG if debug else logging.INFO` instead of if-else block (SIM108)
71-71: Simplify dictionary key check
Use `key in dict` instead of `key in dict.keys()`. Apply this diff:
```diff
- d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS.keys()
+ d for d in os.listdir(chronon_root) if d in FOLDER_NAME_TO_CLASS
```
🧰 Tools
🪛 Ruff (0.8.2)
71-71: Use `key in dict` instead of `key in dict.keys()`. Remove `.keys()` (SIM118)
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (12)
api/py/ai/chronon/group_by.py(2 hunks)api/py/ai/chronon/join.py(3 hunks)api/py/ai/chronon/repo/__init__.py(1 hunks)api/py/ai/chronon/repo/compile.py(1 hunks)api/py/ai/chronon/repo/compilev2.py(1 hunks)api/py/ai/chronon/repo/extract_objects.py(3 hunks)api/py/ai/chronon/repo/hub_uploader.py(1 hunks)api/py/ai/chronon/repo/runner.py(1 hunks)api/py/ai/chronon/repo/serializer.py(2 hunks)api/py/ai/chronon/utils.py(1 hunks)api/thrift/hub.thrift(1 hunks)cli_readme.md(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- api/py/ai/chronon/repo/init.py
- api/py/ai/chronon/repo/compile.py
- api/thrift/hub.thrift
- api/py/ai/chronon/repo/serializer.py
- api/py/ai/chronon/utils.py
- api/py/ai/chronon/group_by.py
- api/py/ai/chronon/repo/hub_uploader.py
🧰 Additional context used
🪛 LanguageTool
cli_readme.md
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to your GroupBy backfill plan, however, there would still be a column addition ...
(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...
(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set to prod in which c...
(HOWEVER_SENTENCE)
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for a Join, it will run for all of the GroupBy`s included in that join, as we...
(ALL_OF_THE)
🪛 Markdownlint (0.37.0)
cli_readme.md
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
43-43: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
60-60: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
49-49: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
66-66: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/runner.py
30-32: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
api/py/ai/chronon/repo/compilev2.py
56-59: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
71-71: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
198-198: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: no_spark_scala_tests
🔇 Additional comments (5)
api/py/ai/chronon/repo/extract_objects.py (1)
Line range hint 83-102: Verify module import error handling. Check if import errors are properly caught and logged.
✅ Verification successful
Import errors are properly handled at caller level
Error handling exists in the calling functions with appropriate logging and error preservation. No changes needed.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for potential import error handling gaps rg -n "importlib\.(machinery|util)" --type pyLength of output: 186
Script:
#!/bin/bash # Check for error handling around imports rg -B 2 -A 2 "try.*:|except.*:|import_module" api/py/ai/chronon/repo/extract_objects.pyLength of output: 1132
api/py/ai/chronon/repo/compilev2.py (1)
264-265: Address commented-out code. Decide whether to implement or remove the `safe_to_overwrite` check.
api/py/ai/chronon/join.py (3)
27-27: LGTM! Clean import of MethodType for dynamic method binding.
Line range hint 645-656: LGTM! Join object creation is well-structured with appropriate parameters.
657-664: Rename `upload` to `deploy` for consistency.
Previous review comment suggested renaming `upload` to `deploy` for consistency. Apply this diff:
```diff
- join.upload = MethodType(deploy, join)
+ join.deploy = MethodType(deploy, join)
```
Actionable comments posted: 0
♻️ Duplicate comments (1)
api/py/ai/chronon/repo/compilev2.py (1)
264-265: ⚠️ Potential issue
Implement the safe_to_overwrite check.
The commented code represents a critical safety check. Either implement it or document why it's safe to remove.
🧹 Nitpick comments (2)
api/py/ai/chronon/repo/compilev2.py (2)
56-59: Simplify log_level assignment. Use ternary operator for better readability.
```diff
- if debug:
-     log_level = logging.DEBUG
- else:
-     log_level = logging.INFO
+ log_level = logging.DEBUG if debug else logging.INFO
```
🧰 Tools
🪛 Ruff (0.8.2)
56-59: Use ternary operator `log_level = logging.DEBUG if debug else logging.INFO` instead of if-else block (SIM108)
54-54: Implement ZiplineHub integration.
The TODO comments indicate missing ZiplineHub integration for lineage and schema functionality.
Would you like me to help implement the ZiplineHub integration or create an issue to track this task?
Also applies to: 143-153
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
api/py/ai/chronon/repo/compilev2.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/py/ai/chronon/repo/compilev2.py
56-59: Use ternary operator log_level = logging.DEBUG if debug else logging.INFO instead of if-else-block
Replace if-else-block with log_level = logging.DEBUG if debug else logging.INFO
(SIM108)
71-71: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
198-198: Use key in dict instead of key in dict.keys()
Remove .keys()
(SIM118)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: no_spark_scala_tests
🔇 Additional comments (2)
api/py/ai/chronon/repo/compilev2.py (2)
1-40: LGTM! The license, imports, and constants are properly defined.
46-48: ⚠️ Potential issue
Add force_overwrite parameter.
Based on usage in _write_obj, the parameter is required.
def extract_and_convert( - chronon_root, target_object, target_object_file, debug=False, output_root=None + chronon_root, target_object, target_object_file, debug=False, output_root=None, force_overwrite=False ):Likely invalid or redundant comment.
f53eeb6 to
12f5d47
Compare
Actionable comments posted: 0
🧹 Nitpick comments (1)
api/py/ai/chronon/group_by.py (1)
605-613: Consider using a helper function for method binding.
The method binding implementation is correct but could be more maintainable.
- # Import locally to avoid circular dependency issue - from ai.chronon.repo.runner import backfill, deploy, info - - # Attach functions directly to group_by - group_by.backfill = MethodType(backfill, group_by) - group_by.deploy = MethodType(deploy, group_by) - group_by.info = MethodType(info, group_by) + def _bind_methods(obj): + # Import locally to avoid circular dependency issue + from ai.chronon.repo.runner import backfill, deploy, info + + methods = {'backfill': backfill, 'deploy': deploy, 'info': info} + for name, func in methods.items(): + setattr(obj, name, MethodType(func, obj)) + + _bind_methods(group_by)
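For context on the `_bind_methods` helper sketched above, binding a plain function to a single instance with `types.MethodType` makes that instance be passed as `self`. A standalone toy version (not the real Chronon classes):

```python
from types import MethodType

def backfill(self, start, end):
    # `self` is whatever instance the function was bound to below.
    return f"backfill {self.name} from {start} to {end}"

class GroupByLike:
    def __init__(self, name):
        self.name = name

gb = GroupByLike("user_metrics")
gb.backfill = MethodType(backfill, gb)  # per-instance bound method
assert gb.backfill("2024-01-01", "2024-01-02") == (
    "backfill user_metrics from 2024-01-01 to 2024-01-02"
)
```

Binding per instance (rather than defining the method on the class) is what lets the repo attach runner functions to already-constructed `GroupBy`/`Join` objects without the classes importing the runner module at definition time.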
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (3)
api/py/ai/chronon/group_by.py(2 hunks)api/py/ai/chronon/join.py(4 hunks)api/py/ai/chronon/repo/compile.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- api/py/ai/chronon/repo/compile.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: no_spark_scala_tests
🔇 Additional comments (5)
api/py/ai/chronon/group_by.py (1)
25-25: LGTM! The import of `MethodType` is correctly placed with other imports.
api/py/ai/chronon/join.py (4)
27-27: LGTM! The import is necessary for the method bindings.
73-79: LGTM! The condition correctly filters out method references from the garbage collector.
Line range hint 653-664: LGTM! The modification allows method binding before returning the Join instance.
665-672: Use consistent method naming: `deploy` instead of `upload`.
Rename `join.upload` to `join.deploy` for consistency.
```diff
- join.upload = MethodType(deploy, join)
+ join.deploy = MethodType(deploy, join)
```
Actionable comments posted: 0
🧹 Nitpick comments (1)
cli_readme.md (1)
74-74: Fix typo.
```diff
-Provides information about a given `Entity`, including upstream/downstream lineage and schema informatio.
+Provides information about a given `Entity`, including upstream/downstream lineage and schema information.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (1)
cli_readme.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
cli_readme.md
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to your GroupBy backfill plan, however, there would still be a column addition ...
(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...
(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set to prod in which c...
(HOWEVER_SENTENCE)
[grammar] ~21-~21: Did you mean “these”?
Context: ...or the specified entity and date range. This produces computed values for the `Entit...
(THIS_NNS_VB)
[style] ~39-~39: Consider removing “of” to be more concise
Context: .... If run for a Join, it will run for all of the GroupBy`s included in that join, as we...
(ALL_OF_THE)
🪛 Markdownlint (0.37.0)
cli_readme.md
12-12: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
23-23: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
45-45: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
62-62: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
76-76: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time
(MD001, heading-increment)
15-15: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
29-29: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
51-51: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
68-68: null
Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: no_spark_scala_tests
🔇 Additional comments (5)
cli_readme.md (5)
6-8: Fix grammatical issues and clarify environment terminology.
```diff
-For example, if you add a column to a `GroupBy` and call `plan`, you'll see that change reflected as a column addition. If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan, however, there would still be a column addition reflected in your `upload` plan.
+For example, if you add a column to a `GroupBy` and call `plan`, you'll see that change reflected as a column addition. If you then run `backfill` for the `GroupBy`, then call `plan` again, you would see no change to your `GroupBy` `backfill` plan. However, there would still be a column addition reflected in your `upload` plan.
-`plan` is also always run against a particular environment. By default it is the branch which you are authoring on, however, it can also be set to `prod` in which case it will display the plan of your local changes against the prod environment
+`plan` is always run against a particular environment. By default, it runs against your current development branch. However, it can also be set to `prod`, in which case it will display the plan of your local changes against the production environment
```
🪛 LanguageTool
[typographical] ~6-~6: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...change to yourGroupBybackfillplan, however, there would still be a column addition ...(HOWEVER_SENTENCE)
[uncategorized] ~8-~8: Did you mean: “By default,”?
Context: ...s run against a particular environment. By default it is the branch which you are authorin...(BY_DEFAULT_COMMA)
[typographical] ~8-~8: The word “however” is an adverb that can’t be used like a conjunction, and therefore needs to be separated from the sentence.
Context: ...is the branch which you are authoring on, however, it can also be set toprodin which c...(HOWEVER_SENTENCE)
15-17: Add language specifier to code block.-``` +```bash --branch the branch against which to compare the local state. Can be set to `prod` or a different `branch_id`. Defaults to the current branch if one is set, else `prod`.<details> <summary>🧰 Tools</summary> <details> <summary>🪛 Markdownlint (0.37.0)</summary> 15-15: null Fenced code blocks should have a language specified (MD040, fenced-code-language) </details> </details> --- `31-32`: **Add date format specifications.** ```diff ---start start date for which you want data backfilled for this entity. Defaults to the configured start date. ---end end date for which you want data backfilled for this entity. Defaults to today - 2 days. +--start start date (YYYY-MM-DD format) for which you want data backfilled. Defaults to the configured start date. +--end end date (YYYY-MM-DD format) for which you want data backfilled. Defaults to today - 2 days.
52-52: Complete the --ds option description.
```diff
---ds The date to use for the batch upload. Defaults to
+--ds The date to use for the batch upload. Defaults to two days prior to today (UTC)
```
62-70: Fix command usage and add examples.
````diff
-### Usage: `zipline upload ENTITY_ID [OPTIONS]`
+### Usage: `zipline fetch ENTITY_ID [OPTIONS]`
+Example:
+```bash
+zipline fetch user_metrics --keys '{"user_id": "12345"}'
+```
````
🪛 Markdownlint (0.37.0)
62-62: Expected: h2; Actual: h3
Heading levels should only increment by one level at a time (MD001, heading-increment)

68-68: null

Fenced code blocks should have a language specified (MD040, fenced-code-language)
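The suggested `--keys '{"user_id": "12345"}'` invocation implies the CLI parses the option as a JSON map; a small sketch of that glue using `argparse` (names are illustrative, not Zipline's actual parser):

```python
import argparse
import json


def parse_fetch_args(argv):
    """Parse a hypothetical `zipline fetch` invocation."""
    parser = argparse.ArgumentParser(prog="zipline fetch")
    parser.add_argument("entity_id", help="name of the entity to fetch")
    parser.add_argument(
        "--keys",
        type=json.loads,  # parse the JSON string into a dict of key column -> value
        default={},
        help="JSON map of key column to value",
    )
    return parser.parse_args(argv)
```

For example, `parse_fetch_args(["user_metrics", "--keys", '{"user_id": "12345"}'])` yields a namespace whose `keys` attribute is a plain dict.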
nikhil-zlai
left a comment
Please address the hashing comment:
https://github.com/zipline-ai/chronon/pull/113/files#r1912552717
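For context on the hashing concern: a common approach for diffing a local repo against a remote state is to fingerprint each compiled config with a hash of its canonical JSON, then compare fingerprints. A sketch under that assumption — not necessarily the PR's actual implementation:

```python
import hashlib
import json


def config_hash(conf: dict) -> str:
    """Stable fingerprint: canonical JSON (sorted keys, fixed separators) -> SHA-256."""
    canonical = json.dumps(conf, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff(local: dict, remote: dict) -> dict:
    """Entities whose local hash differs from (or is absent in) the remote state."""
    return {name: h for name, h in local.items() if remote.get(name) != h}
```

Canonicalizing before hashing matters: two dicts with the same content but different key order must produce the same fingerprint, or every recompile would look like a change.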
## Summary

- Initial implementation of Python `compile` -- compiles the whole repo
- Placeholder implementation of `backfill` -- next step will be to wire up to the orchestrator for implementation
- Documentation for the broader user API surface area and how it should behave (should maybe move this to pydocs)

Currently, the interaction looks like this:

<img width="729" alt="image" src="https://github.com/user-attachments/assets/a52cf582-4394-449a-8567-32c0f108d2ed" />

Next steps:

- Connect with ZiplineHub for lineage, and backfill execution

## Checklist

- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [x] Documentation update

## Summary by CodeRabbit

- **Documentation**
  - Expanded CLI documentation with detailed descriptions of `plan`, `backfill`, `deploy`, and `fetch` commands.
  - Added comprehensive usage instructions and examples for each command.
- **New Features**
  - Enhanced repository management functions for Chronon objects.
  - Added methods for compiling, uploading, and managing configuration details.
  - Introduced new utility functions for module and variable name retrieval.
  - Added functionality to compute and upload differences in a local repository.
  - New function to convert JSON to Thrift binary format.
- **Improvements**
  - Streamlined function signatures in extraction and compilation modules.
  - Added error handling and logging improvements.
  - Enhanced object serialization and diff computation capabilities.
  - Dynamic method bindings for `backfill`, `deploy`, and `info` functions in `GroupBy` and `Join`.

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: Nikhil Simha <[email protected]>
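The "dynamic method bindings" item can be pictured as attaching `backfill`/`deploy`/`info` to the entity classes at import time rather than defining them by hand on each class. A hedged sketch — class and helper names are invented for illustration, not the PR's actual code:

```python
class GroupBy:
    """Stand-in for the real Chronon GroupBy entity."""

    def __init__(self, name: str):
        self.name = name


def _make_action(verb: str):
    """Build a method that would, in a real implementation, call the orchestrator."""

    def action(self, **kwargs):
        # Placeholder body: return a description instead of launching a job.
        return f"{verb}({self.name})"

    action.__name__ = verb
    return action


# Bind the three verbs dynamically, as the summary describes for GroupBy and Join.
for verb in ("backfill", "deploy", "info"):
    setattr(GroupBy, verb, _make_action(verb))
```

The appeal of this pattern is that the verb list lives in one place, so adding a new action (or a new entity class) does not require touching every class body.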