Skip to content

Conversation

@varant-zlai
Copy link
Collaborator

@varant-zlai varant-zlai commented Jan 16, 2025

Summary

WIP input/out schema for the Job tracker API on ZiplineHub

Checklist

  • Added Unit Tests
  • Covered by existing CI
  • Integration tested
  • Documentation update

Summary by CodeRabbit

  • New Features

    • Introduced a comprehensive job tracking system with support for tracking job execution, status, and metadata.
    • Added a new API endpoint for handling job type requests.
    • Defined structures for job tracking requests and responses, including job metadata.
    • Introduced enumerations for job processing direction, execution modes, and statuses.
    • Added a new handler for processing job tracking requests.
  • Bug Fixes

    • Corrected a typographical error in a comment within the orchestration Thrift file.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 16, 2025

Walkthrough

The pull request introduces a new Thrift file hub.thrift, defining data structures and enumerations for job tracking and lineage management. Key components include LineageRequest, LineageResponse, JobTrackerRequest, and JobTrackerResponse. A minor typographical error was corrected in orchestration.thrift. The API functionality is enhanced with a new route in HubVerticle and a new method in JobTracker.

Changes

File Change Summary
api/thrift/hub.thrift Added structs and enums for job tracking and lineage: LineageRequest, LineageResponse, JobTrackerRequest, JobTrackerResponse, SubmissionsRequest, SubmissionsResponse, Direction, TaskInfo, DateRange, TaskArgs, TaskResources, Mode, Status, and Submission
api/thrift/orchestration.thrift Corrected comment typo from "phsical" to "physical"
hub/src/main/java/ai/chronon/hub/HubVerticle.java Added new route: router.get("/api/v1/:name/job/type/:type") for job type handling
hub/src/main/scala/ai/chronon/hub/handlers/JobTracker.scala Added method: def handle(req: JobTrackerRequest): JobTrackerResponse in JobTracker

Sequence Diagram

sequenceDiagram
    participant Client
    participant HubVerticle
    participant JobTracker
    Client->>HubVerticle: Request Job by Type
    HubVerticle->>JobTracker: JobTrackerRequest
    JobTracker-->>HubVerticle: JobTrackerResponse
    HubVerticle-->>Client: Response with Job Details
Loading

Poem

🌟 New structs and enums, a grand design,
Job tracking magic, all in a line!
From requests to responses, we pave the way,
In the world of Thrift, we joyfully play! 🎉
Code flows like rivers, clear and bright,
Celebrating changes, what a delight! ✨

Warning

Review ran into problems

🔥 Problems

GitHub Actions: Resource not accessible by integration - https://docs.github.com/rest/actions/workflow-runs#list-workflow-runs-for-a-repository.

Please grant the required permissions to the CodeRabbit GitHub App under the organization or repository settings.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between d80c618 and df094c9.

📒 Files selected for processing (1)
  • hub/src/main/scala/ai/chronon/hub/handlers/JobTracker.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • hub/src/main/scala/ai/chronon/hub/handlers/JobTracker.scala
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: scala_compile_fmt_fix

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.


// Job corresponds to top-level runs for the Entity, can be either user-run adhoc (via CLI/python API) or scheduled
// We should move this to orchestration.thrift if we keep it
struct Job {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some question here around how to handle overlapping jobs for a given day. Probably just take the latest for each day is good enough for now is my current thought.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
api/thrift/hub.thrift (1)

1-8: Remove extra blank line.

 include "orchestration.thrift"
-
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between d8d1e08 and ad976b7.

📒 Files selected for processing (2)
  • api/thrift/hub.thrift (1 hunks)
  • api/thrift/orchestration.thrift (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • api/thrift/orchestration.thrift
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: no_spark_scala_tests
🔇 Additional comments (4)
api/thrift/hub.thrift (4)

30-35: LGTM! Clean and well-structured response type.


37-41: LGTM! Comprehensive direction options.


62-65: LGTM! Clear mode definitions.


50-60: 🛠️ Refactor suggestion

Address structural concerns.

  1. Decide on struct location (see comment)
  2. Use i64 for dates
  3. Document null handling for endTime
-    2: required string startDate
-    3: required string endDate
+    2: required i64 startTimestamp
+    3: required i64 endTimestamp

Comment on lines 23 to 36
struct JobTrackerRequest {
1: required orchestration.NodeKey node
2: required Direction direction
3: optional string startDate // Do we need a way to specify a range, in case of very large jobs?
4: optional string endDate
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use i64 timestamps instead of string dates.

String dates are error-prone. Consider:

-   3: optional string startDate
-   4: optional string endDate
+   3: optional i64 startTimestamp
+   4: optional i64 endTimestamp
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
struct JobTrackerRequest {
1: required orchestration.NodeKey node
2: required Direction direction
3: optional string startDate // Do we need a way to specify a range, in case of very large jobs?
4: optional string endDate
}
struct JobTrackerRequest {
1: required orchestration.NodeKey node
2: required Direction direction
3: optional i64 startTimestamp
4: optional i64 endTimestamp
}

Comment on lines 43 to 81
struct StepRange {
1: required string start
2: required string end
3: required Status status
4: required string logLocation // Or maybe download logs is a different API?
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refine date handling and clarify log strategy.

  1. Use timestamps for consistency
  2. Resolve the log API question in comment
-   1: required string start
-   2: required string end
+   1: required i64 startTimestamp
+   2: required i64 endTimestamp
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
struct StepRange {
1: required string start
2: required string end
3: required Status status
4: required string logLocation // Or maybe download logs is a different API?
}
struct StepRange {
1: required i64 startTimestamp
2: required i64 endTimestamp
3: required Status status
4: required string logLocation // Or maybe download logs is a different API?
}

@ken-zlai ken-zlai requested a review from nikhil-zlai January 17, 2025 21:07
@varant-zlai varant-zlai force-pushed the vz--create_backfill_tracker_thrift_apis branch from ad976b7 to ea48018 Compare January 17, 2025 22:18
1: optional string name
2: optional string type // physical type (limited to backfill or batch upload)
3: required Direction direction
4: optional string startDate // Do we need a way to specify a range, in case of very large jobs?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start these from 10 - so we can insert non time related options in the request later.

*/
struct JobTrackerRequest {
1: optional string name
2: optional string type // physical type (limited to backfill or batch upload)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use enum

Comment on lines 47 to 48
3: required Status status
4: required string logLocation // Or maybe download logs is a different API?
Copy link
Contributor

@nikhil-zlai nikhil-zlai Jan 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we have a struct

NodeInfo {
  1: optional string name
  2: optional string version  
  3: optional string physicalType
  
  10: optional list<TaskInfo> tasks // very old job infos might be be pruned 
  11: optional List<DateRange> availablePartitions // specific to batch nodes
}

// general purpose
TaskInfo {
  1: Status status
  2: string logPath
  3: string trackerUrl
  4: TaskArgs args
  5: DateRange dateRange // specific to batch nodes
   
  // time information - useful for gantt / waterfall view
  10: long submittedTs
  11: long startedTs
  12: long finishedTs

  20: string user
  21: string team

   // utilization information
  30: TaskResources allocatedResources
  31: TaskResources utilizedResources
}

TaskArgs {  
  1: list<string> args
  2: map<string, string> env
}

TaskResources {
  1: optional long vcoreSeconds
  2: optional long megaByteSeconds
  3: optional long cumulativeDiskWriteBytes
  4: optional long cumulativeDiskReadBytes
  
  // networking, gpu etc
 }

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (5)
api/thrift/hub.thrift (5)

25-25: 🛠️ Refactor suggestion

Convert type to enum.

Define an enum for job types (backfill/batch upload) instead of using string.


27-28: 🛠️ Refactor suggestion

Use i64 timestamps.

String dates are error-prone. Use timestamps.

-   4: optional string startDate
-   5: optional string endDate
+   4: optional i64 startTimestamp
+   5: optional i64 endTimestamp

45-46: 🛠️ Refactor suggestion

Use i64 timestamps in StepRange.

Maintain consistency with timestamp usage.

-   1: required string startDate
-   2: required string endDate
+   1: required i64 startTimestamp
+   2: required i64 endTimestamp

55-56: 🛠️ Refactor suggestion

Use i64 timestamps in Job struct.

Maintain consistency with timestamp usage.

-   2: required string startDate
-   3: required string endDate
+   2: required i64 startTimestamp
+   3: required i64 endTimestamp

74-75: 🛠️ Refactor suggestion

Add missing comma after UPSTREAM_FAILED.

-   UPSTREAM_FAILED = 5,
+   UPSTREAM_FAILED = 5,
🧹 Nitpick comments (2)
api/thrift/hub.thrift (2)

51-53: Move Job struct to orchestration.thrift.

Consider moving as suggested in the comment for better organization.


63-66: Start Mode enum values from 10.

Follow same convention as Direction enum.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between ea48018 and 403411e.

📒 Files selected for processing (2)
  • api/thrift/hub.thrift (1 hunks)
  • hub/src/main/java/ai/chronon/hub/HubVerticle.java (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • hub/src/main/java/ai/chronon/hub/HubVerticle.java
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: no_spark_scala_tests
🔇 Additional comments (4)
api/thrift/hub.thrift (4)

1-6: LGTM! Proper namespace setup and includes.


31-36: LGTM! Clear and well-structured response type.


38-42: Start enum values from 10.

Per team convention, start from 10 to allow for future non-directional options.

 enum Direction {
-   UPSTREAM = 0,
-   DOWNSTREAM = 1,
-   BOTH = 2
+   UPSTREAM = 10,
+   DOWNSTREAM = 11,
+   BOTH = 12
 }

48-48: Clarify log API strategy.

Decide whether log retrieval should be a separate API endpoint.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
api/thrift/hub.thrift (1)

85-88: 🛠️ Refactor suggestion

Use i64 timestamps instead of string dates.

String dates are error-prone and harder to validate.

struct DateRange {
-    1: string startDate
-    2: string endDate
+    1: i64 startTimestamp
+    2: i64 endTimestamp
}
🧹 Nitpick comments (4)
api/thrift/hub.thrift (4)

65-83: Consider making timestamps optional.

Not all tasks will have startedTs and finishedTs immediately.

-    11: required i64 startedTs
-    12: required i64 finishedTs
+    11: optional i64 startedTs
+    12: optional i64 finishedTs

46-46: Document the task ordering strategy.

Add a comment clarifying how overlapping tasks are handled.

-   1: required list<TaskInfo> tasks // Date ranges can overlap for tasks (reruns, retries etc). Need to render latest per day.
+   1: required list<TaskInfo> tasks // Tasks are ordered by submittedTs desc, with latest task per day shown in UI

90-93: Make args and env required fields.

TaskArgs fields should be non-optional to ensure complete task information.

struct TaskArgs {
-  1: list<string> args
-  2: map<string, string> env
+  1: required list<string> args
+  2: required map<string, string> env
}

119-123: Consider adding status field to Submission.

Status would help track submission progress.

struct Submission {
+    22: optional Status status
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 403411e and 9e10d66.

📒 Files selected for processing (2)
  • api/thrift/hub.thrift (1 hunks)
  • api/thrift/orchestration.thrift (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • api/thrift/orchestration.thrift
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: no_spark_scala_tests
  • GitHub Check: scala_compile_fmt_fix
🔇 Additional comments (2)
api/thrift/hub.thrift (2)

1-6: LGTM!

Namespaces and includes are properly defined.


108-116: ⚠️ Potential issue

Add missing comma after UPSTREAM_FAILED.

   RUNNING = 2,
   SUCCESS = 3,
   FAILED = 4,
-   UPSTREAM_FAILED = 5
+   UPSTREAM_FAILED = 5,
   UPSTREAM_MISSING = 6

Likely invalid or redundant comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
api/thrift/hub.thrift (1)

90-90: Fix indentation.

Align closing brace with struct definition.

-  }
+ }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between 9e10d66 and 81cc622.

📒 Files selected for processing (1)
  • api/thrift/hub.thrift (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: no_spark_scala_tests
🔇 Additional comments (2)
api/thrift/hub.thrift (2)

76-77: Use i64 timestamps.

Convert date strings to timestamps for consistency and reliability.

-    1: string startDate
-    2: string endDate
+    1: i64 startTimestamp
+    2: i64 endTimestamp

104-105: Add missing comma.

Fix syntax error in enum definition.

   UPSTREAM_FAILED = 5,
   UPSTREAM_MISSING = 6

Comment on lines 63 to 65
10: required i64 submittedTs
11: required i64 startedTs
12: required i64 finishedTs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Make timestamps optional.

startedTs and finishedTs won't be available for pending tasks.

-    11: required i64 startedTs
-    12: required i64 finishedTs
+    11: optional i64 startedTs
+    12: optional i64 finishedTs
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
10: required i64 submittedTs
11: required i64 startedTs
12: required i64 finishedTs
10: required i64 submittedTs
11: optional i64 startedTs
12: optional i64 finishedTs

// If the page was accessed via a "submission" link, then we also render the submission range
struct LineageRequest {
1: required string name
2: required string type // physical type (limited to backfill or batch upload)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Convert type to enum.

Replace string type with enum to prevent errors and improve type safety.

-    2: required string type // physical type (limited to backfill or batch upload)
+    2: required PhysicalType type

+enum PhysicalType {
+    BACKFILL = 0,
+    BATCH_UPLOAD = 1
+}

Committable suggestion skipped: line range outside the PR's diff.

2: required orchestration.NodeKey mainNode // Same as the node in the LineageRequest
}

struct JobTrackerRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
struct JobTrackerRequest {
struct JobTrackerRequest {

Should we just call this a Submission? Then the vocabulary becomes - tracking a submission.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline, sometimes there's no associated submission

Comment on lines 56 to 76
1: required Status status
2: required string logPath
3: required string trackerUrl
4: required TaskArgs args
5: required DateRange dateRange // specific to batch nodes

// time information - useful for gantt / waterfall view
10: required i64 submittedTs
11: required i64 startedTs
12: required i64 finishedTs

20: required string user
21: required string team

// utilization information
30: required TaskResources allocatedResources
31: required TaskResources utilizedResources
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these should be optional instead of required

Comment on lines 75 to 86
struct DateRange {
1: string startDate
2: string endDate
}

struct TaskArgs {
1: list<string> args
2: map<string, string> env
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add optional to all fields - it is a convention in thrift to do so - to make future changes easy to ship.

Comment on lines 109 to 116
1: required orchestration.NodeKey node
10: required i64 submittedTs
20: optional i64 finishedTs
21: optional DateRange dateRange
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional everywhere please.


def handle(req: JobTrackerRequest): JobTrackerResponse = {
val result = new JobTrackerResponse()
result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put a todo here

Comment on lines 110 to 115
10: required i64 submittedTs
20: optional i64 finishedTs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would split these out into submissionInfo - at the time of submission we won't finishedTs etc.

@varant-zlai varant-zlai force-pushed the vz--create_backfill_tracker_thrift_apis branch from b98c268 to 3a53226 Compare January 20, 2025 18:31
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
api/thrift/hub.thrift (2)

79-80: 🛠️ Refactor suggestion

Use i64 timestamps instead of string dates.

-    1: string startDate
-    2: string endDate
+    1: i64 startTimestamp
+    2: i64 endTimestamp

101-109: ⚠️ Potential issue

Add missing comma after UPSTREAM_FAILED.

   RUNNING = 2,
   SUCCESS = 3,
   FAILED = 4,
-   UPSTREAM_FAILED = 5
+   UPSTREAM_FAILED = 5,
   UPSTREAM_MISSING = 6
🧹 Nitpick comments (2)
api/thrift/hub.thrift (2)

39-39: Document task deduplication strategy.

Add comment explaining how latest task per day is selected for overlapping ranges.


111-116: Consider adding status field to Submission.

Track submission state for better user feedback.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)

📥 Commits

Reviewing files that changed from the base of the PR and between a96f830 and 3a53226.

📒 Files selected for processing (2)
  • api/thrift/hub.thrift (1 hunks)
  • hub/src/main/scala/ai/chronon/hub/handlers/JobTracker.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • hub/src/main/scala/ai/chronon/hub/handlers/JobTracker.scala
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: scala_compile_fmt_fix
  • GitHub Check: no_spark_scala_tests
🔇 Additional comments (2)
api/thrift/hub.thrift (2)

58-76: TaskInfo structure looks good.

Well-structured with optional fields and comprehensive metadata.


19-22: 🛠️ Refactor suggestion

Convert type to enum for type safety.

-    2: optional string type // physical type (limited to backfill or batch upload)
+    2: optional PhysicalType type

+enum PhysicalType {
+    BACKFILL = 0,
+    BATCH_UPLOAD = 1
+}

Likely invalid or redundant comment.

Comment on lines +32 to +35
1: optional string name
2: optional string type
3: optional string branch
10: optional DateRange dateRange // We may not need to use this, but in case it helps with page load times
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Make type field consistent with LineageRequest.

Use the same PhysicalType enum here.

@varant-zlai varant-zlai force-pushed the vz--create_backfill_tracker_thrift_apis branch from 3a53226 to a591b53 Compare January 21, 2025 16:00
Copy link
Contributor

@ken-zlai ken-zlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a few notes from our discussion

SUCCESS = 3,
FAILED = 4,
UPSTREAM_FAILED = 5,
UPSTREAM_MISSING = 6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo add QUEUED here

31: optional TaskResources utilizedResources
}

struct DateRange {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format is yyyy-MM-dd. In the UI we treat this as it comes in, no adjusting for time zones.

BOTH = 2
}

struct TaskInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything with Ts is an epoch time in MS

ezvz and others added 9 commits January 22, 2025 14:50
@varant-zlai varant-zlai force-pushed the vz--create_backfill_tracker_thrift_apis branch from a591b53 to 9dc144a Compare January 22, 2025 19:50
@varant-zlai varant-zlai merged commit 9fb0e60 into main Jan 23, 2025
5 checks passed
@varant-zlai varant-zlai deleted the vz--create_backfill_tracker_thrift_apis branch January 23, 2025 03:39
@coderabbitai coderabbitai bot mentioned this pull request Feb 7, 2025
4 tasks
kumar-zlai pushed a commit that referenced this pull request Apr 25, 2025
## Summary

Input/out schema for the Job tracker and lineage API on ZiplineHub. 

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive job tracking system with support for
tracking job execution, status, and metadata.
	- Added a new API endpoint for handling job type requests.
- Defined structures for job tracking requests and responses, including
job metadata.
- Introduced enumerations for job processing direction, execution modes,
and statuses.
	- Added a new handler for processing job tracking requests.

- **Bug Fixes**
- Corrected a typographical error in a comment within the orchestration
Thrift file.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
kumar-zlai pushed a commit that referenced this pull request Apr 29, 2025
## Summary

Input/out schema for the Job tracker and lineage API on ZiplineHub. 

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive job tracking system with support for
tracking job execution, status, and metadata.
	- Added a new API endpoint for handling job type requests.
- Defined structures for job tracking requests and responses, including
job metadata.
- Introduced enumerations for job processing direction, execution modes,
and statuses.
	- Added a new handler for processing job tracking requests.

- **Bug Fixes**
- Corrected a typographical error in a comment within the orchestration
Thrift file.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
chewy-zlai pushed a commit that referenced this pull request May 15, 2025
## Summary

Input/out schema for the Job tracker and lineage API on ZiplineHub. 

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive job tracking system with support for
tracking job execution, status, and metadata.
	- Added a new API endpoint for handling job type requests.
- Defined structures for job tracking requests and responses, including
job metadata.
- Introduced enumerations for job processing direction, execution modes,
and statuses.
	- Added a new handler for processing job tracking requests.

- **Bug Fixes**
- Corrected a typographical error in a comment within the orchestration
Thrift file.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
chewy-zlai pushed a commit that referenced this pull request May 15, 2025
## Summary

Input/out schema for the Job tracker and lineage API on ZiplineHub. 

## Checklist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive job tracking system with support for
tracking job execution, status, and metadata.
	- Added a new API endpoint for handling job type requests.
- Defined structures for job tracking requests and responses, including
job metadata.
- Introduced enumerations for job processing direction, execution modes,
and statuses.
	- Added a new handler for processing job tracking requests.

- **Bug Fixes**
- Corrected a typographical error in a comment within the orchestration
Thrift file.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
chewy-zlai pushed a commit that referenced this pull request May 16, 2025
## Summary

Input/out schema for the Job traour clientser and lineage API on ZiplineHub. 

## Cheour clientslist
- [ ] Added Unit Tests
- [x] Covered by existing CI
- [ ] Integration tested
- [ ] Documentation update


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Introduced a comprehensive job traour clientsing system with support for
traour clientsing job execution, status, and metadata.
	- Added a new API endpoint for handling job type requests.
- Defined structures for job traour clientsing requests and responses, including
job metadata.
- Introduced enumerations for job processing direction, execution modes,
and statuses.
	- Added a new handler for processing job traour clientsing requests.

- **Bug Fixes**
- Corrected a typographical error in a comment within the orchestration
Thrift file.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ezvz <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants