Skip to content

Commit ee923f4

Browse files
varant-zlaiezvzcoderabbitai[bot]
authored
WIP -- create backfill tracker thrift apis (#224)
## Summary Input/out schema for the Job tracker and lineage API on ZiplineHub. ## Checklist - [ ] Added Unit Tests - [x] Covered by existing CI - [ ] Integration tested - [ ] Documentation update <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a comprehensive job tracking system with support for tracking job execution, status, and metadata. - Added a new API endpoint for handling job type requests. - Defined structures for job tracking requests and responses, including job metadata. - Introduced enumerations for job processing direction, execution modes, and statuses. - Added a new handler for processing job tracking requests. - **Bug Fixes** - Corrected a typographical error in a comment within the orchestration Thrift file. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: ezvz <[email protected]> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
1 parent b17d974 commit ee923f4

File tree

2 files changed

+119
-1
lines changed

2 files changed

+119
-1
lines changed

api/thrift/hub.thrift

Lines changed: 117 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,117 @@
1+
namespace py ai.chronon.hub
2+
namespace java ai.chronon.hub
3+
4+
include "common.thrift"
5+
include "api.thrift"
6+
include "orchestration.thrift"
7+
8+
9+
/*
10+
GroupBy APIs
11+
*/
12+
13+
14+
// For an entity-job page, we first call Lineage to get the node graph
15+
// Then we call JobTrackerRequest on the first level of nodes to render the tasks for the landing page quickly
16+
// Then we traverse the graph and load the rest of the tasks and statuses
17+
// If the page was accessed via a "submission" link, then we also render the submission range
18+
struct LineageRequest {
19+
1: optional string name
20+
2: optional string type // physical type (limited to backfill or batch upload)
21+
3: optional string branch
22+
4: optional Direction direction
23+
24+
}
25+
26+
struct LineageResponse {
27+
1: optional orchestration.NodeGraph nodeGraph
28+
2: optional orchestration.NodeKey mainNode // Same as the node in the LineageRequest
29+
}
30+
31+
struct JobTrackerRequest {
32+
1: optional string name
33+
2: optional string type
34+
3: optional string branch
35+
10: optional DateRange dateRange // We may not need to use this, but in case it helps with page load times
36+
}
37+
38+
struct JobTrackerResponse {
39+
1: optional list<TaskInfo> tasks // Date ranges can overlap for tasks (reruns, retries etc). Need to render latest per day.
40+
2: optional orchestration.NodeKey mainNode // Same as the node in the JobTrackerRequest
41+
}
42+
43+
// Submissions are used to render user's recent jobs on their homepage
44+
struct SubmissionsRequest {
45+
1: optional string user
46+
}
47+
48+
struct SubmissionsResponse {
49+
1: optional list<Submission> submissions
50+
}
51+
52+
enum Direction {
53+
UPSTREAM = 0,
54+
DOWNSTREAM = 1,
55+
BOTH = 2
56+
}
57+
58+
struct TaskInfo {
59+
1: optional Status status
60+
2: optional string logPath
61+
3: optional string trackerUrl
62+
4: optional TaskArgs taskArgs
63+
5: optional DateRange dateRange // specific to batch nodes
64+
65+
// time information - useful for gantt / waterfall view
66+
10: optional i64 submittedTs
67+
11: optional i64 startedTs
68+
12: optional i64 finishedTs
69+
70+
20: optional string user
71+
21: optional string team
72+
73+
// utilization information
74+
30: optional TaskResources allocatedResources
75+
31: optional TaskResources utilizedResources
76+
}
77+
78+
struct DateRange {
79+
1: string startDate
80+
2: string endDate
81+
}
82+
83+
struct TaskArgs {
84+
1: optional list<string> argsList
85+
2: optional map<string, string> env
86+
}
87+
88+
struct TaskResources {
89+
1: optional i64 vcoreSeconds
90+
2: optional i64 megaByteSeconds
91+
3: optional i64 cumulativeDiskWriteBytes
92+
4: optional i64 cumulativeDiskReadBytes
93+
}
94+
95+
96+
enum Mode {
97+
ADHOC = 0,
98+
SCHEDULED = 1
99+
}
100+
101+
enum Status {
102+
WAITING_FOR_UPSTREAM = 0,
103+
WAITING_FOR_RESOURCES = 1,
104+
QUEUED = 2,
105+
RUNNING = 3,
106+
SUCCESS = 4,
107+
FAILED = 5,
108+
UPSTREAM_FAILED = 6,
109+
UPSTREAM_MISSING = 7
110+
}
111+
112+
struct Submission {
113+
1: optional orchestration.NodeKey node
114+
10: optional i64 submittedTs
115+
20: optional i64 finishedTs
116+
21: optional DateRange dateRange
117+
}

api/thrift/orchestration.thrift

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -145,7 +145,8 @@ union PhysicalNodeType {
145145
4: ModelNodeType modelNodeType
146146
5: TableNodeType tableNodeType
147147
}
148-
// ====================== End of phsical node types ======================
148+
149+
// ====================== End of physical node types ======================
149150

150151
/**
151152
* Multiple logical nodes could share the same physical node

0 commit comments

Comments
 (0)