Crawler for Externally Orchestrated Jobs with Failing Configuration #395
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #395      +/-   ##
==========================================
- Coverage   85.67%   82.61%   -3.07%
==========================================
  Files          42       30      -12
  Lines        5311     2490    -2821
  Branches      969      445     -524
==========================================
- Hits         4550     2057    -2493
+ Misses        542      326     -216
+ Partials      219      107     -112
Needs a passing integration test. Please attach a screenshot once you get it working locally.
@@ -57,6 +61,14 @@ class PipelineInfo:
    failures: str


@dataclass
Do we want to capture only the failing ones, or all of them with the failing ones annotated?
def is_custom_image(version_string: str):
    """
    Is this a custom version?
    """
Does it require implementation?
    pattern = r"^(?P<major>\d+)?\.(?P<minor>\d+)?\.(?P<patch>[\dx]+)?.*"
    lvg = re.match(pattern, left_version)
    rvg = re.match(pattern, right_version)
    left = (int(lvg.group("major")), int(lvg.group("minor")))
Do we have a proper unit test for that?
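As a starting point for such a test, here is a minimal sketch. It reuses the regex from the hunk above, but `parse_major_minor` is a hypothetical helper, not a function in this PR, and returning `None` for an unparseable string is an assumption about the desired behavior. The sample version strings are illustrative.

```python
import re
from typing import Optional

# Pattern copied verbatim from the hunk under review.
PATTERN = r"^(?P<major>\d+)?\.(?P<minor>\d+)?\.(?P<patch>[\dx]+)?.*"


def parse_major_minor(version: str) -> Optional[tuple[int, int]]:
    """Hypothetical helper: return (major, minor), or None if either part is missing."""
    match = re.match(PATTERN, version)
    if match is None:
        return None
    major, minor = match.group("major"), match.group("minor")
    if major is None or minor is None:
        # Both groups are optional in the pattern, so guard before calling int().
        return None
    return int(major), int(minor)


def test_parses_standard_runtime_version():
    assert parse_major_minor("10.4.x-scala2.12") == (10, 4)


def test_compares_numerically_not_lexically():
    # As strings "9.1" > "10.4"; as integer tuples the order is correct.
    assert parse_major_minor("9.1.x-scala2.12") < parse_major_minor("10.4.x-scala2.12")


def test_returns_none_when_version_is_missing():
    assert parse_major_minor("custom-image") is None
    assert parse_major_minor(".4.x") is None
```

The last test covers the case the current hunk does not guard against: `re.match` returning `None`, or an optional group being empty, before `int()` is called.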
@@ -84,6 +119,26 @@ def spark_version_compatibility(spark_version: str) -> str:
    return "supported"


def get_job_cluster_from_task(
    task: RunTask, job_run: BaseRun, all_clusters: dict[str, ClusterDetails]
Refactor job cluster to use the same mechanism
        self._ws = ws

    def _crawl(self) -> list[ExternallyOrchestratedJobRunWithFailingConfiguration]:
        no_of_days_back = datetime.timedelta(days=30)  # todo make configurable in yaml?
Make timedelta externally configurable
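One possible shape for this, as a sketch only: `CrawlerSettings`, its `days_back` field, and both helper functions are hypothetical names, not part of this PR or of the project's existing configuration. The default of 30 mirrors the hard-coded value in the hunk. The conversion to epoch milliseconds assumes the run start times are compared in that unit.

```python
import datetime
from dataclasses import dataclass


@dataclass
class CrawlerSettings:
    """Hypothetical settings object; could be populated from the YAML config."""

    days_back: int = 30  # default mirrors the hard-coded value in the diff


def lookback_window(settings: CrawlerSettings) -> datetime.timedelta:
    """Turn the configured number of days into a timedelta, rejecting nonsense values."""
    if settings.days_back <= 0:
        raise ValueError("days_back must be a positive number of days")
    return datetime.timedelta(days=settings.days_back)


def earliest_start_ms(settings: CrawlerSettings, now: datetime.datetime) -> int:
    """Epoch milliseconds of the oldest run start time the crawler should consider."""
    return int((now - lookback_window(settings)).timestamp() * 1000)
```

Passing `now` in explicitly keeps the cutoff deterministic in unit tests, which also helps with the test coverage asked for elsewhere in this review.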
02dfe85 to ea103b4
ea103b4 to 14daf86
…estrator-job-run-crawler
# Conflicts:
#	src/databricks/labs/ucx/assessment/crawlers.py
#	tests/unit/assessment/test_assessment.py
Resolves #266
Added ExternallyOrchestratedJobsWithFailingConfigCrawler
Added a crawler that inspects job runs returned by the SDK and determines which of them were submitted through the RunsSubmit API
Added Unit Tests
Added tests to cover basic logic and some edge cases
Integration Tests Pending
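For readers unfamiliar with the selection step the crawler performs, a minimal self-contained sketch follows. `RunRecord` and `externally_orchestrated` are hypothetical stand-ins rather than the PR's classes or SDK types. The sketch assumes each run carries a `run_type` string and that `SUBMIT_RUN` marks runs created through RunsSubmit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunRecord:
    """Hypothetical stand-in for a job run as returned by the SDK."""

    run_id: int
    run_type: str  # assumed values: "JOB_RUN", "WORKFLOW_RUN", "SUBMIT_RUN"
    job_id: Optional[int] = None


def externally_orchestrated(runs: list[RunRecord]) -> list[RunRecord]:
    """Keep only the runs that were submitted directly, i.e. not started from a saved job."""
    return [run for run in runs if run.run_type == "SUBMIT_RUN"]


if __name__ == "__main__":
    sample = [
        RunRecord(run_id=1, run_type="JOB_RUN", job_id=100),
        RunRecord(run_id=2, run_type="SUBMIT_RUN"),
        RunRecord(run_id=3, run_type="WORKFLOW_RUN", job_id=101),
    ]
    for run in externally_orchestrated(sample):
        print(run.run_id)
```

The real crawler additionally checks each selected run's cluster configuration for failures; that part is omitted here because its rules are not shown in this conversation.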