
Extract command codes and unify the checks for spark_conf, cluster_policy, init_scripts #855

Merged: 9 commits into main on Jan 30, 2024

Conversation

@qziyuan (Contributor) commented on Jan 29, 2024

Changes

Extract the command codes for checking spark_conf, cluster_policy, and init_scripts, and put them into unified functions in
src/databricks/labs/ucx/assessment/crawlers.py.
The new functions will then be called by the following modules (a simplified sketch follows the list):

  • src/databricks/labs/ucx/assessment/clusters.py
  • src/databricks/labs/ucx/assessment/init_scripts.py
  • src/databricks/labs/ucx/assessment/jobs.py
  • src/databricks/labs/ucx/assessment/pipelines.py
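For illustration only, a minimal sketch of the shape such a unified check could take. The marker tuple, message, and function body below are hypothetical simplifications, not the exact crawlers.py implementation:

```python
# Hypothetical, simplified Azure service-principal markers; the real
# values live in src/databricks/labs/ucx/assessment/crawlers.py.
_AZURE_SP_CONF = ("fs.azure.account.auth.type", "fs.azure.account.oauth2.client.id")
_AZURE_SP_CONF_FAILURE_MSG = "Uses azure service principal credentials config in"


def check_spark_conf(conf: dict[str, str], source: str) -> list[str]:
    """Collect failure messages for a spark_conf, whatever its origin.

    One function can serve clusters.py, jobs.py, pipelines.py, and
    init_scripts.py alike, which is the unification this PR is after.
    """
    failures: list[str] = []
    for key in conf:
        if any(key.startswith(marker) for marker in _AZURE_SP_CONF):
            failures.append(f"{_AZURE_SP_CONF_FAILURE_MSG} {source}.")
    return failures
```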

Linked issues

Resolves #823

Functionality

  • added relevant user documentation
  • added new CLI command
  • modified existing command: databricks labs ucx ...
  • added a new workflow
  • modified existing workflow: ...
  • added a new table
  • modified existing table: ...

Tests

  • manually tested
  • added unit tests
  • added integration tests
  • verified on staging environment (screenshot attached)

…wlers.py and let clusters, jobs, pipelines, and init_scripts call those functions.
- Fix conflicts in assessment/clusters.py and assessment/jobs.py from PR #825 and PR #838.
- Move the _check_cluster_failures logic into assessment/crawlers.py and let jobs and clusters call this function.
- Apply the _check_cluster_failures changes from PR #845.
- Merge changes from PR #845.
- Move the _check_cluster_failures logic to assessment/crawlers.py.
- Remove the ClusterInfo logic from _check_cluster_failures, as it is tied to assessment/clusters.py and should not be involved when assessment/jobs.py calls it.
- Filter out job clusters when scanning all-purpose clusters.
- Add _try_fetch for ClustersCrawler.
@qziyuan requested review from a team and renardeinside on January 29, 2024 at 20:08
codecov bot commented on Jan 29, 2024

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (606dd72) 85.67% compared to head (06ebd30) 85.86%.
Report is 1 commit behind head on main.

| Files | Patch % | Lines |
| --- | --- | --- |
| src/databricks/labs/ucx/assessment/clusters.py | 86.15% | 4 Missing and 5 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #855      +/-   ##
==========================================
+ Coverage   85.67%   85.86%   +0.19%     
==========================================
  Files          42       42              
  Lines        5311     5335      +24     
  Branches      969      968       -1     
==========================================
+ Hits         4550     4581      +31     
+ Misses        542      537       -5     
+ Partials      219      217       -2     


@nfx (Collaborator) left a comment:

Don't import private functions from modules. Fix the bugs related to union types; you've effectively re-introduced a bug in this PR. I'm fine with getting rid of mixins, but then make the top-level functions in crawlers.py type-annotated. I'd still prefer those functions to live in domain-grouped modules, e.g. all cluster-related functions and classes in clusters.py, all job-related functions in jobs.py, etc.

@@ -92,3 +96,72 @@ def spark_version_compatibility(spark_version: str | None) -> str:
if (10, 0) <= version < (11, 3):
return "kinda works"
return "supported"


def _check_spark_conf(conf: dict[str, str], source) -> list[str]:
@nfx (Collaborator):

These functions are private (starting with _). You cannot export private methods from a module.

@qziyuan (Contributor, Author):

Moved those functions back to the cluster mixin. I made check_cluster_failures and check_spark_conf public functions now, as the former is used by jobs.py and the latter is also used by pipelines.py.

return None


def _check_cluster_policy(ws: WorkspaceClient, cluster, source):
@nfx (Collaborator):

Let's add type annotations to top-level members if we really don't want mixins.

@qziyuan (Contributor, Author):

Fixed; type annotations were added even though we went back to the mixin.

return failures


def _check_cluster_failures(ws: WorkspaceClient, cluster: ClusterDetails | compute.ClusterSpec, source):
@nfx (Collaborator):

Don't do union types; they already caused a lot of bugs.

Convert the cluster spec into cluster details instead:

https://github.com/databrickslabs/ucx/blob/main/src/databricks/labs/ucx/assessment/jobs.py#L88-L89
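For context, a sketch of the conversion being suggested, assuming the SDK dataclasses round-trip through as_dict/from_dict (the helper name is hypothetical):

```python
from databricks.sdk.service.compute import ClusterDetails, ClusterSpec


def to_cluster_details(spec: ClusterSpec) -> ClusterDetails:
    # Round-trip through the dict form so downstream checks accept a
    # single type instead of a ClusterDetails | ClusterSpec union.
    return ClusterDetails.from_dict(spec.as_dict())
```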

@qziyuan (Contributor, Author):

Fixed; it was a miss when merging with #845.

return failures


def _safe_get_cluster_policy(ws: WorkspaceClient, policy_id: str) -> Policy | None:
@nfx (Collaborator):

So, a logical question: do we want diamond-shaped dependencies or not? E.g. _safe_get_cluster_policy: where should it live, in crawlers.py or in clusters.py? What about _check_cluster_failures?

flowchart TD
    assessment --> crawlers
    crawlers --> clusters
    crawlers --> jobs
    crawlers --> init_scripts
    crawlers --> pipelines
    crawlers --> azure_spns
    clusters --> runtime.py
    jobs --> runtime.py
    init_scripts --> runtime.py
    pipelines --> runtime.py
    azure_spns --> runtime.py

or something like this:

flowchart TD
    assessment --> crawlers
    azure_spns -->|mixin| clusters
    crawlers --> clusters
    clusters -->|mixin| jobs
    crawlers --> jobs
    azure_spns -->|mixin| init_scripts
    crawlers --> init_scripts
    jobs -->|mixin| pipelines
    crawlers --> pipelines
    crawlers --> azure_spns
    clusters --> runtime.py
    jobs --> runtime.py
    init_scripts --> runtime.py
    pipelines --> runtime.py
    azure_spns --> runtime.py

@qziyuan (Contributor, Author) replied on Jan 29, 2024:

I'd like to select the first dependency structure; it's clearer to me.
Per my understanding, clusters.py crawls and checks all the all-purpose clusters, jobs.py crawls and checks all jobs (so far it only checks job clusters, but we may need to check the job code in the future), and pipelines.py crawls and checks all DLT pipelines (right now it only checks the pipeline config, but it should check the pipeline cluster as well).
It's cleaner for them to inherit the _check_cluster_failures function to check the spark conf, init scripts, and cluster policy, instead of letting jobs and pipelines inherit _check_cluster_failures from clusters.
Some logic may stay in the domain-grouped modules, because it is not commonly shared across crawlers:

  • checking the all-purpose cluster mode (we don't have this logic yet, but may in the future). The cluster may need to be put into shared mode, which has more limitations, but this check does not apply to job clusters.
  • checking the job code (we don't have this logic yet, but may in the future)

return failures


def _check_cluster_init_script(ws: WorkspaceClient, init_scripts, source):
@nfx (Collaborator):

If we're not doing mixins, then why is this function defined so far away from _get_init_script_data?

@qziyuan (Contributor, Author):

Moved them to the mixin and put these two functions next to each other.

Comment on lines 8 to 9
from databricks.sdk.service import compute
from databricks.sdk.service.compute import ClusterDetails, Policy
@nfx (Collaborator):

Line 9 is redundant because you've imported the whole compute package. Use one style of imports; don't mix.
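To spell out the nit, a tiny sketch of picking one style instead of mixing both:

```python
# Either import the package and qualify names through it...
from databricks.sdk.service import compute

policy: compute.Policy | None = None

# ...or import the names directly, which makes the package import redundant:
# from databricks.sdk.service.compute import ClusterDetails, Policy
```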

@qziyuan (Contributor, Author):

Fixed.

logger,
spark_version_compatibility,
)
from databricks.labs.ucx.assessment.crawlers import _check_cluster_failures, logger
@nfx (Collaborator):

Don't import logger; initialize one at the top of the module.
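For reference, the standard module-level logger idiom being asked for:

```python
import logging

# Each module gets its own logger named after the module path.
logger = logging.getLogger(__name__)
```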

@qziyuan (Contributor, Author):

Fixed.

… 2. Make jobs inherit the cluster mixin so it can reuse `check_cluster_failures`. 3. Make pipelines inherit the cluster mixin so it can reuse `check_spark_conf`. 4. Move `check_init_script` to the mixin in `init_scripts.py`.
@nfx (Collaborator) left a comment:

A few nits remaining.

try:
data = self._ws.dbfs.read(file_api_format_destination).data
if data is not None:
return base64.b64decode(data).decode("utf-8")
@nfx (Collaborator):

That's 9 levels of nesting. Can you reduce the nesting so it's a bit more readable?

return failures

def _get_init_script_data(self, init_script_info: InitScriptInfo) -> str | None:
if init_script_info.dbfs is not None and init_script_info.dbfs.destination is not None:
@nfx (Collaborator):

Nesting can be reduced by refactoring "if COND: 20 lines" into "if not COND: continue", followed by the 20 lines at the outer level.
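A minimal sketch of that guard-clause style applied to the snippet above, using early returns as the function-level analogue of continue; the destination parsing and error handling here are assumptions, not the merged code:

```python
import base64

from databricks.sdk import WorkspaceClient
from databricks.sdk.errors import NotFound
from databricks.sdk.service.compute import InitScriptInfo


def get_init_script_data(ws: WorkspaceClient, info: InitScriptInfo) -> str | None:
    # Each failure mode returns early, so the happy path never nests
    # more than one level deep.
    if info.dbfs is None or info.dbfs.destination is None:
        return None
    parts = info.dbfs.destination.split(":")  # e.g. "dbfs:/path/to/script.sh"
    if len(parts) != 2:
        return None
    try:
        data = ws.dbfs.read(parts[1]).data
    except NotFound:
        return None
    if data is None:
        return None
    return base64.b64decode(data).decode("utf-8")
```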


from databricks.labs.ucx.assessment.crawlers import (
_AZURE_SP_CONF_FAILURE_MSG,
_INIT_SCRIPT_DBFS_PATH,
@nfx (Collaborator):

This constant can live in the init script mixin

@@ -20,7 +22,7 @@ class JobInfo:
creator: str | None = None


class JobsMixin(ClustersMixin):
class JobsMixin:
@nfx (Collaborator):

For consistency, shouldn't it be named CheckJobsMixin?

@nfx merged commit 2ab1321 into main on Jan 30, 2024
7 checks passed
@nfx deleted the fix/unify_conf_check_823 branch on January 30, 2024 at 09:04
nfx added a commit that referenced this pull request Feb 1, 2024
* Added "what" property for migration to scope down table migrations ([#856](#856)).
* Added job count in the assessment dashboard ([#858](#858)).
* Adopted `installation` package from `databricks-labs-blueprint` ([#860](#860)).
* Debug logs to print only the first 96 bytes of SQL query by default, tunable by `debug_truncate_bytes` SDK configuration property ([#859](#859)).
* Extract command codes and unify the checks for spark_conf, cluster_policy, init_scripts ([#855](#855)).
* Improved installation failure with actionable message ([#840](#840)).
* Improved validating groups membership cli command ([#816](#816)).

Dependency updates:

 * Updated databricks-labs-blueprint requirement from ~=0.1.0 to ~=0.2.4 ([#867](#867)).
@nfx mentioned this pull request on Feb 1, 2024
nfx added a commit that referenced this pull request Feb 1, 2024
dmoore247 pushed commits that referenced this pull request Mar 23, 2024