[BUG]: assess_jobs task fails #838

Closed
1 task done
geiranton opened this issue Jan 25, 2024 · 2 comments · Fixed by #845

geiranton commented Jan 25, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

The `assess_jobs` task fails with `AttributeError: 'ClusterSpec' object has no attribute 'creator_user_name'`. The error is raised at `assessment/clusters.py:72` in `ClustersMixin._check_cluster_failures`, which is called from `JobsCrawler._assess_jobs` (`assessment/jobs.py:87`). The full traceback is under "Relevant log output".
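
A minimal sketch of why this fails, assuming the job cluster definition passed in by `_assess_jobs` simply has no `creator_user_name` field. The two dataclasses below are simplified, hypothetical stand-ins and not the real SDK classes; only the attribute access mirrors line 72 of the traceback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterDetails:
    """Hypothetical stand-in for a cluster returned by the clusters API."""
    cluster_id: Optional[str] = None
    creator_user_name: Optional[str] = None


@dataclass
class ClusterSpec:
    """Hypothetical stand-in for a job cluster definition: no creator field at all."""
    spark_version: Optional[str] = None


def check_cluster_failures(cluster: ClusterDetails) -> list[str]:
    # Same kind of attribute access as clusters.py:72 in the traceback.
    failures = []
    if not cluster.creator_user_name:
        failures.append("unknown creator")
    return failures


def reproduce() -> str:
    """Pass a ClusterSpec where ClusterDetails is expected and report the error."""
    try:
        check_cluster_failures(ClusterSpec(spark_version="13.3.x-scala2.12"))
    except AttributeError as err:
        return str(err)
    return "no error"


if __name__ == "__main__":
    print(reproduce())
```

The type hint on `check_cluster_failures` does not prevent the call. Python only fails once the missing attribute is actually read.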

Expected Behavior

Steps To Reproduce

Run "[UCX] assessment" workflow

Cloud

Azure

Operating System

Linux

Version

via install.sh

Relevant log output

Workflows
Jobs
[UCX] assessment
Run 391289903162087
assess_jobs run 
Output
AttributeError: 'ClusterSpec' object has no attribute 'creator_user_name'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/.ipykernel/1139/command--1-1486021540:18
     15 entry = [ep for ep in metadata.distribution("databricks_labs_ucx").entry_points if ep.name == "runtime"]
     16 if entry:
     17   # Load and execute the entrypoint, assumes no parameters
---> 18   entry[0].load()()
     19 else:
     20   import databricks_labs_ucx

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/runtime.py:365, in main(*argv)
    363 if len(argv) == 0:
    364     argv = sys.argv
--> 365 trigger(*argv)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/framework/tasks.py:245, in trigger(*argv)
    243 ucx_logger = logging.getLogger("databricks.labs.ucx")
    244 ucx_logger.info(f"UCX v{__version__} After job finishes, see debug logs at {task_logger}")
--> 245 current_task.fn(cfg)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/runtime.py:114, in assess_jobs(cfg)
    112 ws = WorkspaceClient(config=cfg.to_databricks_config())
    113 crawler = JobsCrawler(ws, RuntimeBackend(), cfg.inventory_database)
--> 114 crawler.snapshot()

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/jobs.py:99, in JobsCrawler.snapshot(self)
     98 def snapshot(self) -> Iterable[JobInfo]:
---> 99     return self._snapshot(self._try_fetch, self._crawl)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/framework/crawlers.py:293, in CrawlerBase._snapshot(self, fetcher, loader)
    291     pass
    292 logger.debug(f"[{self._full_name}] crawling new batch for {self._table}")
--> 293 loaded_records = list(loader())
    294 self._append_records(loaded_records)
    295 return loaded_records

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/jobs.py:54, in JobsCrawler._crawl(self)
     52 all_jobs = list(self._ws.jobs.list(expand_tasks=True))
     53 all_clusters = {c.cluster_id: c for c in self._ws.clusters.list()}
---> 54 return self._assess_jobs(all_jobs, all_clusters)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/jobs.py:87, in JobsCrawler._assess_jobs(self, all_jobs, all_clusters_by_id)
     85 if not job_id:
     86     continue
---> 87 cluster_failures = self._check_cluster_failures(cluster_config)
     88 for failure in json.loads(cluster_failures.failures):
     89     job_assessment[job_id].add(failure)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/databricks/labs/ucx/assessment/clusters.py:72, in ClustersMixin._check_cluster_failures(self, cluster)
     70 def _check_cluster_failures(self, cluster: ClusterDetails):
     71     failures = []
---> 72     if not cluster.creator_user_name:
     73         logger.warning(
     74             f"Cluster {cluster.cluster_id} have Unknown creator, it means that the original creator "
     75             f"has been deleted and should be re-created"
     76         )
     77     cluster_info = ClusterInfo(
     78         cluster_id=cluster.cluster_id if cluster.cluster_id else "",
     79         cluster_name=cluster.cluster_name,
   (...)
     82         failures="[]",
     83     )

AttributeError: 'ClusterSpec' object has no attribute 'creator_user_name'
UCX v0.10.1+720240124152200
Scans through all the jobs and identifies those that are not compatible with UC. The list of all the jobs is
stored in the `$inventory.jobs` table.

It looks for:
  - Clusters with Databricks Runtime (DBR) version earlier than 11.3
  - Clusters using Passthrough Authentication
  - Clusters with incompatible Spark config tags
  - Clusters referencing DBFS locations in one or more config options

14:42:50  INFO [d.labs.ucx] UCX v0.10.1+720240124152200 After job finishes, see debug logs at /Workspace/Users/[email protected]/.ucx/logs/assessment/run-391289903162087/assess_jobs.log

14:43:05 ERROR [d.labs.ucx] Execute `databricks workspace export //Users/[email protected]/.ucx/logs/assessment/run-391289903162087/assess_jobs.log` locally to troubleshoot with more details. 'ClusterSpec' object has no attribute 'creator_user_name'
Task run details
Job ID: 974213243130635
Job run ID: 391289903162087
Task run ID: 949601116928966
Run as: Geir Antonsen
Launched: Manually
Started: 01/24/2024, 03:40:25 PM
Ended: 01/24/2024, 03:43:10 PM
Duration: 2m 44s
Queue duration: -
Status: Failed
Lineage: No lineage information for this job.

Python wheel
Package name: databricks_labs_ucx
Entry point: runtime

Compute
main_v1
Driver: Standard_F4s · Workers: Standard_F4s · 0 workers · 14.2 (includes Apache Spark 3.5.0, Scala 2.12)

Dependent libraries
dbfs:/Users/[email protected]/.ucx/wheels/databricks_labs_ucx-0.10.1+720240124152200-py3-none-any.whl (Wheel)

Parameters
config: /Workspace/Users/[email protected]/.ucx/config.yml
job_id: 974213243130635 (resolved)
parent_run_id: 391289903162087 (resolved)
run_id: 949601116928966 (resolved)
task: assess_jobs

geiranton (Author) commented

The `ClusterDetails` object is not complete. After adding guards such as `hasattr(cluster, "creator_user_name")` and logging the cluster object as a dict, the output typically looks like this:
11:07:55 WARN [d.l.u.assessment.crawlers] Cluster {'apply_policy_default_values': None, 'autoscale': None, 'autotermination_minutes': None, 'aws_attributes': None, 'azure_attributes': AzureAttributes(availability=<AzureAvailability.ON_DEMAND_AZURE: 'ON_DEMAND_AZURE'>, first_on_demand=1, log_analytics_info=None, spot_bid_max_price=-1.0), 'cluster_log_conf': None, 'cluster_name': None, 'cluster_source': <ClusterSource.JOB: 'JOB'>, 'custom_tags': {'ResourceClass': 'SingleNode'}, 'data_security_mode': <DataSecurityMode.NONE: 'NONE'>, 'docker_image': None, 'driver_instance_pool_id': None, 'driver_node_type_id': None, 'enable_elastic_disk': True, 'enable_local_disk_encryption': None, 'gcp_attributes': None, 'init_scripts': [], 'instance_pool_id': None, 'node_type_id': 'Standard_DS3_v2', 'num_workers': 0, 'policy_id': None, 'runtime_engine': <RuntimeEngine.STANDARD: 'STANDARD'>, 'single_user_name': None, 'spark_conf': {'spark.master': 'local[*, 4]', 'spark.databricks.cluster.profile': 'singleNode'}, 'spark_env_vars': {'PYSPARK_PYTHON': '/databricks/python3/bin/python3', 'LOGGER_URL': '', 'LOGGER_LEVEL': 'INFO'}, 'spark_version': '13.3.x-scala2.12', 'ssh_public_keys': None, 'workload_type': None} have Unknown creator, it means that the original creator has been deleted and should be re-created
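
The log line above shows a job cluster (`cluster_source` is `JOB`) being reported as having an unknown creator, although the object never had that field. A minimal sketch of a guard that separates "field absent" from "field present but empty" follows. The function name, the status strings, and the stand-in classes are hypothetical, and this is not necessarily how the eventual fix works.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Sentinel so that a missing attribute is distinguishable from one set to None.
_MISSING = object()


@dataclass
class JobClusterSpec:
    """Hypothetical stand-in for a job cluster definition (no creator field)."""
    spark_version: Optional[str] = None


@dataclass
class InteractiveCluster:
    """Hypothetical stand-in for a cluster that carries creator information."""
    cluster_id: str
    creator_user_name: Optional[str] = None


def creator_status(cluster: Any) -> str:
    """Classify the creator field without raising AttributeError."""
    creator = getattr(cluster, "creator_user_name", _MISSING)
    if creator is _MISSING:
        # The object type has no such field, so a warning would be misleading.
        return "not-applicable"
    if not creator:
        # The field exists but is empty.
        return "unknown-creator"
    return "ok"


if __name__ == "__main__":
    print(creator_status(JobClusterSpec(spark_version="13.3.x-scala2.12")))
    print(creator_status(InteractiveCluster(cluster_id="abc")))
```

A plain `hasattr` check followed by the same warning would stop the crash but still log a misleading "Unknown creator" message for every job cluster, which is what the output above shows.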

nfx (Collaborator) commented Jan 25, 2024

@geiranton you're using an unreleased version at your own risk ;)

qziyuan added a commit that referenced this issue Jan 25, 2024
- fix conflicts in assessment/clusters.py and assessment/jobs.py from the PR #825 and PR #838
- move _check_cluster_failures logic into assessment/crawlers.py and let jobs and clusters call this function
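
As a rough illustration of the second bullet, one shared function could be called by both crawlers and read only fields that both cluster shapes are assumed to have. Everything here is a hypothetical sketch: the class names, the helper, and the single DBR version check (taken from the "earlier than 11.3" rule in the task description) are not the code from the referenced commit.

```python
from dataclasses import dataclass
from typing import Any, Optional


def _dbr_older_than(spark_version: str, minimum: tuple[int, int]) -> bool:
    # "13.3.x-scala2.12" -> (13, 3); versions that do not parse are not flagged.
    parts = spark_version.split(".")
    try:
        return (int(parts[0]), int(parts[1])) < minimum
    except (IndexError, ValueError):
        return False


def check_cluster_failures(cluster: Any) -> list[str]:
    """Shared check that works for both interactive and job cluster objects."""
    failures = []
    spark_version = getattr(cluster, "spark_version", None)
    if spark_version and _dbr_older_than(spark_version, (11, 3)):
        failures.append(f"unsupported DBR: {spark_version}")
    return failures


@dataclass
class InteractiveCluster:
    """Hypothetical stand-in for a cluster listed by the clusters API."""
    cluster_id: str
    spark_version: str
    creator_user_name: Optional[str] = None


@dataclass
class JobCluster:
    """Hypothetical stand-in for a job cluster definition."""
    spark_version: str


class ClustersCrawlerSketch:
    def assess(self, clusters: list[InteractiveCluster]) -> dict[str, list[str]]:
        return {c.cluster_id: check_cluster_failures(c) for c in clusters}


class JobsCrawlerSketch:
    def assess(self, job_clusters: dict[int, JobCluster]) -> dict[int, list[str]]:
        return {job_id: check_cluster_failures(c) for job_id, c in job_clusters.items()}
```

Keeping the checks in one place means a field that exists only on one cluster type has to be handled explicitly once, rather than separately in each crawler.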
nfx closed this as completed in #845 Jan 26, 2024