Avoid errors in corner cases where Azure Service Principal Credentials are not available in Spark context #254

Merged
nfx merged 1 commit into main from fix/247 on Sep 21, 2023

Conversation


@nfx nfx commented Sep 21, 2023

This PR simplifies Table ACL crawling by removing the option to configure which databases to iterate over. `crawl_grants` now crawls all databases consistently.

Fixes #247
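
The change described above can be illustrated with a minimal sketch. This is not the actual ucx implementation: `Grant`, `crawl_all_grants`, `list_databases`, and `grants_for` are hypothetical names used only to show the idea of iterating over every database instead of a configured subset.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Grant:
    # Hypothetical record for illustration; not the ucx Grant class.
    principal: str
    action_type: str
    database: str


def crawl_all_grants(
    list_databases: Callable[[], Iterable[str]],
    grants_for: Callable[[str], Iterable[Grant]],
) -> Iterator[Grant]:
    """Yield grants for every database; there is no database filter to configure."""
    for database in list_databases():
        yield from grants_for(database)


# Stand-in data source (assumption): a real crawler would query the metastore.
_FAKE_GRANTS = {
    "default": [Grant("analysts", "SELECT", "default")],
    "sales": [Grant("engineers", "MODIFY", "sales")],
}

found = list(
    crawl_all_grants(lambda: sorted(_FAKE_GRANTS), lambda db: _FAKE_GRANTS[db])
)
```

Because the set of databases is always discovered at crawl time, there is no configuration value that could be invalid or out of date.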

@nfx nfx requested a review from larsgeorge-db as a code owner September 21, 2023 14:05

codecov bot commented Sep 21, 2023

Codecov Report

Merging #254 (4bc62f1) into main (6d30d12) will increase coverage by 0.05%.
The diff coverage is 86.84%.

❗ Current head 4bc62f1 differs from pull request most recent head 9be8abe. Consider uploading reports for the commit 9be8abe to get more accurate results

@@            Coverage Diff             @@
##             main     #254      +/-   ##
==========================================
+ Coverage   82.64%   82.70%   +0.05%     
==========================================
  Files          30       30              
  Lines        2011     2000      -11     
  Branches      344      339       -5     
==========================================
- Hits         1662     1654       -8     
+ Misses        280      278       -2     
+ Partials       69       68       -1     
| Files Changed | Coverage Δ |
|---|---|
| `src/databricks/labs/ucx/runtime.py` | 53.03% <20.00%> (-1.52%) ⬇️ |
| `src/databricks/labs/ucx/hive_metastore/tables.py` | 94.66% <92.30%> (-1.34%) ⬇️ |
| `src/databricks/labs/ucx/assessment/assessment.py` | 61.70% <100.00%> (ø) |
| `src/databricks/labs/ucx/config.py` | 84.61% <100.00%> (+3.48%) ⬆️ |
| `src/databricks/labs/ucx/hive_metastore/__init__.py` | 100.00% <100.00%> (ø) |
| `src/databricks/labs/ucx/hive_metastore/grants.py` | 100.00% <100.00%> (ø) |
| `src/databricks/labs/ucx/install.py` | 82.05% <100.00%> (ø) |

... and 2 files with indirect coverage changes

@nfx nfx merged commit 0bc27e5 into main Sep 21, 2023
@nfx nfx deleted the fix/247 branch September 21, 2023 14:13
@nfx nfx mentioned this pull request Sep 21, 2023
nfx added a commit that referenced this pull request Sep 21, 2023
* Added batched iteration for `INSERT INTO` queries in
`StatementExecutionBackend` with default `max_records_per_batch=1000`
([#237](#237)).
* Added crawler for mount points
([#209](#209)).
* Added crawlers for compatibility of jobs and clusters, along with
basic recommendations for external locations
([#244](#244)).
* Added safe return on grants
([#246](#246)).
* Added ability to specify empty group filter in the installer script
([#216](#216))
([#217](#217)).
* Added ability to install application by multiple different users on
the same workspace ([#235](#235)).
* Added dashboard creation on installation and a requirement for
`warehouse_id` in config, so that the assessment dashboards are
refreshed automatically after job runs
([#214](#214)).
* Added reliance on rate limiting from Databricks SDK for listing
workspace ([#258](#258)).
* Fixed errors in corner cases where Azure Service Principal Credentials
were not available in Spark context
([#254](#254)).
* Fixed `DESCRIBE TABLE` throwing errors when listing Legacy Table ACLs
([#238](#238)).
* Fixed `file already exists` error in the installer script
([#219](#219))
([#222](#222)).
* Fixed `guess_external_locations` failure with `AttributeError:
as_dict` and added an integration test
([#259](#259)).
* Fixed error handling edge cases in `crawl_tables` task
([#243](#243))
([#251](#251)).
* Fixed `crawl_permissions` task failure on folder names containing a
forward slash ([#234](#234)).
* Improved `README` notebook documentation
([#260](#260),
[#228](#228),
[#252](#252),
[#223](#223),
[#225](#225)).
* Removed redundant `.python-version` file
([#221](#221)).
* Removed discovery of account groups from `crawl_permissions` task
([#240](#240)).
* Updated databricks-sdk requirement from ~=0.8.0 to ~=0.9.0
([#245](#245)).
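
The first changelog entry mentions batched iteration for `INSERT INTO` queries with a default `max_records_per_batch=1000`. A minimal sketch of that kind of batching follows. It is an assumption about the general technique, not the code in `StatementExecutionBackend`; the helper name `batched` is hypothetical, and only the parameter name and default come from the changelog.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], max_records_per_batch: int = 1000) -> Iterator[List[T]]:
    """Yield consecutive lists of at most max_records_per_batch rows."""
    if max_records_per_batch < 1:
        raise ValueError("max_records_per_batch must be at least 1")
    iterator = iter(rows)
    while True:
        # Take the next slice; an empty slice means the input is exhausted.
        batch = list(islice(iterator, max_records_per_batch))
        if not batch:
            return
        yield batch


# 2500 rows split into batches of up to 1000 rows each.
batch_sizes = [len(batch) for batch in batched(range(2500))]
```

In a backend of this kind, each yielded batch would presumably be rendered into one `INSERT INTO` statement, which keeps any single statement from growing without bound.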
larsgeorge-db pushed a commit that referenced this pull request Sep 23, 2023
…s are not available in Spark context (#254)

larsgeorge-db pushed a commit that referenced this pull request Sep 23, 2023
Successfully merging this pull request may close these issues.

Grant crawler fails when invalid config value detected for storage mount