Added `databricks.labs.blueprint.paths.WorkspacePath` as `pathlib.Path` equivalent #115
Merged
✅ 18/18 passed, 2 skipped, 41s total. Running from acceptance #151
nfx added a commit that referenced this pull request on Jul 5, 2024:

* Added `databricks.labs.blueprint.paths.WorkspacePath` as `pathlib.Path` equivalent ([#115](#115)). This commit introduces the `databricks.labs.blueprint.paths.WorkspacePath` library, providing Python-native `pathlib.Path`-like interfaces to simplify working with Databricks Workspace paths. The library includes `WorkspacePath` and `WorkspacePathDuringTest` classes offering advanced functionality for handling user home folders, relative file paths, browser URLs, and file manipulation methods such as `read/write_text()`, `read/write_bytes()`, and `glob()`. This addition brings enhanced, Pythonic ways to interact with Databricks Workspace paths, including creating and moving files, managing directories, and generating browser-accessible URIs. Additionally, the commit includes updates to existing methods and introduces new fixtures for creating notebooks, accompanied by extensive unit tests to ensure reliability and functionality.
* Added propagation of `blueprint` version into `User-Agent` header when it is used as library ([#114](#114)). A new feature has been introduced in the library that allows for the propagation of the `blueprint` version and the name of the command line interface (CLI) command used in the `User-Agent` header when the library is utilized as a library. This feature includes the addition of two new pairs of `OtherInfo`: `blueprint/X.Y.Z` to indicate that the request is made using the `blueprint` library and `cmd/<name>` to store the name of the CLI command used for making the request. The implementation involves using the `with_user_agent_extra` function from `databricks.sdk.config` to set the user agent consistently with the Databricks CLI. Several changes have been made to the test file for `test_useragent.py` to include a new test case, `test_user_agent_is_propagated`, which checks if the `blueprint` version and the name of the command are correctly propagated to the `User-Agent` header. A context manager `http_fixture_server` has been added that creates an HTTP server with a custom handler, which extracts the `blueprint` version and the command name from the `User-Agent` header and stores them in the `user_agent` dictionary. The test case calls the `foo` command with a mocked `WorkspaceClient` instance and sets the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables to test the propagation of the `blueprint` version and the command name in the `User-Agent` header. The test case then asserts that the `blueprint` version and the name of the command are present and correctly set in the `user_agent` dictionary.
* Bump actions/checkout from 4.1.6 to 4.1.7 ([#112](#112)). In this release, the version of the "actions/checkout" action used in the `Checkout Code` step of the acceptance workflow has been updated from 4.1.6 to 4.1.7. This update may include bug fixes, performance improvements, and new features, although specific changes are not mentioned in the commit message. The `Unshallow` step remains unchanged, continuing to fetch and clean up the repository's history. This update ensures that the latest enhancements from the "actions/checkout" action are utilized, aiming to improve the reliability and performance of the code checkout process in the GitHub Actions workflow. Software engineers should be aware of this update and its potential impact on their workflows.

Dependency updates:

* Bump actions/checkout from 4.1.6 to 4.1.7 ([#112](#112)).
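The release notes above describe a test handler that pulls the `blueprint/X.Y.Z` and `cmd/<name>` pairs out of the `User-Agent` header and stores them in a dictionary. A minimal sketch of that kind of extraction follows. The helper name `parse_user_agent`, the sample header, and its version numbers are hypothetical and are not taken from the project's test code.

```python
def parse_user_agent(header: str) -> dict[str, str]:
    """Split a User-Agent header into {product: version} pairs.

    Hypothetical helper: tokens are assumed to be whitespace-separated
    and shaped like `name/value`; anything else is ignored.
    """
    pairs: dict[str, str] = {}
    for token in header.split():
        if "/" not in token:
            continue  # skip tokens that are not name/value pairs
        key, _, value = token.partition("/")
        pairs[key] = value
    return pairs


# Sample header with made-up version numbers, for illustration only.
user_agent = parse_user_agent("databricks-sdk-py/0.0.0 blueprint/1.2.3 cmd/foo")
```

With a header of this shape, `user_agent["blueprint"]` holds the library version and `user_agent["cmd"]` holds the command name, which is the pair of values the described test asserts on.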
Python-native `pathlib.Path`-like interfaces
This library exposes subclasses of the `pathlib` classes from Python's standard library that work with Databricks Workspace paths. These classes provide a more intuitive and Pythonic way to work with Databricks Workspace paths than plain `str` paths. The classes are designed to be drop-in replacements for `pathlib.Path` and provide additional functionality for working with Databricks Workspace paths.
Working With User Home Folders
This code initializes a client to interact with a Databricks workspace, creates a relative workspace path (`~/some-folder/foo/bar/baz`), verifies the path is not absolute, and then demonstrates that converting this relative path to an absolute path is not implemented and raises an error. Subsequently, it expands the relative path to the user's home directory and creates the specified directory if it does not already exist.
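The same sequence can be tried locally with the standard library's `pathlib.Path`, which `WorkspacePath` is described as mirroring. This is only a local analogy under stated assumptions: the home directory is a temporary folder injected through the `HOME`/`USERPROFILE` environment variables, no Databricks client is involved, and the `absolute()` step is left out because plain `pathlib` resolves a relative path against the current directory instead of raising.

```python
import os
import tempfile
from pathlib import Path
from unittest import mock

name = "some-folder/foo/bar/baz"

with tempfile.TemporaryDirectory() as fake_home:
    # Assumption: point the home directory at a throwaway folder so the
    # sketch never touches the real one (HOME on POSIX, USERPROFILE on Windows).
    with mock.patch.dict(os.environ, {"HOME": fake_home, "USERPROFILE": fake_home}):
        relative = Path(f"~/{name}")
        was_relative = not relative.is_absolute()  # "~/..." is not absolute yet

        with_user = relative.expanduser()          # replace "~" with the home folder
        with_user.mkdir(parents=True, exist_ok=True)

        created = with_user.is_dir()
        is_abs = with_user.is_absolute()
```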
Relative File Paths
This code expands the `~` symbol to the full path of the user's home directory, computes the relative path from this home directory to the previously created directory (`~/some-folder/foo/bar/baz`), and verifies it matches the expected relative path (`some-folder/foo/bar/baz`). It then confirms that the expanded path is absolute, checks that calling `absolute()` on this path returns the path itself, and converts the path to a FUSE-compatible path format (`/Workspace/[email protected]/some-folder/foo/bar/baz`).
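The path arithmetic described here can be sketched with `pathlib.PurePosixPath`, which performs no I/O. The home folder `/Users/someone@example.com` and the way the `/Workspace` prefix is applied are assumptions for illustration; the library's own FUSE conversion may lay the path out differently.

```python
from pathlib import PurePosixPath

name = "some-folder/foo/bar/baz"

# Hypothetical expanded home folder; the real value comes from the workspace.
home = PurePosixPath("/Users/someone@example.com")
with_user = home / name

# Relative path from the home folder back to the directory.
relative_name = with_user.relative_to(home)

# Assumed FUSE-style layout: the absolute path re-rooted under /Workspace.
# relative_to("/") strips the leading slash so the join does not discard the prefix.
fuse_path = PurePosixPath("/Workspace") / with_user.relative_to("/")
```

Joining an absolute path directly onto `/Workspace` would discard the prefix, which is why the sketch strips the root first.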
Browser URLs for Workspace Paths
The `as_uri()` method returns a browser-accessible URI for the workspace path. This example retrieves the current user's username from the Databricks workspace client, constructs a browser-accessible URI for the previously created directory (~/some-folder/foo/bar/baz) by formatting the host URL and encoding the username, and then verifies that the URI generated by the with_user path object matches the constructed browser URI.
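A sketch of building such a URI by hand is shown below, using `urllib.parse.quote` for the username encoding. The host, the username, and the `#workspace/Users/...` fragment layout are all assumptions made for illustration and are not confirmed by the text above.

```python
from urllib.parse import quote

# Hypothetical values; a real run would read these from the workspace client.
host = "https://example.cloud.databricks.com"
user_name = "someone@example.com"
name = "some-folder/foo/bar/baz"

# Percent-encode the username so "@" becomes "%40".
encoded_user = quote(user_name, safe="")

# Assumed URI layout: host, then a fragment pointing at the user's folder.
browser_uri = f"{host}#workspace/Users/{encoded_user}/{name}"
```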
`read/write_text()`, `read/write_bytes()`, and `glob()` Methods
This code creates a `WorkspacePath` object for the path `~/some-folder/a/b/c`, expands it to the full user path, and creates the directory along with any necessary parent directories. It then creates a file named `hello.txt` within this directory, writes "Hello, World!" to it, and verifies the content. The code lists all `.txt` files in the directory and ensures there is exactly one file, which is `hello.txt`. Finally, it deletes `hello.txt` and confirms that the file no longer exists.
The `read_bytes()` method works as expected.
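The file operations described in this section map onto standard `pathlib.Path` methods of the same names. The sketch below runs them against a temporary local directory instead of a workspace; whether `WorkspacePath` behaves identically in every detail is an assumption.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    # Local stand-in for ~/some-folder/a/b/c.
    folder = Path(root) / "some-folder" / "a" / "b" / "c"
    folder.mkdir(parents=True)  # create missing parent directories too

    hello_txt = folder / "hello.txt"
    hello_txt.write_text("Hello, World!")
    text = hello_txt.read_text()

    # List the .txt files directly inside the folder.
    txt_names = [path.name for path in folder.glob("*.txt")]

    hello_txt.unlink()  # delete the file
    still_exists = hello_txt.exists()

    # Binary round trip with write_bytes()/read_bytes().
    hello_bin = folder / "hello.bin"
    hello_bin.write_bytes(b"Hello, World!")
    data = hello_bin.read_bytes()
```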
Moving Files
This code creates a WorkspacePath object for the path ~/some-folder, expands it to the full user path, and creates
the directory along with any necessary parent directories. It then creates a file named hello.txt within this directory
and writes "Hello, World!" to it. The code then renames the file to hello2.txt, verifies that hello.txt no longer exists,
and checks that the content of hello2.txt is "Hello, World!".
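A local sketch of the rename step with plain `pathlib.Path` follows. It uses a temporary directory in place of `~/some-folder`, and it assumes that `WorkspacePath.rename()` takes the target path in the same way.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    folder = Path(root) / "some-folder"
    folder.mkdir(parents=True)

    hello_txt = folder / "hello.txt"
    hello_txt.write_text("Hello, World!")

    # Move the file by renaming it to a new path in the same folder.
    hello2_txt = folder / "hello2.txt"
    hello_txt.rename(hello2_txt)

    old_exists = hello_txt.exists()
    moved_text = hello2_txt.read_text()
```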
Working With Notebook Sources
This code initializes a Databricks WorkspaceClient, creates a WorkspacePath object for the path ~/some-folder, and
defines two items within this folder: a text file (a.txt) and a Python notebook (b). It creates the notebook with
specified content and writes "Hello, World!" to the text file. The code then retrieves all files in the folder, asserts
there are exactly two files, and verifies the suffix and content of each file. Specifically, it checks that a.txt has a
.txt suffix and b has a .py suffix, with the notebook containing the expected code.