Sourced from databricks-labs-lsql's releases.
v0.5.0
- Added Command Execution backend which uses Command Execution API on a cluster (#95). In this release, the databricks labs lSQL library has been updated with a new Command Execution backend that utilizes the Command Execution API. A new `CommandExecutionBackend` class has been implemented, which initializes a `CommandExecutor` instance taking a cluster ID, workspace client, and language as parameters. The `execute` method runs SQL commands on the specified cluster, and the `fetch` method returns the query result as an iterator of Row objects. The existing `StatementExecutionBackend` class now inherits from a new abstract base class called `ExecutionBackend`, which includes a `save_table` method for saving data to tables and is meant to be a common base class for both Statement and Command Execution backends. The `StatementExecutionBackend` constructor now accepts a `max_records_per_batch` parameter. The `execute` and `fetch` methods have been updated to use the new `_only_n_bytes` method for logging truncated SQL statements. The `CommandExecutionBackend` class provides the `execute`, `fetch`, and `save_table` methods to execute commands on a cluster and save the results to tables in the databricks workspace.
- Added basic integration with Lakeview Dashboards (#66). In this release, we've added basic integration with Lakeview Dashboards to the project. This includes updating the `databricks-labs-blueprint` dependency to version 0.4.2 with the `[yaml]` extra, allowing for additional functionality related to handling YAML files. A new file, `dashboards.py`, has been introduced, providing a class for interacting with Databricks dashboards, along with methods for retrieving and saving dashboard configurations. Additionally, a new `__init__.py` file under the `src/databricks/labs/lsql/lakeview` directory imports all classes and functions from the `model.py` module. The release also introduces a new file, `model.py`, containing code generated from OpenAPI specs by the Databricks SDK Generator, and a template file, `model.py.tmpl`, used for handling JSON data during integration with Lakeview Dashboards. A new file, `polymorphism.py`, provides utilities for checking if a value can be assigned to a specific type, supporting correct data typing and formatting with Lakeview Dashboards. Furthermore, a `.gitignore` file has been added to the `tests/integration` directory as part of the initial steps in adding integration testing. Lastly, the `test_dashboards.py` file in the `tests/integration` directory contains a function, `test_load_dashboard(ws)`, which uses the `Dashboards` class to save a dashboard from a source to a destination path.
- Added dashboard-as-code functionality (#201). This commit introduces dashboard-as-code functionality for the UCX project, enabling the creation and management of dashboards using code. The feature resolves multiple issues and includes a new `create-dashboard` command for creating unpublished dashboards. The functionality is available in the `lsql` lab and allows for specifying the order and width of widgets, overriding default widget identifiers, and supporting various SQL and markdown header arguments. The `dashboard.yml` file is used to define top-level metadata for the dashboard. This commit also includes extensive documentation and examples for using the dashboard as a library and configuring different options.
- Automate opening integration test dashboard in debug mode (#167). A new feature has been added to automatically open the integration test dashboard in debug mode, making it easier to debug and troubleshoot. This has been achieved by importing the standard-library `webbrowser` module and `is_in_debug` from `databricks.labs.blueprint.entrypoint`, and adding a check in the `create` function to determine if the code is running in debug mode. If it is, a dashboard URL is constructed from the workspace configuration and dashboard ID, and then opened in a web browser using `webbrowser.open`. No other parts of the code have been affected by this change.
- Automatically tile widgets (#109). In this release, we've introduced an automatic widget tiling feature for the dashboard creation process. The `Dashboards` class now includes a new class variable, `_maximum_dashboard_width`, set to 6, representing the maximum width allowed for each row of widgets in the dashboard. The `create_dashboard` method has been updated to accept a new `self` parameter, turning it into an instance method. A new `_get_position` method has been introduced to calculate and return the next available position for placing a widget, and a `_get_width_and_height` method has been added to return the width and height for a widget specification, initially handling `CounterSpec` instances. Additionally, we've added new unit tests ensuring that widgets are created, positioned, and sized correctly. These tests also cover the correct positioning of widgets based on their order and available space, as well as the expected width and height for each widget.
- Bump actions/checkout from 4.1.3 to 4.1.6 (#102). In the latest release, the `actions/checkout` GitHub Action has been updated from version 4.1.3 to 4.1.6, which includes checking the platform to set the archive extension appropriately. This release also bumps the version of github/codeql-action from 2 to 3, actions/setup-node from 1 to 4, and actions/upload-artifact from 2 to 4. Additionally, the minor-actions-dependencies group was updated with two new versions. Disabling extensions.worktreeConfig when disabling sparse-checkout was introduced in version 4.1.4. The release notes and changelog for this update can be found in the provided link. This commit was made by dependabot[bot] with contributions from cory-miller and jww3.
- Bump actions/checkout from 4.1.6 to 4.1.7 (#151). In the latest release, the `actions/checkout` GitHub action has been updated from version 4.1.6 to 4.1.7 in the project's push workflow, which checks out the repository at the start of the workflow. The update only affects the version number in the YAML configuration for the `actions/checkout` step in the release.yml file, with no new methods or alterations to existing functionality. It picks up whatever bug fixes or improvements the new version of `actions/checkout` contains.
- Create a dashboard with a counter from a single query (#107). In this release, we have introduced several enhancements to our dashboard-as-code approach, including a `Dashboards` class that provides methods for getting, saving, and deploying dashboards. A new method, `create_dashboard`, creates a dashboard with a single page containing a counter widget. The counter widget is associated with a query that counts the number of rows in a specified dataset. The `deploy_dashboard` method has also been added to deploy the dashboard to the workspace. The change also modifies the `test_dashboards.py` file and adds four new tests.
- Create text widget from markdown file (#142). A new feature has been implemented in the library that allows for the creation of a text widget from a markdown file, enhancing customization and readability for users. This development resolves issue #1
- Design document for dashboards-as-code (#105). "The latest release introduces 'Dashboards as Code,' a method for defining and managing dashboards through configuration files, enabling version control and controlled changes. The building blocks include `.sql`, `.md`, and `dashboard.yml` files, with `.sql` defining queries and determining tile order, and `dashboard.yml` specifying top-level metadata and tile overrides. Metadata can be inferred or explicitly defined in the query or files. The tile order can be determined by SQL file order, `tiles` order in `dashboard.yml`, or SQL file metadata. This project can also be used as a library for embedding dashboard generation in your code. Configuration precedence follows command-line flags, SQL file headers, `dashboard.yml`, and SQL query content. The command-line interface is utilized for dashboard generation from configuration files."
- Ensure propagation of `lsql` version into `User-Agent` header when it is used as library (#206). In this release, the `pyproject.toml` file has been updated to ensure that the correct version of the `lsql` library is propagated into the `User-Agent` header when used as a library, improving attribution. The `databricks-sdk` version has been updated from `0.22.0` to `0.29.0`, and the `__init__.py` file of the `lsql` library has been modified to add the `with_user_agent_extra` function from the `databricks.sdk.core` package for correct attribution. The `backends.py` file has also been updated with improved type handling in the `_row_to_sql` and `save_table` functions for accurate SQL insertion and handling of user-defined classes. Additionally, a test has been added to ensure that the `lsql` version is correctly propagated in the `User-Agent` header when used as a library.
- Fixed counter encodings (#143). In this release, we have improved the encoding of counters in the lsql dashboard by modifying the `create_dashboard` function in the `dashboards.py` file. Previously, the counter field encoding was hardcoded as "count"; it is now taken from the first field name of the given fields, since counters are expected to have only one field. Additionally, a new integration test has been added to the `tests/integration/test_dashboards.py` file to ensure that the dashboard deployment functionality correctly handles SQL queries that do not perform a count. A new test for the `Dashboards` class has also been added to check that counter field encoding names are created as expected. The `WorkspaceClient` is mocked and not called in this test.
- Fixed non-existing reference and typo in the documentation (#104). In this release, we've made improvements to the documentation of our open-source library, specifically addressing issue #104. The changes include fixing a non-existent reference and a typo in the `Library size comparison` section of the "comparison.md" document. This section provides guidance for selecting a library based on factors like library size, unified authentication, and compatibility with various Databricks warehouses and SQL Python APIs. The updates clarify the required dependency size for simple applications and scripts, and offer more detailed information about each library option. We've also added a new subsection titled `Detailed comparison` to provide a more comprehensive overview of each library's features. For applications that transfer large amounts of data serialized in Apache Arrow format and need low result fetching latency, we recommend using the Databricks SQL Connector for Python.
- Fixed parsing message (#146). In this release, the warning message logged during the creation of a dashboard when a ParseError occurs has been updated to provide clearer and more detailed information about the parsing error. The new message includes the specific query being parsed and the exact parsing error, enabling developers to quickly identify the cause of parsing issues. The log format is: "Parsing {query}: {error}".
- Improve dashboard as code (#108). The `Dashboards` class in the `dashboards.py` file has been updated to improve functionality and usability, with changes such as the addition of a type variable `T` for type checking and more descriptive names for methods. The `save_to_folder` method now accepts a `Dashboard` object and returns a `Dashboard` object, and a new static method `create_dashboard` has been added. Additionally, two new methods `_with_better_names` and `_replace_names` have been added for improved readability. The `get_dashboard` method now returns a `Dashboard` object instead of a dictionary. The `save_to_folder` method now also formats SQL code before saving it to file. In addition to the changes in the `Dashboards` class, the `queries/counter.sql` file has been moved to `dashboards/one_counter/counter.sql` in the `tests/integration` directory. Furthermore, several tests for the `Dashboards` class have been introduced in the `databricks.labs.lsql.dashboards` module. The tests cover saving SQL and YML files to a specified folder, creating a dataset and a counter widget for each query, deploying dashboards with a given display name or dashboard ID, and testing the behavior of the `save_to_folder` and `deploy_dashboard` methods. Lastly, the commit removes the `test_load_dashboard` function and updates the `test_dashboard_creates_one_dataset_per_query` and `test_dashboard_creates_one_counter_widget_per_query` functions to use the updated `Dashboard` class. A new `replace_recursively` function is introduced to replace specific fields in a dataclass recursively. A new test function `test_dashboards_deploys_exported_dashboard_definition` has been added, which reads a dashboard definition from a JSON file, deploys it, and checks if it's successfully deployed using the `Dashboards` class. A new test function `test_dashboard_deploys_dashboard_the_same_as_created_dashboard` has also been added, which compares the original and deployed dashboards to ensure they are identical.
- Infer fields from a query (#111). The `Dashboards` class in the `dashboards.py` file has been updated with the addition of a new method, `_get_fields`, which accepts a SQL query as input and returns a list of `Field` objects using the `sqlglot` library to parse the query and extract the necessary information. The `create_dashboard` method has been modified to call this new function when creating `Query` objects for each dataset. If a `ParseError` occurs, a warning is logged and iteration continues. This allows for the automatic population of fields when creating a new dashboard, eliminating the need for manual specification. Additionally, new tests have been added for invalid queries and for checking if the fields in a query have the expected names. These tests include `test_dashboards_skips_invalid_query` and `test_dashboards_gets_fields_with_expected_names`, which utilize the caplog fixture and create temporary query files to verify functionality. Existing functionality related to creating dashboards remains unchanged.
- Make constant all caps (#140). In this release, the project's `dashboards.py` file has been updated to improve code readability and maintainability. A constant variable `_maximum_dashboard_width` has been changed to all caps, becoming `_MAXIMUM_DASHBOARD_WIDTH`. This modification affects the `Dashboards` class and its methods, particularly `_get_fields` and `_get_position`. The `_get_position` method has been revised to use the new all caps constant variable. This change addresses issue #140. It only impacts the `dashboards.py` file and does not affect any other functionality.
- Read display name from `dashboard.yml` (#144). In this release, we have introduced a new `DashboardMetadata` dataclass that reads the display name of a dashboard from a `dashboard.yml` file located in the dashboard's directory. If the `dashboard.yml` file is absent, the folder name will be used as the display name. This change improves the readability and maintainability of the dashboard configuration by explicitly defining the display name and reducing the need to specify widget information in multiple places. We have also added a new fixture called `make_dashboard` for creating and cleaning up lakeview dashboards in the test suite. The fixture handles creation and deletion of the dashboard and provides an option to set a custom display name. Additionally, we have added and modified several unit tests to ensure the proper handling of the `DashboardMetadata` class and the dashboard creation process, including tests for missing, present, or incorrect `display_name` keys in the YAML file. The `dashboards.deploy_dashboard()` function has been updated to handle cases where only `dashboard_id` is provided.
- Set widget id in query header (#154). In this release, we've made significant improvements to widget metadata handling in our open-source library. We've introduced a new `WidgetMetadata` class that replaces the previous `WidgetMetadata` dataclass, now featuring a `path` attribute, `spec_type` property, and optional parameters for `order`, `width`, `height`, and `_id`. The `_get_widgets` method has been updated to accept an Iterable of `WidgetMetadata` objects, and both `_get_layouts` and `_get_widgets` methods now sort widgets using the order field. A new class method, `WidgetMetadata.from_path`, handles parsing widget metadata from a file path, replacing the removed `_get_width_and_height` method. Additionally, the `WidgetMetadata` class is now used in the `deploy_dashboard` method, and the test suite for the `dashboards` module has been enhanced with updated `test_widget_metadata_replaces_width_and_height` and `test_widget_metadata_replaces_attribute` functions, as well as new tests for specific scenarios. Issue #154 has been addressed by setting the widget id in the query header.
- Use order key in query header if defined (#149). In this release, we've introduced a new feature to use an order key in the query header if defined, giving more control over the dashboard creation process. The `WidgetMetadata` dataclass now includes an optional `order` parameter of type `int`, and the `_get_arguments_parser()` method accepts the `--order` flag with type `int`. The `replace_from_arguments()` method has been updated to support the new `order` parameter, with a default value of `self.order`. The `create_dashboard()` method now uses a new `_get_datasets()` method to retrieve datasets from the dashboard folder and introduces a `_get_widgets()` method, which accepts a list of files, iterates over them, and yields tuples containing widgets and their corresponding metadata, including the order. Additionally, a new test case has been added to verify the correct behavior of the dashboard deployment with a specified order key in the query header. This feature resolves issue #148.
- Use widget width and height defined in query header (#147). In this release, the handling of metadata in SQL files has been updated to utilize the header of the file, instead of the first line, for improved readability and flexibility. This change includes a new `WidgetMetadata` class for defining the width and height of a widget in a dashboard, as well as new methods for parsing the widget metadata from a provided path. The release also includes updates to the documentation to cover the supported widget arguments `-w` or `--width` and `-h` or `--height`, and resolves issue #114 by adding a test for deploying a dashboard with a big widget using a new function `test_dashboard_deploys_dashboard_with_big_widget`. Additionally, new test cases have been added for creating dashboards with custom-sized widgets based on query header width and height values.

Dependency updates:
Contributors: @JCZuurmond, @nfx, @dependabot[bot], @nkvuong
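
The query-header arguments mentioned in #147 and #149 (`--order`, `-w`/`--width`, `-h`/`--height`) can be illustrated with a small sketch. This is not the lsql implementation: the `parse_header` and `build_parser` names are hypothetical, and the sketch assumes the header is a leading `/* ... */` block comment, which the notes above do not specify.

```python
import argparse
import re
import shlex

# Assumption: the header is a block comment at the very top of the SQL file.
HEADER = re.compile(r"^\s*/\*(.*?)\*/", re.DOTALL)


def build_parser() -> argparse.ArgumentParser:
    # add_help=False frees "-h" so it can stand for --height.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--order", type=int)
    parser.add_argument("-w", "--width", type=int)
    parser.add_argument("-h", "--height", type=int)
    return parser


def parse_header(sql_text: str) -> argparse.Namespace:
    """Return the widget arguments found in the header, or all-None defaults."""
    match = HEADER.match(sql_text)
    # shlex.split tokenizes the comment body like a command line.
    arguments = shlex.split(match.group(1)) if match else []
    return build_parser().parse_args(arguments)


if __name__ == "__main__":
    print(parse_header("/* --order 2 -w 6 -h 3 */\nSELECT 1 AS count"))
```

Reusing `argparse` keeps the header syntax identical to ordinary command-line flags, which matches the `_get_arguments_parser()` naming in #149, though the real parser may differ in its flags and defaults.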
Sourced from databricks-labs-lsql's changelog.
0.5.0
- Added Command Execution backend which uses Command Execution API on a cluster (#95). In this release, the databricks labs lSQL library has been updated with a new Command Execution backend that utilizes the Command Execution API. A new
CommandExecutionBackend
class has been implemented, which initializes aCommandExecutor
instance taking a cluster ID, workspace client, and language as parameters. Theexecute
method runs SQL commands on the specified cluster, and thefetch
method returns the query result as an iterator of Row objects. The existingStatementExecutionBackend
class has been updated to inherit from a new abstract base class calledExecutionBackend
, which includes asave_table
method for saving data to tables and is meant to be a common base class for both Statement and Command Execution backends. TheStatementExecutionBackend
class has also been updated to use the newExecutionBackend
abstract class and its constructor now accepts amax_records_per_batch
parameter. Theexecute
andfetch
methods have been updated to use the new_only_n_bytes
method for logging truncated SQL statements. Additionally, theCommandExecutionBackend
class has several methods,execute
,fetch
, andsave_table
to execute commands on a cluster and save the results to tables in the databricks workspace. This new backend is intended to be used for executing commands on a cluster and saving the results in a databricks workspace.- Added basic integration with Lakeview Dashboards (#66). In this release, we've added basic integration with Lakeview Dashboards to the project, enhancing its capabilities. This includes updating the
databricks-labs-blueprint
dependency to version 0.4.2 with the[yaml]
extra, allowing for additional functionality related to handling YAML files. A new file,dashboards.py
, has been introduced, providing a class for interacting with Databricks dashboards, along with methods for retrieving and saving dashboard configurations. Additionally, a new__init__.py
file under thesrc/databricks/labs/lsql/lakeview
directory imports all classes and functions from themodel.py
module, providing a foundation for further development and customization. The release also introduces a new file,model.py
, containing code generated from OpenAPI specs by the Databricks SDK Generator, and a template file,model.py.tmpl
, used for handling JSON data during integration with Lakeview Dashboards. A new file,polymorphism.py
, provides utilities for checking if a value can be assigned to a specific type, supporting correct data typing and formatting with Lakeview Dashboards. Furthermore, a.gitignore
file has been added to thetests/integration
directory as part of the initial steps in adding integration testing to ensure compatibility with the Lakeview Dashboards platform. Lastly, thetest_dashboards.py
file in thetests/integration
directory contains a function,test_load_dashboard(ws)
, which uses theDashboards
class to save a dashboard from a source to a destination path, facilitating testing during the integration process.- Added dashboard-as-code functionality (#201). This commit introduces dashboard-as-code functionality for the UCX project, enabling the creation and management of dashboards using code. The feature resolves multiple issues and includes a new
create-dashboard
command for creating unpublished dashboards. The functionality is available in thelsql
lab and allows for specifying the order and width of widgets, overriding default widget identifiers, and supporting various SQL and markdown header arguments. Thedashboard.yml
file is used to define top-level metadata for the dashboard. This commit also includes extensive documentation and examples for using the dashboard as a library and configuring different options.- Automate opening integration test dashboard in debug mode (#167). A new feature has been added to automatically open the integration test dashboard in debug mode, making it easier for software engineers to debug and troubleshoot. This has been achieved by importing the
webbrowser
andis_in_debug
modules from "databricks.labs.blueprint.entrypoint", and adding a check in thecreate
function to determine if the code is running in debug mode. If it is, a dashboard URL is constructed from the workspace configuration and dashboard ID, and then opened in a web browser using "webbrowser.open". This allows for a more streamlined debugging process for the integration test dashboard. No other parts of the code have been affected by this change.- Automatically tile widgets (#109). In this release, we've introduced an automatic widget tiling feature for the dashboard creation process in our open-source library. The
Dashboards
class now includes a new class variable,_maximum_dashboard_width
, set to 6, representing the maximum width allowed for each row of widgets in the dashboard. Thecreate_dashboard
method has been updated to accept a newself
parameter, turning it into an instance method. A new_get_position
method has been introduced to calculate and return the next available position for placing a widget, and a_get_width_and_height
method has been added to return the width and height for a widget specification, initially handlingCounterSpec
instances. Additionally, we've added new unit tests to improve testing coverage, ensuring that widgets are created, positioned, and sized correctly. These tests also cover the correct positioning of widgets based on their order and available space, as well as the expected width and height for each widget.- Bump actions/checkout from 4.1.3 to 4.1.6 (#102). In the latest release, the 'actions/checkout' GitHub Action has been updated from version 4.1.3 to 4.1.6, which includes checking the platform to set the archive extension appropriately. This release also bumps the version of github/codeql-action from 2 to 3, actions/setup-node from 1 to 4, and actions/upload-artifact from 2 to 4. Additionally, the minor-actions-dependencies group was updated with two new versions. Disabling extensions.worktreeConfig when disabling sparse-checkout was introduced in version 4.1.4. The release notes and changelog for this update can be found in the provided link. This commit was made by dependabot[bot] with contributions from cory-miller and jww3.
- Bump actions/checkout from 4.1.6 to 4.1.7 (#151). In the latest release, the 'actions/checkout' GitHub action has been updated from version 4.1.6 to 4.1.7 in the project's push workflow, which checks out the repository at the start of the workflow. This change brings potential bug fixes, performance improvements, or new features compared to the previous version. The update only affects the version number in the YAML configuration for the 'actions/checkout' step in the release.yml file, with no new methods or alterations to existing functionality. This update aims to ensure a smooth and enhanced user experience for those utilizing the project's push workflows by taking advantage of the possible improvements or bug fixes in the new version of 'actions/checkout'.
- Create a dashboard with a counter from a single query (#107). In this release, we have introduced several enhancements to our dashboard-as-code approach, including the creation of a
Dashboards
class that provides methods for getting, saving, and deploying dashboards. A new method,create_dashboard
, has been added to create a dashboard with a single page containing a counter widget. The counter widget is associated with a query that counts the number of rows in a specified dataset. Thedeploy_dashboard
method has also been added to deploy the dashboard to the workspace. Additionally, we have implemented a new feature for creating dashboards with a counter from a single query, including modifications to thetest_dashboards.py
file and the addition of four new tests. These changes improve the robustness of the dashboard creation process and provide a more automated way to view important metrics.- Create text widget from markdown file (#142). A new feature has been implemented in the library that allows for the creation of a text widget from a markdown file, enhancing customization and readability for users. This development resolves issue #1
- Design document for dashboards-as-code (#105). "The latest release introduces 'Dashboards as Code,' a method for defining and managing dashboards through configuration files, enabling version control and controlled changes. The building blocks include
.sql
,.md
, anddashboard.yml
files, with.sql
defining queries and determining tile order, anddashboard.yml
specifying top-level metadata and tile overrides. Metadata can be inferred or explicitly defined in the query or files. The tile order can be determined by SQL file order,tiles
order indashboard.yml
, or SQL file metadata. This project can also be used as a library for embedding dashboard generation in your code. Configuration precedence follows command-line flags, SQL file headers,dashboard.yml
, and SQL query content. The command-line interface is utilized for dashboard generation from configuration files."- Ensure propagation of
lsql
version intoUser-Agent
header when it is used as library (#206). In this release, thepyproject.toml
file has been updated to ensure that the correct version of thelsql
library is propagated into theUser-Agent
header when used as a library, improving attribution. Thedatabricks-sdk
version has been updated from0.22.0
to0.29.0
, and the__init__.py
file of thelsql
library has been modified to add thewith_user_agent_extra
function from thedatabricks.sdk.core
package for correct attribution. Thebackends.py
file has also been updated with improved type handling in the_row_to_sql
andsave_table
functions for accurate SQL insertion and handling of user-defined classes. Additionally, a test has been added to ensure that thelsql
version is correctly propagated in theUser-Agent
header when used as a library. These changes offer improved functionality and accurate type handling, making it easier for developers to identify the library version when used in other projects.- Fixed counter encodings (#143). In this release, we have improved the encoding of counters in the lsql dashboard by modifying the
`create_dashboard` function in the `dashboards.py` file. Previously, the counter field encoding was hardcoded as "count"; it is now taken dynamically from the first field name of the given fields, since counters are expected to have only one field. Additionally, a new integration test has been added to the `tests/integration/test_dashboards.py` file to ensure that the dashboard deployment functionality correctly handles SQL queries that do not perform a count. A new test for the `Dashboards` class has also been added to check that counter field encoding names are created as expected. The `WorkspaceClient` is mocked and not called in this test. These changes enhance the accuracy of counter encoding and improve the overall functionality and reliability of the lsql dashboard.
- Fixed non-existing reference and typo in the documentation (#104). In this release, we've made improvements to the documentation of our open-source library, specifically addressing issue #104. The changes include fixing a non-existent reference and a typo in the
`Library size comparison` section of the "comparison.md" document. This section provides guidance for selecting a library based on factors like library size, unified authentication, and compatibility with various Databricks warehouses and SQL Python APIs. The updates clarify the required dependency size for simple applications and scripts, and offer more detailed information about each library option. We've also added a new subsection titled `Detailed comparison` to provide a more comprehensive overview of each library's features. These changes are intended to help software engineers better understand which library is best suited for their specific needs, particularly for applications that require data transfer of large amounts of data serialized in Apache Arrow format and low result fetching latency, where we recommend using the Databricks SQL Connector for Python for efficient data transfer and low latency.
- Fixed parsing message (#146). In this release, the warning message logged during the creation of a dashboard when a ParseError occurs has been updated to provide clearer and more detailed information about the parsing error. The new error message now includes the specific query being parsed and the exact parsing error, enabling developers to quickly identify the cause of parsing issues. This change ensures that engineers can efficiently diagnose and address parsing errors, improving the overall development and debugging experience with a more informative log format: "Parsing {query}: {error}".
- Improve dashboard as code (#108). The
`Dashboards` class in the 'dashboards.py' file has been updated to improve functionality and usability, with changes such as the addition of a type variable `T` for type checking and more descriptive names for methods. The `save_to_folder` method now accepts a `Dashboard` object and returns a `Dashboard` object, and a new static method `create_dashboard` has been added. Additionally, two new methods `_with_better_names` and `_replace_names` have been added for improved readability. The `get_dashboard` method now returns a `Dashboard` object instead of a dictionary. The `save_to_folder` method now also formats SQL code before saving it to file. These changes aim to enhance the functionality and readability of the codebase and provide more user-friendly methods for interacting with the `Dashboards` class. In addition to the changes in the `Dashboards` class, there have been updates in the organization of the project structure. The 'queries/counter.sql' file has been moved to 'dashboards/one_counter/counter.sql' in the 'tests/integration' directory. This modification enhances the organization of the project. Furthermore, several tests for the `Dashboards` class have been introduced in the 'databricks.labs.lsql.dashboards' module, demonstrating various functionalities of the class and ensuring that it functions as intended. The tests cover saving SQL and YML files to a specified folder, creating a dataset and a counter widget for each query, deploying dashboards with a given display name or dashboard ID, and testing the behavior of the `save_to_folder` and `deploy_dashboard` methods. Lastly, the commit removes the `test_load_dashboard` function and updates the `test_dashboard_creates_one_dataset_per_query` and `test_dashboard_creates_one_counter_widget_per_query` functions to use the updated `Dashboard` class. A new `replace_recursively` function is introduced to replace specific fields in a dataclass recursively. A new test function `test_dashboards_deploys_exported_dashboard_definition` has been added, which reads a dashboard definition from a JSON file, deploys it, and checks if it's successfully deployed using the `Dashboards` class. A new test function `test_dashboard_deploys_dashboard_the_same_as_created_dashboard` has also been added, which compares the original and deployed dashboards to ensure they are identical. Overall, these changes aim to improve the functionality and readability of the codebase and provide more user-friendly methods for interacting with the `Dashboards` class, as well as enhance the organization of the project structure and add new tests for the `Dashboards` class to ensure it functions as intended.
- Infer fields from a query (#111). The
`Dashboards` class in the `dashboards.py` file has been updated with the addition of a new method, `_get_fields`, which accepts a SQL query as input and returns a list of `Field` objects using the `sqlglot` library to parse the query and extract the necessary information. The `create_dashboard` method has been modified to call this new function when creating `Query` objects for each dataset. If a `ParseError` occurs, a warning is logged and iteration continues. This allows for the automatic population of fields when creating a new dashboard, eliminating the need for manual specification. Additionally, new tests have been added for invalid queries and for checking if the fields in a query have the expected names. These tests include `test_dashboards_skips_invalid_query` and `test_dashboards_gets_fields_with_expected_names`, which utilize the caplog fixture and create temporary query files to verify functionality. Existing functionality related to creating dashboards remains unchanged.
- Make constant all caps (#140). In this release, the project's 'dashboards.py' file has been updated to improve code readability and maintainability. A constant variable
`_maximum_dashboard_width` has been changed to all caps, becoming '_MAXIMUM_DASHBOARD_WIDTH'. This modification affects the `Dashboards` class and its methods, particularly `_get_fields` and '_get_position'. The `_get_position` method has been revised to use the new all caps constant variable. This change ensures better visibility of constants within the code, addressing issue #140. It's important to note that this modification only impacts the 'dashboards.py' file and does not affect any other functionalities.
- Read display name from
`dashboard.yml` (#144). In this release, we have introduced a new `DashboardMetadata` dataclass that reads the display name of a dashboard from a `dashboard.yml` file located in the dashboard's directory. If the `dashboard.yml` file is absent, the folder name will be used as the display name. This change improves the readability and maintainability of the dashboard configuration by explicitly defining the display name and reducing the need to specify widget information in multiple places. We have also added a new fixture called `make_dashboard` for creating and cleaning up lakeview dashboards in the test suite. The fixture handles creation and deletion of the dashboard and provides an option to set a custom display name. Additionally, we have added and modified several unit tests to ensure the proper handling of the `DashboardMetadata` class and the dashboard creation process, including tests for missing, present, or incorrect `display_name` keys in the YAML file. The `dashboards.deploy_dashboard()` function has been updated to handle cases where only `dashboard_id` is provided.
- Set widget id in query header (#154). In this release, we've made significant improvements to widget metadata handling in our open-source library. We've introduced a new
`WidgetMetadata` class that replaces the previous `WidgetMetadata` dataclass, now featuring a `path` attribute, `spec_type` property, and optional parameters for `order`, `width`, `height`, and `_id`. The `_get_widgets` method has been updated to accept an Iterable of `WidgetMetadata` objects, and both `_get_layouts` and `_get_widgets` methods now sort widgets using the order field. A new class method, `WidgetMetadata.from_path`, handles parsing widget metadata from a file path, replacing the removed `_get_width_and_height` method. Additionally, the `WidgetMetadata` class is now used in the `deploy_dashboard` method, and the test suite for the `dashboards` module has been enhanced with updated `test_widget_metadata_replaces_width_and_height` and `test_widget_metadata_replaces_attribute` functions, as well as new tests for specific scenarios. Issue #154 has been addressed by setting the widget id in the query header, and the aforementioned changes improve flexibility and ease of use for dashboard development.
- Use order key in query header if defined (#149). In this release, we've introduced a new feature to use an order key in the query header if defined, enhancing the flexibility and control over the dashboard creation process. The
`WidgetMetadata` dataclass now includes an optional `order` parameter of type `int`, and the `_get_arguments_parser()` method accepts the `--order` flag with type `int`. The `replace_from_arguments()` method has been updated to support the new `order` parameter, with a default value of `self.order`. The `create_dashboard()` method now implements a new `_get_datasets()` method to retrieve datasets from the dashboard folder and introduces a `_get_widgets()` method, which accepts a list of files, iterates over them, and yields tuples containing widgets and their corresponding metadata, including the order. These improvements enable the use of an order key in query headers, ensuring the correct order of widgets in the dashboard creation process. Additionally, a new test case has been added to verify the correct behavior of the dashboard deployment with a specified order key in the query header. This feature resolves issue #148.
- Use widget width and height defined in query header (#147). In this release, the handling of metadata in SQL files has been updated to utilize the header of the file, instead of the first line, for improved readability and flexibility. This change includes a new WidgetMetadata class for defining the width and height of a widget in a dashboard, as well as new methods for parsing the widget metadata from a provided path. The release also includes updates to the documentation to cover the supported widget arguments
`-w or --width` and '-h or --height', and resolves issue #114 by adding a test for deploying a dashboard with a big widget using a new function `test_dashboard_deploys_dashboard_with_big_widget`. Additionally, new test cases have been added for creating dashboards with custom-sized widgets based on query header width and height values, improving functionality and error handling.
Dependency updates:
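To make the query-header entries above (#147, #149, #154) more concrete, here is a minimal sketch of how `--width`, `--height` and `--order` flags might be read from the top of a SQL file. It is not the library's implementation: `WidgetHeader`, `build_parser` and `parse_header` are hypothetical names, and the `/* ... */` comment is an assumed header format, since the notes only say that the header of the file is used instead of the first line.

```python
import argparse
import shlex
from dataclasses import dataclass
from typing import Optional


@dataclass
class WidgetHeader:
    """Hypothetical container for header values; not the library's WidgetMetadata."""

    order: Optional[int] = None
    width: int = 0
    height: int = 0


def build_parser() -> argparse.ArgumentParser:
    # add_help=False frees "-h" so it can mean "--height" instead of "--help".
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--order", type=int)
    parser.add_argument("-w", "--width", type=int, default=0)
    parser.add_argument("-h", "--height", type=int, default=0)
    return parser


def parse_header(sql: str) -> WidgetHeader:
    """Read flags from a leading /* ... */ comment (an assumed header format)."""
    text = sql.lstrip()
    if not text.startswith("/*") or "*/" not in text:
        # No header comment: fall back to the defaults.
        return WidgetHeader()
    header = text[2 : text.index("*/")]
    # parse_known_args ignores anything in the comment that is not a known flag.
    args, _unknown = build_parser().parse_known_args(shlex.split(header))
    return WidgetHeader(order=args.order, width=args.width, height=args.height)


if __name__ == "__main__":
    print(parse_header("/* --width 6 -h 3 --order 1 */\nSELECT 1 AS count"))
```

Disabling argparse's built-in help is the one design point worth noting: without `add_help=False`, registering `-h` for height would raise a conflict with the default `-h/--help` option.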
0.4.3
- Bump actions/checkout from 4.1.2 to 4.1.3 (#97). The
`actions/checkout` dependency has been updated from version 4.1.2 to 4.1.3 in the `update-main-version.yml` file. This new version includes a check to verify the git version before attempting to disable `sparse-checkout`, and adds an SSH user parameter to improve functionality and compatibility. The release notes and CHANGELOG.md file provide detailed information on the specific changes and improvements. The pull request also includes a detailed commit history and links to corresponding issues and pull requests on GitHub for transparency. You can review and merge the pull request to update the `actions/checkout` dependency in your project.
- Maintain PySpark compatibility for databricks.labs.lsql.core.Row (#99). In this release, we have added a new method
`asDict` to the `Row` class in the `databricks.labs.lsql.core` module to maintain compatibility with PySpark. This method returns a dictionary representation of the `Row` object, with keys corresponding to column names and values corresponding to the values in each column. Additionally, we have modified the `fetch` function in the `backends.py` file to return `Row` objects of `pyspark.sql` when using `self._spark.sql(sql).collect()`. This change is temporary and marked with a `TODO` comment, indicating that it will be addressed in the future. We have also added error handling code in the `fetch` function to ensure the function operates as expected. The `asDict` method in this implementation simply calls the existing `as_dict` method, meaning the behavior of the `asDict` method is identical to the `as_dict` method. The `as_dict` method returns a dictionary representation of the `Row` object, with keys corresponding to column names and values corresponding to the values in each column. The optional `recursive` argument in the `asDict` method, when set to `True`, is meant to enable recursive conversion of nested `Row` objects to nested dictionaries. However, this behavior is not currently implemented, and the `recursive` argument defaults to `False`.
Dependency updates:
- Bump actions/checkout from 4.1.2 to 4.1.3 (#97).
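As an illustration of the `asDict` compatibility shim described in the 0.4.3 notes (#99), the following is a minimal sketch under stated assumptions. The `Row` class here is a hypothetical stand-in built on `tuple`, not the actual `databricks.labs.lsql.core.Row`; it only shows the delegation from the PySpark-style `asDict` name to `as_dict`, with `recursive` accepted but ignored, as the notes describe.

```python
class Row(tuple):
    """Hypothetical minimal row type; the real lsql Row class differs."""

    def __new__(cls, **columns):
        # Store values positionally and remember the column names.
        row = super().__new__(cls, columns.values())
        row._columns = list(columns)
        return row

    def as_dict(self) -> dict:
        # Keys are column names, values are the column values.
        return dict(zip(self._columns, self))

    def asDict(self, recursive: bool = False) -> dict:
        # PySpark-style alias; recursive conversion is not implemented here,
        # so the flag is accepted only for signature compatibility.
        _ = recursive
        return self.as_dict()


if __name__ == "__main__":
    row = Row(name="widget", id=1)
    print(row.asDict())
```

Keeping `asDict` as a thin alias means code written against `pyspark.sql.Row` can call the same method name on either row type without changes.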
0.4.2
- Added more
`NotFound` error type (#94). In the latest update, the `core.py` file in the `databricks/labs/lsql` package has undergone enhancements to the error handling functionality. The `_raise_if_needed` function has been modified to raise a `NotFound` error when the error message includes the phrase "does not exist". This update enables the system to categorize specific SQL query errors as `NotFound` error messages, thereby improving the overall error handling and reporting capabilities. This change was a collaborative effort, as indicated by the co-authored-by statement in the commit.
0.4.1
- Fixing ovewrite integration tests (#92). A new enhancement has been implemented for the
`overwrite` feature's integration tests, addressing a concern with write operations. Two new variables, `catalog` and "schema", have been incorporated using the `env_or_skip` function. These variables are utilized in the `save_table` method, which is now invoked twice with the same table, once with the `append` and once with the `overwrite` option. The data in the table is retrieved and checked for accuracy after each call, employing the updated `Row` class with revised field names `first` and "second", formerly `name` and "id". This modification ensures the proper operation of the `overwrite` feature during integration tests and resolves any related issues. The commit message `Fixing overwrite integration tests` signifies this change.
0.4.0
... (truncated)
- `619ff0a` Release v0.5.0 (#207)
- `4990ce1` Ensure propagation of `lsql` version into `User-Agent` header when it is used...
- `56e7f70` Add dashboard-as-code functionality (#201)
- `f1bbf54` Automate opening integration test dashboard in debug mode (#167)
- `a79d40f` Set widget id in query header (#154)
- `8824273` Bump actions/checkout from 4.1.6 to 4.1.7 (#151)
- `165594d` Use order key in query header if defined (#149)
- `40e46e1` Use widget width and height defined in query header (#147)
- `2a94673` Fix parsing message (#146)
- `1575f9f` Read display name from `dashboard.yml` (#144)

Sourced from sqlglot's changelog.
[v25.5.0] - 2024-07-04
:boom: BREAKING CHANGES
- due to `8335ba1` - preserve EXTRACT(date_part FROM datetime) calls (PR #3729 by @georgesittas): preserve EXTRACT(date_part FROM datetime) calls (#3729)
- due to `fb066a6` - Decouple NVL() from COALESCE() (PR #3734 by @VaggelisD): Decouple NVL() from COALESCE() (#3734)

:sparkles: New Features
- `0c03299` - teradata: random lower upper closes #3721 (commit by @tobymao)
- `37b6e2d` - snowflake: add support for VECTOR(type, size) (PR #3724 by @georgesittas)
- `1e07c4d` - presto, trino: Configurable transpilation of Snowflake VARIANT (PR #3725 by @VaggelisD)
- `e5a53aa` - snowflake: Support for FROM CHANGES (PR #3731 by @VaggelisD)
  - :arrow_lower_right: addresses issue #3730 opened by @achicoine-coveo
- `820d664` - presto: wrap md5 string arguments in to_utf8 (PR #3732 by @georgesittas)
  - :arrow_lower_right: addresses issue #2855 opened by @MikeWallis42
- `912bc84` - spark, databricks: Support view schema binding options (PR #3739 by @VaggelisD)

:bug: Bug Fixes
- `3454f86` - teradata: use timestamp with time zone over timestamptz (PR #3723 by @mtagle)
- `f4a2872` - clickhouse: switch off table alias columns generation (PR #3727 by @georgesittas)
- `8335ba1` - clickhouse: preserve EXTRACT(date_part FROM datetime) calls (PR #3729 by @georgesittas)
- `fb066a6` - oracle: Decouple NVL() from COALESCE() (PR #3734 by @VaggelisD)
  - :arrow_lower_right: fixes issue #3733 opened by @Hal-H2Apps
- `c790c3b` - tsql: parse rhs of x::varchar(max) into a type (PR #3737 by @georgesittas)

:recycle: Refactors
- `84416d2` - teradata: clean up CurrentTimestamp generation logic (commit by @georgesittas)

[v25.4.1] - 2024-06-29
:bug: Bug Fixes
[v25.4.0] - 2024-06-28
:boom: BREAKING CHANGES
... (truncated)
- `912bc84` feat(spark, databricks): Support view schema binding options (#3739)
- `c790c3b` Fix(tsql): parse rhs of x::varchar(max) into a type (#3737)
- `820d664` Feat(presto): wrap md5 string arguments in to_utf8 (#3732)
- `fb066a6` fix(oracle)!: Decouple NVL() from COALESCE() (#3734)
- `e5a53aa` feat(snowflake): Support for FROM CHANGES (#3731)
- `1e07c4d` feat(presto, trino): Configurable transpilation of Snowflake VARIANT (#3725)
- `8335ba1` Fix(clickhouse)!: preserve EXTRACT(date_part FROM datetime) calls (#3729)
- `f4a2872` Fix(clickhouse): switch off table alias columns generation (#3727)
- `37b6e2d` Feat(snowflake): add support for VECTOR(type, size) (#3724)
- `84416d2` Refactor(teradata): clean up CurrentTimestamp generation logic