Add Databricks SQL Warehouse Support to Golang Migrate #1167
base: master
Conversation
```go
var (
	multiStmtDelimiter = []byte(";")

	DefaultMigrationsTable = "schema_migrations"
```
As in my notes, we could and probably should point this at `catalog_name.schema_name.schema_migrations` before merging.
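A minimal sketch of that suggestion, assuming the `hive_metastore` catalog and `default` schema mentioned under Known Issues:

```go
var (
	// Hypothetical fully qualified default (catalog.schema.table); the catalog
	// and schema names are assumptions taken from the Known Issues section,
	// not what the PR currently ships.
	DefaultMigrationsTable = "hive_metastore.default.schema_migrations"
)
```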
```go
return database.CasRestoreOnErr(&d.isLocked, false, true, database.ErrLocked, func() error {
	// Databricks SQL Warehouse does not support locking
	// Placeholder for actual lock code
	return nil
```
The SQL Warehouse might support it, but the database driver hasn't implemented it.
```sql
CREATE EXTERNAL TABLE IF NOT EXISTS `dog-park-db`.default.cat_naps (
    nap_id        STRING    NOT NULL, -- id of the nap
    nap_location  STRING    NOT NULL, -- location where the nap took place
    checkpoint_id LONG      NOT NULL, -- ID given to the batch per checkpoint, assigned to many process runs
    batch_id      STRING    NOT NULL, -- ID given to each independent batch
    recorded_at   TIMESTAMP NOT NULL  -- timestamp indicating when the nap was recorded
) LOCATION 's3://dog-park-db-tables/cat_naps';
```
Just wrote a migration that does this; we should add one:

```sql
ALTER TABLE `dog-park-db`.default.cat_naps
ADD COLUMNS (
    md5 STRING COMMENT 'MD5 checksum of the file content'
);
```
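golang-migrate pairs each up migration with a down migration, so a matching reverse step might look like the sketch below; note the assumption that the table has Delta column mapping enabled, which `DROP COLUMN` requires:

```sql
-- Hypothetical down migration for the ADD COLUMNS step above.
-- DROP COLUMN on a Delta table requires column mapping to be enabled.
ALTER TABLE `dog-park-db`.default.cat_naps
DROP COLUMN md5;
```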
Currently, Databricks does not offer a built-in tool for deterministic schema migrations between Delta table schemas. While schema evolution tools are available for managing changes in Delta Lake, they do not provide a controlled, additive approach to schema modifications. Given the need for precise schema management when transforming unstructured data into highly structured data within Delta Lake, a more controlled migration strategy is essential.
This PR introduces support for Databricks SQL Warehouse. This enhancement allows for precise and controlled schema management through Unity Catalog, facilitating seamless integration with both internal and external tables, such as Delta Lake or Iceberg tables. If you plan to use this, please review the Known Issues section, as there are some quirks in the implementation that need to be addressed.
Implementation Details:
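For orientation, this is the Driver contract from golang-migrate's database package (github.com/golang-migrate/migrate/v4/database) that the new Databricks SQL Warehouse driver implements; the inline comments summarize each method's role in general terms, not this PR's exact code.

```go
package database

import "io"

// Driver is the interface every golang-migrate database backend must satisfy.
type Driver interface {
	Open(url string) (Driver, error)               // parse the connection URL and return a ready driver
	Close() error                                  // release the underlying connection
	Lock() error                                   // guard against concurrent migrations (currently a placeholder, see review thread)
	Unlock() error                                 // release the migration lock
	Run(migration io.Reader) error                 // execute the statements of a single migration
	SetVersion(version int, dirty bool) error      // persist the current version to the migrations table
	Version() (version int, dirty bool, err error) // read the current version from the migrations table
	Drop() error                                   // drop everything in the target schema
}
```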
Usage:
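A minimal sketch of running the migrations from Go, assuming a file:// source of .sql migrations; the driver import path and the databricks:// URL scheme, host, and warehouse path are illustrative assumptions, not necessarily what this PR registers.

```go
package main

import (
	"errors"
	"log"

	"github.com/golang-migrate/migrate/v4"
	// Hypothetical import path for this PR's driver; the blank import registers
	// its URL scheme with migrate.New.
	_ "github.com/golang-migrate/migrate/v4/database/databricks"
	_ "github.com/golang-migrate/migrate/v4/source/file"
)

func main() {
	// The databricks:// scheme, host, and warehouse path below are assumptions
	// for illustration; consult the driver's documentation for the real format.
	m, err := migrate.New(
		"file://./migrations",
		"databricks://token:<personal-access-token>@<workspace-host>:443/sql/1.0/warehouses/<warehouse-id>",
	)
	if err != nil {
		log.Fatal(err)
	}
	if err := m.Up(); err != nil && !errors.Is(err, migrate.ErrNoChange) {
		log.Fatal(err)
	}
}
```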
Known Issues:
This implementation was developed quickly to address immediate needs for a controlled schema migration process. It may not handle all edge cases perfectly. The primary challenges include:
- The `hive_metastore` catalog and its default schema must exist in Unity Catalog (UC), as the migrations table will be stored there. This may be configurable in future versions.

Disclaimer: The author accepts no responsibility for any damage to personal or business systems, databases, networks, device drivers, or any other components resulting from the use of this driver.