Recalculate checksums #2

Merged
merged 4 commits into from
Feb 9, 2024
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -7,7 +7,7 @@ jobs:

strategy:
matrix:
python-version: [3.7, 3.8, 3.9]
python-version: [3.8, 3.9]
github-runner: ['ubuntu-latest', 'windows-latest']

runs-on: ${{ matrix.github-runner }}
17 changes: 16 additions & 1 deletion CHANGELOG.md
@@ -3,6 +3,21 @@ All notable changes to this project will be documented in this file.

*The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).*

## [3.5.5] - 2024-02-09
### Changed
- Added `recalculate_checksum` subcommand, introducing recalculate checksum scripts

## [3.5.4] - 2024-02-07
### Changed
- Updated `snowflake-connector-python` dependency to use 3.7.0. This allows the use of OpenSSL 3.0
- Updated `PyYAML` to use 6.0.1
- Updated `pandas` to use 1.5.3
- Updated `Jinja2` to use 3.1.1

## [3.5.3] - 2023-02-18
### Changed
- Added `undo` subcommand, introducing Undo script

## [3.5.2] - 2023-02-14
### Changed
- Fixed bug (from the 3.5.0 release) that caused a crash when using verbose logging.
@@ -20,7 +35,7 @@ All notable changes to this project will be documented in this file.
- Cleaned up argument passing and other repetitive code using dictionary and set comparisons for easy maintenance. (Converted variable names to a consistent snake_case from a mix of kebab-case and snake_case)
- Fixed change history table processing to allow mixed case names when '"' are used in the name.
- Moved most error, log and warning messages and query strings to global or class variables.
- Updated readme to cover new authentication methods
@BondarenkoStas (Feb 9, 2024)

Please rebase from CBRO-1218-undo-migrations and merge this PR into it. To later merge that branch into master.
The top commit from the branch already removes trailing whitespaces and tabs 94deeb5.

It will be easier to review this PR if it has your changes only.
Also feels unnecessary to write a bunch of logic in recalculate_checksum_command only to split it in the next commit, why not to split it initially?

@riiwo (Author, Feb 9, 2024)

> why not to split it initially

One was feature, other was refactoring

Author

anyways, it is rebased. I'll still point to master. You can skip first 3 commits

thank you


## [3.4.2] - 2022-10-24
### Changed
95 changes: 86 additions & 9 deletions README.md
@@ -20,6 +20,7 @@ For the complete list of changes made to schemachange check out the [CHANGELOG](
1. [Folder Structure](#folder-structure)
1. [Change Scripts](#change-scripts)
1. [Versioned Script Naming](#versioned-script-naming)
1. [Undo Script Naming](#undo-script-naming)
1. [Repeatable Script Naming](#repeatable-script-naming)
1. [Always Script Naming](#always-script-naming)
1. [Script Requirements](#script-requirements)
@@ -95,6 +96,31 @@ For example, a script name that follows this convention is: `V1.1.1__first_chang

Every script within a database folder must have a unique version number. schemachange will check for duplicate version numbers and throw an error if it finds any. This helps to ensure that developers who are working in parallel don't accidentally (re-)use the same version number.
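
The duplicate check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expression and the helper name `find_duplicate_versions` are hypothetical and are not taken from schemachange's source.

```python
import re
from collections import defaultdict

# Assumed pattern for names such as V1.1.1__first_change.sql.
# The pattern schemachange actually uses may differ.
VERSIONED_NAME = re.compile(r"^V(?P<version>.+?)__(?P<description>.+)\.sql$")


def find_duplicate_versions(file_names):
    """Return {version: [file names]} for every version used more than once."""
    by_version = defaultdict(list)
    for name in file_names:
        match = VERSIONED_NAME.match(name)
        if match:
            by_version[match.group("version")].append(name)
    return {v: names for v, names in by_version.items() if len(names) > 1}


scripts = [
    "V1.1.1__first_change.sql",
    "V1.1.2__second_change.sql",
    "V1.1.1__conflicting_change.sql",
]
duplicates = find_duplicate_versions(scripts)
print(duplicates)
```

A caller could raise an error whenever the returned dictionary is non-empty, which mirrors the behaviour the paragraph describes.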

### Undo Script Naming

Undo scripts are the counterpart of versioned scripts. An undo script reverses the effects of the versioned
script with the same version and name. Undo scripts are optional; versioned migrations run without them.

An undo script must have the same version and name as the versioned script it reverses, with a `U` prefix
instead of a `V`. For example, the versioned script `V1.1.1__first_change.sql` can have the undo script
`U1.1.1__first_change.sql`.

By default, the `undo` subcommand undoes the last applied migration. To change this, pass a number to the `-s`
(`--step`) option. For example, `--step 3` undoes the last 3 applied versioned scripts, or stops earlier when it
reaches one without a corresponding undo script.

For example, given these scripts:

* U1.0.0__create_table.sql
* V1.0.0__create_table.sql
* V2.0.0__add_column.sql
* U3.0.0__change_name.sql
* V3.0.0__change_name.sql

Running `schemachange undo --step 3` applies only the `U3.0.0__change_name.sql` undo script, because
`V2.0.0__add_column.sql` doesn't have a corresponding undo script.
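
The selection rule in this example can be sketched in Python. This is a minimal illustration, not schemachange's implementation: the function name `select_undo_scripts`, the regular expression, and the assumption that applied versions arrive ordered oldest to newest are all hypothetical.

```python
import re

# Assumed pattern for V/U script names; the real one may differ.
SCRIPT_NAME = re.compile(r"^(?P<kind>[VU])(?P<version>.+?)__(?P<description>.+)\.sql$")


def select_undo_scripts(file_names, applied_versions, step=1):
    """Pick undo scripts for the last `step` applied versions, newest first.

    `applied_versions` is assumed to be ordered oldest to newest. Selection
    stops at the first version that has no undo script with the same
    version and name.
    """
    undo_scripts = {}
    descriptions = {}
    for name in file_names:
        match = SCRIPT_NAME.match(name)
        if not match:
            continue
        version = match.group("version")
        description = match.group("description")
        if match.group("kind") == "U":
            undo_scripts[(version, description)] = name
        else:
            descriptions[version] = description

    selected = []
    newest_first = list(reversed(applied_versions))
    for version in newest_first[:max(step, 0)]:
        undo_name = undo_scripts.get((version, descriptions.get(version)))
        if undo_name is None:
            break  # no matching undo script: stop here
        selected.append(undo_name)
    return selected


files = [
    "U1.0.0__create_table.sql",
    "V1.0.0__create_table.sql",
    "V2.0.0__add_column.sql",
    "U3.0.0__change_name.sql",
    "V3.0.0__change_name.sql",
]
selected = select_undo_scripts(files, ["1.0.0", "2.0.0", "3.0.0"], step=3)
print(selected)
```

With the five scripts listed above and all three versions applied, the sketch selects only `U3.0.0__change_name.sql`, matching the README example.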

### Repeatable Script Naming

Repeatable change scripts follow a similar naming convention to that used by [Flyway Versioned Migrations](https://flywaydb.org/documentation/concepts/migrations.html#repeatable-migrations). The script name must follow this pattern (image taken from [Flyway docs](https://flywaydb.org/documentation/concepts/migrations.html#repeatable-migrations)):
@@ -217,7 +243,7 @@ Default [Password](https://docs.snowflake.com/en/user-guide/python-connector-exa
[Browser based SSO](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#setting-up-browser-based-sso) | `externalbrowser`
[Programmatic SSO](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#native-sso-okta-only) (Okta Only) | Okta URL endpoint for your Okta account, typically in the form `https://<okta_account_name>.okta.com` OR `https://<okta_account_name>.oktapreview.com`

In the event both authentication criteria for the default authenticator are provided, schemachange will prioritize password authentication over key pair authentication.

### Password Authentication
The Snowflake user password for `SNOWFLAKE_USER` is required to be set in the environment variable `SNOWFLAKE_PASSWORD` prior to calling the script. schemachange will fail if the `SNOWFLAKE_PASSWORD` environment variable is not set.
@@ -234,20 +260,20 @@ The URL of the authenticator resource that will receive the POST request.
* token-response-name
The expected name of the JSON element containing the token in the response from the authenticator resource.
* token-request-payload
The set of variables passed as a dictionary to the `data` element of the request.
* token-request-headers
The set of variables passed as a dictionary to the `headers` element of the request.

It is recommended to use the YAML file and pass OAuth secrets into the configuration using the templating engine instead of the command line option.


### External Browser Authentication
External browser authentication can be used for local development by setting the environment variable `SNOWFLAKE_AUTHENTICATOR` to the value `externalbrowser` prior to calling schemachange.
The client will be prompted to authenticate in a browser that pops up. Refer to the [documentation](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#setting-up-browser-based-sso) to cache the token to minimize the number of times the browser pops up to authenticate the user.

### Okta Authentication
Clients that do not have a browser can use the popular SaaS IdP option to connect via Okta. This requires the Okta URL that you use for SSO.
Okta authentication can be used by setting the environment variable `SNOWFLAKE_AUTHENTICATOR` to the value of your Okta endpoint as a fully formed URL (e.g. `https://<org_name>.okta.com`) prior to calling schemachange.

_**NOTE**: Please disable Okta MFA for the user who uses Native SSO authentication with client drivers. Please consult your Okta administrator for more information._

@@ -322,14 +348,14 @@ dry-run: false
# A string to include in the QUERY_TAG that is attached to every SQL statement executed
query-tag: 'QUERY_TAG'

# Information for OAuth token requests
oauthconfig:
# URL where token requests are posted to
token-provider-url: 'https://login.microsoftonline.com/{{ env_var('AZURE_ORG_GUID', 'default') }}/oauth2/v2.0/token'
# Name of the JSON entity returned by the request
token-response-name: 'access_token'
# Headers needed for a successful POST or other security markings (multiple labeled items permitted)
token-request-headers:
Content-Type: "application/x-www-form-urlencoded"
User-Agent: "python/schemachange"
# Request Payload for Token (it is recommended pass
@@ -386,6 +412,57 @@ Parameter | Description
--query-tag | A string to include in the QUERY_TAG that is attached to every SQL statement executed.
--oauth-config | Define values for the variables to make OAuth token requests (e.g. '{"token-provider-url": "https://...", "token-request-payload": {"client_id": "GUID_xyz",...},... }')

#### undo
This subcommand is used to undo versioned scripts. It supports the same parameters as `deploy`, with the addition of
`--step` and with the exception of `--create-change-history-table`.

`usage: schemachange undo [-h] [--config-folder CONFIG_FOLDER] [-f ROOT_FOLDER] [-m MODULES_FOLDER] [-a SNOWFLAKE_ACCOUNT] [-u SNOWFLAKE_USER] [-r SNOWFLAKE_ROLE] [-w SNOWFLAKE_WAREHOUSE] [-d SNOWFLAKE_DATABASE] [-c CHANGE_HISTORY_TABLE] [--vars VARS] [-ac] [-v] [--dry-run] [--query-tag QUERY_TAG]`

Parameter | Description
--- | ---
-h, --help | Show the help message and exit
-s, --step | The number of versioned scripts to undo
--config-folder CONFIG_FOLDER | The folder to look in for the schemachange-config.yml file (the default is the current working directory)
-f ROOT_FOLDER, --root-folder ROOT_FOLDER | The root folder for the database change scripts. The default is the current directory.
-m MODULES_FOLDER, --modules-folder MODULES_FOLDER | The modules folder for jinja macros and templates to be used across multiple scripts
-a SNOWFLAKE_ACCOUNT, --snowflake-account SNOWFLAKE_ACCOUNT | The name of the snowflake account (e.g. xy12345.east-us-2.azure).
-u SNOWFLAKE_USER, --snowflake-user SNOWFLAKE_USER | The name of the snowflake user
-r SNOWFLAKE_ROLE, --snowflake-role SNOWFLAKE_ROLE | The name of the role to use
-w SNOWFLAKE_WAREHOUSE, --snowflake-warehouse SNOWFLAKE_WAREHOUSE | The name of the default warehouse to use. Can be overridden in the change scripts.
-d SNOWFLAKE_DATABASE, --snowflake-database SNOWFLAKE_DATABASE | The name of the default database to use. Can be overridden in the change scripts.
-c CHANGE_HISTORY_TABLE, --change-history-table CHANGE_HISTORY_TABLE | Used to override the default name of the change history table (which is METADATA.SCHEMACHANGE.CHANGE_HISTORY)
--vars VARS | Define values for the variables to be replaced in change scripts, given in JSON format (e.g. '{"variable1": "value1", "variable2": "value2"}')
-ac, --autocommit | Enable autocommit feature for DML commands. The default is 'False'.
-v, --verbose | Display verbose debugging details during execution. The default is 'False'.
--dry-run | Run schemachange in dry run mode. The default is 'False'.
--query-tag | A string to include in the QUERY_TAG that is attached to every SQL statement executed.
--oauth-config | Define values for the variables to make OAuth token requests (e.g. '{"token-provider-url": "https://...", "token-request-payload": {"client_id": "GUID_xyz",...},... }')

#### recalculate_checksum
This subcommand is used to recalculate the checksums of repeatable migrations. It is useful when cloning a database, to ensure that you don't need to rerun repeatable migrations.

`usage: schemachange recalculate_checksum [-h] [--config-folder CONFIG_FOLDER] [-f ROOT_FOLDER] [-m MODULES_FOLDER] [-a SNOWFLAKE_ACCOUNT] [-u SNOWFLAKE_USER] [-r SNOWFLAKE_ROLE] [-w SNOWFLAKE_WAREHOUSE] [-d SNOWFLAKE_DATABASE] [-c CHANGE_HISTORY_TABLE] [--vars VARS] [--create-change-history-table] [-ac] [-v] [--dry-run] [--query-tag QUERY_TAG]`

Parameter | Description
--- | ---
-h, --help | Show the help message and exit
--config-folder CONFIG_FOLDER | The folder to look in for the schemachange-config.yml file (the default is the current working directory)
-f ROOT_FOLDER, --root-folder ROOT_FOLDER | The root folder for the database change scripts. The default is the current directory.
-m MODULES_FOLDER, --modules-folder MODULES_FOLDER | The modules folder for jinja macros and templates to be used across multiple scripts
-a SNOWFLAKE_ACCOUNT, --snowflake-account SNOWFLAKE_ACCOUNT | The name of the snowflake account (e.g. xy12345.east-us-2.azure).
-u SNOWFLAKE_USER, --snowflake-user SNOWFLAKE_USER | The name of the snowflake user
-r SNOWFLAKE_ROLE, --snowflake-role SNOWFLAKE_ROLE | The name of the role to use
-w SNOWFLAKE_WAREHOUSE, --snowflake-warehouse SNOWFLAKE_WAREHOUSE | The name of the default warehouse to use. Can be overridden in the change scripts.
-d SNOWFLAKE_DATABASE, --snowflake-database SNOWFLAKE_DATABASE | The name of the default database to use. Can be overridden in the change scripts.
-c CHANGE_HISTORY_TABLE, --change-history-table CHANGE_HISTORY_TABLE | Used to override the default name of the change history table (which is METADATA.SCHEMACHANGE.CHANGE_HISTORY)
--vars VARS | Define values for the variables to be replaced in change scripts, given in JSON format (e.g. '{"variable1": "value1", "variable2": "value2"}')
--create-change-history-table | Create the change history table if it does not exist. The default is 'False'.
-ac, --autocommit | Enable autocommit feature for DML commands. The default is 'False'.
-v, --verbose | Display verbose debugging details during execution. The default is 'False'.
--dry-run | Run schemachange in dry run mode. The default is 'False'.
--query-tag | A string to include in the QUERY_TAG that is attached to every SQL statement executed.
--oauth-config | Define values for the variables to make OAuth token requests (e.g. '{"token-provider-url": "https://...", "token-request-payload": {"client_id": "GUID_xyz",...},... }')
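
The diff shown here does not include the implementation, so the following is only a rough sketch of what recalculating checksums could look like. It assumes a checksum is a SHA-224 hash of the script content and that recorded checksums are available as a plain dictionary; both assumptions, and the names `script_checksum` and `stale_checksums`, are hypothetical.

```python
import hashlib


def script_checksum(content: str) -> str:
    """Hash script content. SHA-224 is an assumption made for illustration."""
    return hashlib.sha224(content.encode("utf-8")).hexdigest()


def stale_checksums(scripts, recorded):
    """Return {script name: new checksum} where the recorded value is missing or differs.

    `scripts` maps repeatable script names to their content and `recorded`
    maps script names to the checksums stored in the change history.
    """
    updates = {}
    for name, content in scripts.items():
        new_checksum = script_checksum(content)
        if recorded.get(name) != new_checksum:
            updates[name] = new_checksum
    return updates


scripts = {"R__views.sql": "CREATE OR REPLACE VIEW V AS SELECT 1;"}
recorded = {"R__views.sql": "checksum-from-the-source-database"}
updates = stale_checksums(scripts, recorded)
print(sorted(updates))
```

Once the recorded values are replaced by the recalculated ones, a second pass finds nothing to update, which is the state in which repeatable migrations would not be rerun.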

#### render
This subcommand is used to render a single script to the console. It is intended to support the development and troubleshooting of scripts that use features from the jinja template engine.

6 changes: 6 additions & 0 deletions demo/citibike/U1.2__load_tables_from_s3.sql
@@ -0,0 +1,6 @@
-- Set the database and schema context
USE SCHEMA SCHEMACHANGE_DEMO.PUBLIC;

TRUNCATE TABLE TRIPS;

TRUNCATE TABLE WEATHER;
8 changes: 4 additions & 4 deletions requirements.txt
@@ -1,4 +1,4 @@
Jinja2==3.0.0
pandas==1.3.0
PyYAML==5.4
snowflake-connector-python==2.8.0
Jinja2~=3.0
pandas~=1.3
PyYAML~=6.0
snowflake-connector-python>=2.8,<4.0