Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrator: allow removing settings in migrations #644

Merged
merged 4 commits into from
Feb 1, 2020

Conversation

tjkirch
Copy link
Contributor

@tjkirch tjkirch commented Jan 13, 2020

commit e641511f2274618e1c7390817679fb8857538694
Author: Tom Kirchner <[email protected]>
Date:   Tue Jan 21 13:04:14 2020 -0800

migration-helpers: back-compat: remove datastore keys removed in migration

The old migrator interface used a single data store as source and target.
After a migration changes the MigrationData, the resulting values are written
out to the data store.  Since this data store also served as the source, it had
all of the old keys on disk already, and nothing explicitly removed them; we'd
only see new keys overwrite old.

To allow downgrading to versions with the old migrator interface, we need
migrations to explicitly remove keys from the data store.  This change to
migration-helpers detects keys that were removed from MigrationData by a
migration and tells the data store to explicitly remove them.  As long as
migrations are rebuilt against this library, they'll work in the old or new
migrator.

This workaround can be removed when we no longer support versions with the old
migrator interface.
commit 30271c76ad8026619fc1163cbb858f7883bb4319
Author: Tom Kirchner <[email protected]>
Date:   Tue Jan 21 12:54:50 2020 -0800

apiserver: add DataStore methods to unset keys and metadata

We've gotten by without this because settings are effectively all required, but
there are now some advanced cases (migrator interface change) that require
removing keys from an existing data store.  This probably shouldn't be exposed
in a higher-level API, though, unless we account for optional settings.
commit 145c766f623345448ea9d6523b2552069ad46086
Author: Tom Kirchner <[email protected]>
Date:   Tue Jan 21 12:53:46 2020 -0800

apiserver: make MemoryDataStore public

This allows for DataStore-based testing in other crates.
commit d3826cf1bec9b5989a12218985b4a7d1b49eaa7f
Author: Tom Kirchner <[email protected]>
Date:   Mon Jan 13 11:34:03 2020 -0800

migrator: allow removing settings in migrations

Migrations weren't able to remove settings because we started with a copy of
the existing data store and wrote out any changes key-by-key, meaning a
"removal" would just leave the old copied key in place on disk.

With this change, we no longer copy the data store, we instead give migrations
the source data store and the target data store, and (via migration-helpers)
the input data is read from the source instead of a copy.  This is more
efficient because it avoids a needless copy, and allows the migration to

Builds on #638.
Fixes #641.

Testing done:

See #643 for testing of the first commit.

The following three commits include unit tests, which pass. In addition:

  • I rebuilt our migrations with the new migration-helpers.
  • I checked out an older source tree without any of these changes, i.e. it had the old migrator interface.
  • I ran the migrations through and saw them delete a setting, which they were not able to do before:

Migrate up to 0.2

$ cargo run -- --log-level trace --datastore-path datastore-sample/current --migration-directories ./migration-binaries --migrate-to-version 0.2
...
22:09:14 [ INFO] Flipping /home/ANT.AMAZON.COM/tjk/work/thar/workspaces/api/migration/migrator/datastore-sample/v0 to point to v0.2

Other, normal migration worked:

$ cat datastore-sample/current/live/configuration-files/containerd-config-toml/template-path; echo
"/usr/share/templates/containerd-config-toml_aws-k8s"

Add region, which would be generated by #638:

$ cat datastore-sample/current/live/settings/aws/region; echo
"us-west-2"

Migrate, which should remove it:

$ cargo run -- --log-level trace --datastore-path datastore-sample/current --migration-directories ./migration-binaries --migrate-to-version 0.1
...
22:09:47 [DEBUG] (1) migrator: Sorted migrations: ["migrate_v0.2_remove-region", "migrate_v0.2_containerd-config-path"]
22:09:47 [ INFO] Copying datastore from /home/ANT.AMAZON.COM/tjk/work/thar/workspaces/api/migration/migrator/datastore-sample/v0.2_BZxbOtFAUIrBnFCG to work location /home/ANT.AMAZON.COM/tjk/work/thar/workspaces/api/migration/migrator/datastore-sample/v0.1_lQ1OHfeqTXDNKqea
22:09:47 [ INFO] Running migration command: "./migration-binaries/migrate_v0.2_remove-region" "--backward" "--datastore-path" "/home/ANT.AMAZON.COM/tjk/work/thar/workspaces/api/migration/migrator/datastore-sample/v0.1_lQ1OHfeqTXDNKqea"
22:09:47 [DEBUG] (1) migrator: Migration stdout: Removed settings.aws.region, which was set to '"us-west-2"'
Found no settings.aws.region to remove

22:09:47 [DEBUG] (1) migrator: Migration stderr: 
22:09:47 [ INFO] Running migration command: "./migration-binaries/migrate_v0.2_containerd-config-path" "--backward" "--datastore-path" "/home/ANT.AMAZON.COM/tjk/work/thar/workspaces/api/migration/migrator/datastore-sample/v0.1_lQ1OHfeqTXDNKqea"
22:09:47 [DEBUG] (1) migrator: Migration stdout: 
22:09:47 [DEBUG] (1) migrator: Migration stderr: 
22:09:47 [ INFO] Flipping /home/ANT.AMAZON.COM/tjk/work/thar/workspaces/api/migration/migrator/datastore-sample/v0.1 to point to v0.1_lQ1OHfeqTXDNKqea
22:09:47 [ INFO] Flipping /home/ANT.AMAZON.COM/tjk/work/thar/workspaces/api/migration/migrator/datastore-sample/v0 to point to v0.1

Other, normal migration still worked:

$ cat datastore-sample/current/live/configuration-files/containerd-config-toml/template-path; echo
"/usr/share/templates/containerd-config-toml"

Setting is now gone, which (I double checked) doesn't work with a migration built with older migration-helpers:

$ cat datastore-sample/current/live/settings/aws/region; echo
cat: datastore-sample/current/live/settings/aws/region: No such file or directory

@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 13, 2020

Note: I updated the commit message, removing the following paragraph:

Migrations using migration-helpers (all current migrations) do not need any
changes to their source code, since migration-helpers handles the new
arguments.  However, they do need to be recompiled.

This was a bit misleading. The problem is old migrations being run through the new migrator. This (#643) is the only migration since our last release, so the problem would be users upgrading through more than one release. For example, someone updating from 0.1.5 to 0.2.1 (without any changes on our part) would attempt to run the migrations for 0.1.5 to 0.1.6 and 0.1.6 to 0.2.0, which were compiled without this new syntax, and then this migration from 0.2.0 to 0.2.1, which was compiled with this new syntax. That wouldn't work, because the older migrations would fail and the upgrade would halt.

We can:

  • take advantage of the upgrade manifest's capability to list distinct upgrade paths, so that users coming from a version before 0.2.0 get recompiled versions of the migrations that work on 0.2.1
  • say that users running versions before 0.2.0 have to upgrade through 0.2.0 first.

@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 14, 2020

take advantage of the upgrade manifest's capability to list distinct upgrade paths, so that users coming from a version before 0.2.0 get recompiled versions of the migrations that work on 0.2.1

This is the approach we chose as a team. I'll be working on modifying our existing migrations to support both migrator interfaces, as a workaround for the interface change. (Future migrations won't require this.)

I'd still like to get this approved so we're otherwise ready to go.

etungsten
etungsten previously approved these changes Jan 14, 2020
@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 21, 2020

This push adds commits that allow migrations that remove settings to work even when downgrading to OS versions with the old migration interface, as long as we compile the migrations with the updated migration-helpers library.

It's best to look at the individual commits to understand the changes - see the updated description with the commit messages.

(The only change to the single existing commit was a missed cargo fmt.)

@tjkirch tjkirch requested a review from bcressey January 21, 2020 22:30
@tjkirch tjkirch dismissed etungsten’s stale review January 21, 2020 22:30

Major changes, requesting re-review

@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 23, 2020

This push just rebases on the changes from #638.

@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 28, 2020

This push is just a rebase onto the changes from #637.

@tjkirch tjkirch requested review from bcressey and zmrow and removed request for bcressey and zmrow January 28, 2020 17:31
@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 28, 2020

This push just rebases on a typo fix from #636.

@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 29, 2020

This push is just a rebase onto changes from #637.

Copy link
Contributor

@zmrow zmrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️

@tjkirch
Copy link
Contributor Author

tjkirch commented Jan 31, 2020

This push addresses @bcressey's concerns by (1) removing a comment and (2) deleting empty directories after removing keys.

(2) was a bit hard to test because it can't happen in our system, but I pulled that function out into a separate binary and confirmed it works as documented.

@tjkirch
Copy link
Contributor Author

tjkirch commented Feb 1, 2020

This push just fixes a typo in a comment that bcressey mentioned offline.

@tjkirch tjkirch changed the base branch from parameterize-region to develop February 1, 2020 00:24
Migrations weren't able to remove settings because we started with a copy of
the existing data store and wrote out any changes key-by-key, meaning a
"removal" would just leave the old copied key in place on disk.

With this change, we no longer copy the data store, we instead give migrations
the source data store and the target data store, and (via migration-helpers)
the input data is read from the source instead of a copy.  This is more
efficient because it avoids a needless copy, and allows the migration to
determine exactly what's left after it runs.
This allows for DataStore-based testing in other crates.
We've gotten by without this because settings are effectively all required, but
there are now some advanced cases (migrator interface change) that require
removing keys from an existing data store.  This probably shouldn't be exposed
in a higher-level API, though, unless we account for optional settings.
…ation

The old migrator interface used a single data store as source and target.
After a migration changes the MigrationData, the resulting values are written
out to the data store.  Since this data store also served as the source, it had
all of the old keys on disk already, and nothing explicitly removed them; we'd
only see new keys overwrite old.

To allow downgrading to versions with the old migrator interface, we need
migrations to explicitly remove keys from the data store.  This change to
migration-helpers detects keys that were removed from MigrationData by a
migration and tells the data store to explicitly remove them.  As long as
migrations are rebuilt against this library, they'll work in the old or new
migrator.

This workaround can be removed when we no longer support versions with the old
migrator interface.
@tjkirch
Copy link
Contributor Author

tjkirch commented Feb 1, 2020

Just a rebase on develop to get the fix from #637, no changes.

@tjkirch tjkirch merged commit 3ae9532 into develop Feb 1, 2020
@tjkirch tjkirch deleted the migrator-remove-setting branch February 1, 2020 00:36
tjkirch added a commit that referenced this pull request Feb 21, 2020
We found confusion between data store versions and OS versions, so this
simplifies the system to only use OS version.  If there's been no change in the
data store, which is now determined by there being no migrations, then the new
data store will simply be a link to the last version that did have changes.

These are the changes at a high level:
* Remove the data_store_version library and its type, instead using semver::Version
  * Cascade this change through migrator, storewolf, update_metadata, and updog
* Remove OS version -> data store version mapping in updog
* Remove the data-store-version file, instead using OS version from bottlerocket-release
* In migrator, just symlink to previous data store if there are no migrations
* In migrator, flip symlinks for current, major, minor, and patch, since any could change
* In storewolf, also create the new patch-level symlink

One change deserves more description.  It's an extension of the problem from #644
that only showed up while testing this change, and was made easier to fix by
these changes, so it was done together.

Background: In migrator, we were giving each migration the same source and
target data store.  Migrations make whatever changes they want to a
MigrationData, each key of which is then written to the data store.

Problem: Let's say the system has settings A and B.  One migration adds setting
C.  This is written to the target data store.  Another migration tries to
remove setting B.  It does so in memory, but B has already been written to the
target by the first migration, so it's to no avail.

Solution: Chain the input and output of migrations in order.  The first
migration receives the source data store and adds setting C, writing out to a
*new* data store path it was given.  The second migration is given that new
data store as a source; it removes setting B and writes out to another new data
store path it was given.  At the end of the chain, the final data store is kept
and the intermediate ones are removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrations can't remove settings
4 participants