-
Notifications
You must be signed in to change notification settings - Fork 4k
sql/schemachanger: support complex ALTER COLUMN … SET TYPE in DSC #133047
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/schemachanger: support complex ALTER COLUMN … SET TYPE in DSC #133047
Conversation
|
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
bb18afc to
e794b2c
Compare
rharding6373
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this largely doesn't affect CDC I'm going to remove myself as a reviewer. Left one nit.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @spilchen)
pkg/ccl/changefeedccl/cdcevent/event_test.go line 601 at r1 (raw file):
// Use alter column type to force column reordering. sqlDB.Exec(t, `SET enable_experimental_alter_column_type_general = true`) // TODO(spilchen): force the legacy schema changer. When run with the DSC,
nit: use TODO(#133040)
e794b2c to
83ec1a4
Compare
annrpom
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is exciting! nice work; i just left some questions for my understanding
also:
The process includes performing a double backfill for the primary key (PK): first, both the old and new columns are included in the PK
could you elaborate why we need to include these columns in the PK?
let me ask the question about deprules in our -eng channel and then i'll stamp
| // Add the new column. It will be identical to the column it is replacing, | ||
| // except the type will differ, and it will have a transient computed expression. | ||
| // This expression will reference the original column to facilitate the backfill. | ||
| spec := addColumnSpec{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but the old column is still there until the first backfill what do you think if we say "set target status of old column to drop" and then in the new column text perhaps something like "after this backfill, the old column actually gets dropped"
i do think this is implied with your explanation further down so perhaps this is just a nit
| ColumnID: this.ColumnID, | ||
| Name: tabledesc.ColumnNamePlaceholder(this.ColumnID), | ||
| } | ||
| // If a name was provided for the transition to absent, override the placeholder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for my understanding: is this just setting the new column to the column column's name?
| } | ||
| }, | ||
| ) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i guess i had the same question about deprules as you did now :-); do we want to preserve 24.3's rules and then have this be in "current"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, complex column type changes that required a backfill reverted to the legacy schema changer. This update now enables support for such changes in the declarative schema changer (DSC). As with the legacy process, a complex type change involves dropping the column being modified and adding a new one with the updated type. A temporary compute expression is used to map the old column to the new one for backfilling. The process includes performing a double backfill for the primary key (PK): first, both the old and new columns are included in the PK, and then a second backfill is done after the old column is removed. This staged approach is required due to a rule that prevents dropping the column’s compute expression before the dependent column is removed, keeping the expression until after the first backfill. Since the old and new columns coexist temporarily, the old column is renamed to avoid ambiguity in the PK. A transient name is assigned to the dropped column (e.g., <col>_shadow), which is automatically cleaned up later. This ensures clear differentiation during the transition. A new field was also added to ColumnName to restore the original name if the operation is undone, avoiding the use of a placeholder. However, there are a few limitations with the current implementation: - No support for columns that already have a compute expression - No support for columns with ON UPDATE or DEFAULT expressions - No support for conversions to enum types - While column order is preserved in the catalog for SELECT * queries, the order within the column family is not maintained, with the new column being added at the end of the internal ordering. Some tests, designed for the legacy schema changer, couldn’t easily be adapted for the DSC. These limitations are planned to be addressed in future changes. Epic: CRDB-40232 Closes cockroachdb#127014 Release notes: none
83ec1a4 to
ed20701
Compare
rafiss
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! nice work
Reviewed 43 of 48 files at r1, 1 of 2 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @spilchen)
|
TFTR! bors r+ |
Previously, complex column type changes that required a backfill reverted to the legacy schema changer. This update now enables support for such changes in the declarative schema changer (DSC).
As with the legacy process, a complex type change involves dropping the column being modified and adding a new one with the updated type. A temporary compute expression is used to map the old column to the new one for backfilling. The process includes performing a double backfill for the primary key (PK): first, both the old and new columns are included in the PK, and then a second backfill is done after the old column is removed. This staged approach is required due to a rule that prevents dropping the column’s compute expression before the dependent column is removed, keeping the expression until after the first backfill.
Since the old and new columns coexist temporarily, the old column is renamed to avoid ambiguity in the PK. A transient name is assigned to the dropped column (e.g., _shadow), which is automatically cleaned up later. This ensures clear differentiation during the transition. A new field was also added to ColumnName to restore the original name if the operation is undone, avoiding the use of a placeholder.
However, there are a few limitations with the current implementation:
These limitations are planned to be addressed in future changes.
Epic: CRDB-40232
Closes #127014
Release notes: none