Skip to content

Conversation

@spilchen
Copy link
Contributor

@spilchen spilchen commented Oct 21, 2024

Previously, complex column type changes that required a backfill reverted to the legacy schema changer. This update now enables support for such changes in the declarative schema changer (DSC).

As with the legacy process, a complex type change involves dropping the column being modified and adding a new one with the updated type. A temporary compute expression is used to map the old column to the new one for backfilling. The process includes performing a double backfill for the primary key (PK): first, both the old and new columns are included in the PK, and then a second backfill is done after the old column is removed. This staged approach is required due to a rule that prevents dropping the column’s compute expression before the dependent column is removed, keeping the expression until after the first backfill.

Since the old and new columns coexist temporarily, the old column is renamed to avoid ambiguity in the PK. A transient name is assigned to the dropped column (e.g., _shadow), which is automatically cleaned up later. This ensures clear differentiation during the transition. A new field was also added to ColumnName to restore the original name if the operation is undone, avoiding the use of a placeholder.

However, there are a few limitations with the current implementation:

  • No support for columns that already have a compute expression
  • No support for columns with ON UPDATE or DEFAULT expressions
  • No support for conversions to enum types
  • While column order is preserved in the catalog for SELECT * queries, the order within the column family is not maintained, with the new column being added at the end of the internal ordering.
  • Some tests, designed for the legacy schema changer, couldn’t easily be adapted for the DSC.

These limitations are planned to be addressed in future changes.

Epic: CRDB-40232
Closes #127014
Release notes: none

@spilchen spilchen self-assigned this Oct 21, 2024
@spilchen spilchen requested review from a team as code owners October 21, 2024 12:21
@spilchen spilchen requested review from rharding6373 and removed request for a team October 21, 2024 12:21
@blathers-crl
Copy link

blathers-crl bot commented Oct 21, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@spilchen spilchen force-pushed the issue-127014/general-conv-rev branch from bb18afc to e794b2c Compare October 21, 2024 13:09
Copy link
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this largely doesn't affect CDC I'm going to remove myself as a reviewer. Left one nit.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @spilchen)


pkg/ccl/changefeedccl/cdcevent/event_test.go line 601 at r1 (raw file):

	// Use alter column type to force column reordering.
	sqlDB.Exec(t, `SET enable_experimental_alter_column_type_general = true`)
	// TODO(spilchen): force the legacy schema changer. When run with the DSC,

nit: use TODO(#133040)

@spilchen spilchen force-pushed the issue-127014/general-conv-rev branch from e794b2c to 83ec1a4 Compare October 22, 2024 13:29
Copy link
Contributor

@annrpom annrpom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is exciting! nice work; i just left some questions for my understanding

also:

The process includes performing a double backfill for the primary key (PK): first, both the old and new columns are included in the PK

could you elaborate why we need to include these columns in the PK?

let me ask the question about deprules in our -eng channel and then i'll stamp

// Add the new column. It will be identical to the column it is replacing,
// except the type will differ, and it will have a transient computed expression.
// This expression will reference the original column to facilitate the backfill.
spec := addColumnSpec{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the old column is still there until the first backfill what do you think if we say "set target status of old column to drop" and then in the new column text perhaps something like "after this backfill, the old column actually gets dropped"

i do think this is implied with your explanation further down so perhaps this is just a nit

ColumnID: this.ColumnID,
Name: tabledesc.ColumnNamePlaceholder(this.ColumnID),
}
// If a name was provided for the transition to absent, override the placeholder.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for my understanding: is this just setting the new column to the column column's name?

}
},
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i guess i had the same question about deprules as you did now :-); do we want to preserve 24.3's rules and then have this be in "current"?

Copy link
Contributor

@annrpom annrpom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

also: bazel extended ci failures seem related

Previously, complex column type changes that required a backfill reverted to
the legacy schema changer. This update now enables support for such changes in
the declarative schema changer (DSC).

As with the legacy process, a complex type change involves dropping the column
being modified and adding a new one with the updated type. A temporary compute
expression is used to map the old column to the new one for backfilling. The
process includes performing a double backfill for the primary key (PK): first,
both the old and new columns are included in the PK, and then a second backfill
is done after the old column is removed. This staged approach is required due
to a rule that prevents dropping the column’s compute expression before the
dependent column is removed, keeping the expression until after the first
backfill.

Since the old and new columns coexist temporarily, the old column is renamed to
avoid ambiguity in the PK. A transient name is assigned to the dropped column
(e.g., <col>_shadow), which is automatically cleaned up later. This ensures
clear differentiation during the transition. A new field was also added to
ColumnName to restore the original name if the operation is undone, avoiding
the use of a placeholder.

However, there are a few limitations with the current implementation:
- No support for columns that already have a compute expression
- No support for columns with ON UPDATE or DEFAULT expressions
- No support for conversions to enum types
- While column order is preserved in the catalog for SELECT * queries, the
  order within the column family is not maintained, with the new column being
  added at the end of the internal ordering.

Some tests, designed for the legacy schema changer, couldn’t easily be adapted for the DSC.

These limitations are planned to be addressed in future changes.

Epic: CRDB-40232
Closes cockroachdb#127014
Release notes: none
@spilchen spilchen force-pushed the issue-127014/general-conv-rev branch from 83ec1a4 to ed20701 Compare October 28, 2024 14:51
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! nice work

Reviewed 43 of 48 files at r1, 1 of 2 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @spilchen)

@spilchen
Copy link
Contributor Author

TFTR!

bors r+

@craig craig bot merged commit ba425ba into cockroachdb:master Oct 29, 2024
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql/schemachanger: support complex ALTER COLUMN ... SET TYPE in declarative schema changer

5 participants