Commit 4b35282
[SPARK-50883][SQL] Support altering multiple columns in the same command
### What changes were proposed in this pull request?
We propose the following new syntax for altering multiple columns at the same time:
```
ALTER TABLE table_name ALTER COLUMN {
{ column_identifier | field_name }
{ COMMENT comment |
{ FIRST | AFTER identifier } |
{ SET | DROP } NOT NULL |
TYPE data_type |
SET DEFAULT clause |
DROP DEFAULT }
} [, ...]
```
For example:
```
ALTER TABLE test_table ALTER COLUMN
a COMMENT "new comment",
b TYPE BIGINT,
x.y.z FIRST
```
This new syntax is backward compatible with the current syntax. To bound the complexity of the initial support, we place the following restrictions:
+ Altering the same column multiple times is not allowed.
+ Altering a parent and a child column (for nested data type) is not allowed.
+ Altering v1 tables with this new syntax is not allowed.
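The first two restrictions can be illustrated with a small standalone model. This is only a sketch and not the analyzer code from this PR: `check_alter_targets` and `ColumnPath` are hypothetical names, column paths are compared by exact string match (Spark's own resolution rules, such as case sensitivity, are not modeled), and the v1-table restriction is left out.

```python
from typing import Sequence, Tuple

# Hypothetical representation: x.y.z is the tuple ("x", "y", "z").
ColumnPath = Tuple[str, ...]


def check_alter_targets(paths: Sequence[ColumnPath]) -> None:
    """Raise ValueError if the targets break either column-level restriction."""
    seen = set()
    for path in paths:
        # Restriction 1: the same column may not be altered twice.
        if path in seen:
            raise ValueError(f"column {'.'.join(path)} is altered more than once")
        seen.add(path)

    for path in paths:
        # Restriction 2: every proper prefix of a path names an ancestor,
        # so no ancestor of an altered column may itself be altered.
        for depth in range(1, len(path)):
            parent = path[:depth]
            if parent in seen:
                raise ValueError(
                    f"parent {'.'.join(parent)} and child {'.'.join(path)} "
                    "are both altered"
                )
```

Under these assumptions, the targets of the example command (`a`, `b`, `x.y.z`) pass the check, while `a, a` or `x, x.y` would be rejected. Sibling fields such as `x.y` and `x.z` are accepted because neither is a prefix of the other.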
In terms of implementation, we replace the current `AlterColumn` logical plan with `AlterColumns`, which takes multiple columns and their `AlterColumnSpec`s.
All `AlterColumnSpec`s are checked during the analysis phase, so if any one of them is invalid (e.g., a non-existent column or an invalid type conversion), the entire command fails.
The `AlterColumnSpec`s are transformed into `TableChange`s, which are passed to the `TableCatalog::alterTable` method. Therefore, the semantics of this new command (atomic vs. non-atomic) depend on the implementation of this method.
`V2SessionCatalog::alterTable` currently applies all table changes to the catalog table and then sends the result to the catalog in one request. As a result, column changes are by default applied to the catalog (HMS) atomically: either all changes are made or none are.
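The all-or-nothing behavior described above can be modeled in a few lines. This is a simplified sketch, not the `V2SessionCatalog` implementation: `InMemoryCatalog`, `update_comment`, and `update_type` are hypothetical stand-ins, and the schema is a plain dictionary rather than a catalog table.

```python
import copy
from typing import Callable, Dict, List

Schema = Dict[str, Dict[str, str]]   # column name -> properties
Change = Callable[[Schema], None]    # mutates a schema in place or raises


def update_comment(column: str, comment: str) -> Change:
    def apply(schema: Schema) -> None:
        if column not in schema:
            raise KeyError(f"no such column: {column}")
        schema[column]["comment"] = comment
    return apply


def update_type(column: str, new_type: str) -> Change:
    def apply(schema: Schema) -> None:
        if column not in schema:
            raise KeyError(f"no such column: {column}")
        schema[column]["type"] = new_type
    return apply


class InMemoryCatalog:
    """Toy catalog that applies a batch of changes atomically."""

    def __init__(self, schema: Schema) -> None:
        self.schema = schema
        self.requests = 0  # how many times the stored schema was replaced

    def alter_table(self, changes: List[Change]) -> None:
        # Apply every change to a private copy first. If any change raises,
        # the stored schema is never touched.
        draft = copy.deepcopy(self.schema)
        for change in changes:
            change(draft)
        # Only a fully updated draft is stored, in a single step.
        self.schema = draft
        self.requests += 1
```

In this model a batch containing one invalid change leaves the stored schema exactly as it was, and a valid batch of any size costs one update. A catalog whose `alterTable` applied and persisted each change separately would not give this guarantee, which is why the semantics depend on the implementation.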
For example, the above command produces the following plans:
```
== Physical Plan ==
AlterTable org.apache.spark.sql.delta.catalog.DeltaCatalog@6d89c923, default.test_table, [org.apache.spark.sql.connector.catalog.TableChange$UpdateColumnComment@ff58ec42, org.apache.spark.sql.connector.catalog.TableChange$UpdateColumnType@7e7c730c, org.apache.spark.sql.connector.catalog.TableChange$UpdateColumnPosition@bc842915]
== Parsed Logical Plan ==
'AlterColumns [unresolvedfieldname(a), unresolvedfieldname(b), unresolvedfieldname(x, y, z)], [AlterColumnSpec(None,None,Some(new comment),None,None), AlterColumnSpec(Some(LongType),None,None,None,None), AlterColumnSpec(None,None,None,Some(unresolvedfieldposition(FIRST)),None)]
+- 'UnresolvedTable [test_table], ALTER TABLE ... ALTER COLUMN
== Analyzed Logical Plan ==
AlterColumns [resolvedfieldname(StructField(a,IntegerType,true)), resolvedfieldname(StructField(b,IntegerType,true)), resolvedfieldname(x, y, StructField(z,IntegerType,true))], [AlterColumnSpec(None,None,Some(new comment),None,None), AlterColumnSpec(Some(LongType),None,None,None,None), AlterColumnSpec(None,None,None,Some(resolvedfieldposition(FIRST)),None)]
+- ResolvedTable org.apache.spark.sql.delta.catalog.DeltaCatalog@6d89c923, default.test_table, DeltaTableV2(...)),Some(default.test_table),None,Map()), [a#163, b#164, x#165]
== Physical Plan ==
AlterTable org.apache.spark.sql.delta.catalog.DeltaCatalog@6d89c923, default.test_table, [org.apache.spark.sql.connector.catalog.TableChange$UpdateColumnComment@ff58ec42, org.apache.spark.sql.connector.catalog.TableChange$UpdateColumnType@7e7c730c, org.apache.spark.sql.connector.catalog.TableChange$UpdateColumnPosition@bc842915]
```
### Why are the changes needed?
The current ALTER TABLE ... ALTER COLUMN syntax allows altering only one column at a time. For a large table with many columns, a command must be run for each column, which can be slow due to the repeated preprocessing and I/O costs. A new syntax that enables specifying multiple columns could allow these costs to be shared across multiple column changes.
### Does this PR introduce _any_ user-facing change?
Yes
### How was this patch tested?
New unit tests
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #49559 from ctring/bulk-alter-column.
Authored-by: Cuong Nguyen <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
1 parent 917a9a1 commit 4b35282
File tree
14 files changed, +547 -291 lines changed
- common/utils/src/main/resources/error
- sql
  - api/src/main/antlr4/org/apache/spark/sql/catalyst/parser
  - catalyst/src
    - main/scala/org/apache/spark/sql/catalyst
      - analysis
      - parser
      - plans/logical
    - test/scala/org/apache/spark/sql/catalyst/parser
  - core/src
    - main/scala/org/apache/spark/sql/catalyst/analysis
    - test/scala/org/apache/spark/sql
      - connector
      - execution/command
Lines changed: 6 additions & 0 deletions
Lines changed: 9 additions & 2 deletions
Lines changed: 22 additions & 17 deletions
Lines changed: 69 additions & 47 deletions
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultStringTypes.scala
Lines changed: 9 additions & 5 deletions