Make rowid optional in DELETE query plan #25284
This pull request was exported from Phabricator. Differential Revision: D76325048
Summary:
Pull Request resolved: prestodb#25284
Currently, the $row_id column is automatically inserted into the output variables for DELETE queries by QueryPlanner. If a connector does not actually use $row_id to implement DELETE, then we should not require it. This makes $row_id optional. If the Optional is empty, then we don't need to project the output variable.
Differential Revision: D76325048
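The planner-side effect described in the summary can be illustrated with a minimal sketch. This is not the actual QueryPlanner code: `DeletePlanSketch`, `deleteOutputColumns`, and the use of plain strings in place of variables and column handles are all assumptions made for illustration. It only shows the idea that the row ID is projected when the Optional is present and skipped when it is empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for the planner logic; not the real Presto classes.
final class DeletePlanSketch
{
    private DeletePlanSketch() {}

    // Builds the output column list for the scan feeding a DELETE.
    // The row ID column is appended only when the connector supplies one.
    static List<String> deleteOutputColumns(List<String> scanColumns, Optional<String> rowIdColumn)
    {
        List<String> outputs = new ArrayList<>(scanColumns);
        rowIdColumn.ifPresent(outputs::add);
        return outputs;
    }

    public static void main(String[] args)
    {
        // Connector that deletes by row ID: the column is projected.
        System.out.println(deleteOutputColumns(List.of("ds"), Optional.of("$row_id")));
        // Connector that does not use a row ID: nothing extra is projected.
        System.out.println(deleteOutputColumns(List.of("ds"), Optional.empty()));
    }
}
```

With an empty Optional, the output list is just the scan columns, so a connector that never reads $row_id is not forced to expose it.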
 * {@link com.facebook.presto.spi.UpdatablePageSource} that created them.
 */
- default ColumnHandle getDeleteRowIdColumnHandle(ConnectorSession session, ConnectorTableHandle tableHandle)
+ default Optional<ColumnHandle> getDeleteRowIdColumnHandle(ConnectorSession session, ConnectorTableHandle tableHandle)
One little question: should we consider backward compatibility for potential externally-implemented connectors?
Do you have a preference for how you would want to support this instead?
I suppose another option would be to add a guard method to the interface that could disable the behavior:
boolean requiresDeleteRowIdColumnHandle()
That seems cumbersome, since it would mean two methods each for the update and delete rowId handling. I thought that the Optional usage would encapsulate this better.
Or we could add a system session property that QueryPlanner checks before injecting the rowId. That would avoid the ConnectorMetadata API change. But that doesn't seem quite as clean: whether the rowId column is required is a property of the connector implementation, so you could theoretically be using different connectors with different requirements in the same session.
I'm open to any other suggestions.
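To make the trade-off between the guard method and the Optional return type concrete, here is a rough sketch of both shapes. The interfaces `GuardedMetadata` and `OptionalMetadata`, the class `RowIdApiComparison`, and the simplified no-argument signatures with string handles are assumptions for illustration only; they are not the ConnectorMetadata API.

```java
import java.util.Optional;

// Option A (hypothetical): a guard method alongside the existing getter.
interface GuardedMetadata
{
    default boolean requiresDeleteRowIdColumnHandle()
    {
        return true;
    }

    default String getDeleteRowIdColumnHandle()
    {
        throw new UnsupportedOperationException("This connector does not support deletes");
    }
}

// Option B (hypothetical): a single getter whose Optional carries the same information.
interface OptionalMetadata
{
    Optional<String> getDeleteRowIdColumnHandle();
}

final class RowIdApiComparison
{
    private RowIdApiComparison() {}

    // With the guard, the engine must consult two methods and keep them consistent.
    static Optional<String> resolveWithGuard(GuardedMetadata metadata)
    {
        return metadata.requiresDeleteRowIdColumnHandle()
                ? Optional.of(metadata.getDeleteRowIdColumnHandle())
                : Optional.empty();
    }

    // With Optional, a single call answers both "is it required" and "which column".
    static Optional<String> resolveWithOptional(OptionalMetadata metadata)
    {
        return metadata.getDeleteRowIdColumnHandle();
    }
}
```

Under these assumptions, the guard variant doubles the number of methods once update handling is included, which is the cumbersomeness mentioned above.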
If we need to ensure backward compatibility for this Connector SPI change, it should not affect any existing external connectors that may already implement the modified SPI methods getDeleteRowIdColumnHandle or getUpdateRowIdColumnHandle.
IMO, we can refer to a previous SPI method modification that took backward compatibility into account, see here. In this approach, we retain the original SPI methods but mark them as @Deprecated, and add new SPI methods whose default implementations call the legacy methods. The engine's internal logic then uses only the new SPI methods. This keeps the SPI change backward-compatible and leaves all existing connectors unaffected.
One minor issue with this approach is that Java cannot overload on return type alone, so the new methods need distinct signatures, which in practice means different method names. The code would roughly look like this:
// Legacy SPI method: kept so existing connectors continue to compile and work.
@Deprecated
default ColumnHandle getUpdateRowIdColumnHandle(ConnectorSession session, ConnectorTableHandle tableHandle, List<ColumnHandle> updatedColumns)
{
    throw new PrestoException(NOT_SUPPORTED, "This connector does not support updates");
}

// New SPI method: the engine calls only this one; by default it delegates to the legacy method.
default Optional<ColumnHandle> getUpdateRowIdColumn(ConnectorSession session, ConnectorTableHandle tableHandle, List<ColumnHandle> updatedColumns)
{
    ColumnHandle columnHandle = getUpdateRowIdColumnHandle(session, tableHandle, updatedColumns);
    return Optional.ofNullable(columnHandle);
}
In this way, all existing connectors (especially externally implemented connectors, which we cannot modify directly) would not be affected by this SPI change. We can then remove the deprecated SPI methods in a future major version that is allowed to break backward compatibility.
What's your opinion about this?
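A self-contained sketch of how this bridge behaves for old and new connectors is shown below. The names `MetadataSketch`, `LegacyConnectorMetadata`, `MetadataOnlyConnectorMetadata`, and `SpiBridgeDemo` are hypothetical, the method parameters are dropped, strings stand in for ColumnHandle, and UnsupportedOperationException stands in for PrestoException, so this is only an approximation of the proposal.

```java
import java.util.Optional;

// Simplified stand-in for the SPI interface; parameters omitted for brevity.
interface MetadataSketch
{
    // Legacy method, retained so existing connectors keep working.
    @Deprecated
    default String getUpdateRowIdColumnHandle()
    {
        throw new UnsupportedOperationException("This connector does not support updates");
    }

    // New method used by the engine; delegates to the legacy method by default.
    default Optional<String> getUpdateRowIdColumn()
    {
        return Optional.ofNullable(getUpdateRowIdColumnHandle());
    }
}

// An external connector written against the old SPI: overrides only the legacy method.
final class LegacyConnectorMetadata implements MetadataSketch
{
    @Override
    public String getUpdateRowIdColumnHandle()
    {
        return "$row_id";
    }
}

// A connector written against the new SPI that needs no row ID.
final class MetadataOnlyConnectorMetadata implements MetadataSketch
{
    @Override
    public Optional<String> getUpdateRowIdColumn()
    {
        return Optional.empty();
    }
}

final class SpiBridgeDemo
{
    private SpiBridgeDemo() {}

    public static void main(String[] args)
    {
        // The engine calls only the new method in both cases.
        System.out.println(new LegacyConnectorMetadata().getUpdateRowIdColumn());
        System.out.println(new MetadataOnlyConnectorMetadata().getUpdateRowIdColumn());
    }
}
```

Note that a connector overriding neither method still gets the "does not support updates" exception through the new method, so the previous default behavior is preserved.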
Thanks for the reply. Yeah, that approach makes sense. I can make that update.
@hantangwangd would you mind taking another look? The FB internal test is failing because a corresponding internal code change is still needed.
hantangwangd
left a comment
Thanks for the refactoring. I have verified it on Iceberg V1 tables, which only support metadata deletion. Looks good to me, just a few small nits.
presto-iceberg/src/main/java/com/facebook/presto/iceberg/PartitionTransforms.java
presto-main-base/src/main/java/com/facebook/presto/sql/planner/QueryPlanner.java
presto-main-base/src/main/java/com/facebook/presto/sql/planner/QueryPlanner.java
presto-main-base/src/main/java/com/facebook/presto/type/TimestampOperators.java