Skip to content

Conversation

@chenjian2664
Copy link
Contributor

@chenjian2664 chenjian2664 commented Nov 8, 2024

Description

Currently the update implementation of MERGE update updates the whole row, this pr is going to update the engine merge impl to support partial update.
The main idea is to keep the merge case_number of the UpdateCase in the output of the MergeProcessNode,(this value already kept in the merge_row symbol that in the sourceNode of the MergeProcessNode, so only needs to project it), and then in the MergeWriterNode, the updateSink can perform the update according to the update case.

The partial update also needs to know the mapping of the case number with the relevant columns, thus the ConnectorMetadata#beginMerge is refactored to hold the info.

Here use the Phoenix connector as a example to show how it works.

Additional context and related issues

Prerequisites for #23034

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
() Release notes are required, with the following suggested text:

## Section
* Support partial update in engine.
* Support partial update in Phoenix.

@cla-bot cla-bot bot added the cla-signed label Nov 8, 2024
@chenjian2664 chenjian2664 marked this pull request as draft November 8, 2024 11:04
@github-actions github-actions bot added delta-lake Delta Lake connector bigquery BigQuery connector labels Nov 8, 2024
@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 8 times, most recently from ff5e383 to 75e4c1c Compare November 11, 2024 06:14
@chenjian2664 chenjian2664 changed the title [WIP] Refactor merge to support partial update Refactor merge to support partial update Nov 11, 2024
@chenjian2664 chenjian2664 marked this pull request as ready for review November 11, 2024 07:00
Copy link
Member

@djsagain djsagain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looked pretty good to me. Thanks for undertaking this work!

The PR needs Phoenix tests that show that only updated columns are changed. The tests need to cover cases where a single column is updated in a single row and in multiple rows; where non-overlapping sets of columns are updated in different cases; and tests that include UPDATE as well as INSERT and DELETE cases.

We also need @electrum to take a look.

@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 4 times, most recently from 2aca964 to c49ad8b Compare November 13, 2024 12:11
@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 4 times, most recently from 329425e to 550ff10 Compare November 13, 2024 14:45
@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 2 times, most recently from 77fd80d to 3c40911 Compare November 13, 2024 15:15
@chenjian2664
Copy link
Contributor Author

@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 2 times, most recently from 290d09f to 78efc6a Compare November 14, 2024 02:29
@chenjian2664
Copy link
Contributor Author

@chenjian2664
Copy link
Contributor Author

@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 6 times, most recently from 437b674 to c58acdc Compare November 14, 2024 09:52
@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 2 times, most recently from 71ebb33 to f26d04b Compare November 18, 2024 03:03
Copy link
Member

@djsagain djsagain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving this PR, but we still need @electrum to take a look. I'll ping him now about it.

@chenjian2664 chenjian2664 force-pushed the refactor_merge branch 2 times, most recently from 1d3f6af to 555b183 Compare November 21, 2024 15:03
Copy link
Member

@kokosing kokosing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed initially first commit.

@electrum electrum merged commit f17fdac into trinodb:master Nov 23, 2024
90 of 91 checks passed
@electrum
Copy link
Member

Thanks!

@github-actions github-actions bot added this to the 466 milestone Nov 23, 2024
@chenjian2664 chenjian2664 deleted the refactor_merge branch November 23, 2024 02:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bigquery BigQuery connector cla-signed delta-lake Delta Lake connector

Development

Successfully merging this pull request may close these issues.

4 participants