Skip to content

Add table_supports_delta_delete property and support delta delete function for Raptor table#13248

Closed
kewang1024 wants to merge 8 commits intoprestodb:masterfrom
kewang1024:raptor_delta_delete
Closed

Add table_supports_delta_delete property and support delta delete function for Raptor table#13248
kewang1024 wants to merge 8 commits intoprestodb:masterfrom
kewang1024:raptor_delta_delete

Conversation

@kewang1024
Copy link
Collaborator

== RELEASE NOTES ==

Raptor Changes

  • Add delta delete

@rongrong
Copy link
Contributor

Does the feature exist already? The PR didn't seem to suggest that. Logically the config to enable the feature should be after the introduction of the feature. If we really want to merge this before the actual feature is available, at least there should not be any release notes.

@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch from adcd26f to caead97 Compare August 19, 2019 18:16
@kewang1024
Copy link
Collaborator Author

kewang1024 commented Aug 19, 2019

Does the feature exist already? The PR didn't seem to suggest that. Logically the config to enable the feature should be after the introduction of the feature. If we really want to merge this before the actual feature is available, at least there should not be any release notes.

So we had an offline discussion that we will be creating a PR to release this feature, and we will have multiple commits. Every commit will be reviewed but not merged. We will be merging the PR after all the commits are finished. @jessesleeping

@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch from caead97 to faad597 Compare August 19, 2019 18:48
@kewang1024 kewang1024 changed the title Add delta_delete_enabled column in Raptor metadata [WIP]Add delta_delete_enabled column in Raptor metadata Aug 19, 2019
@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch 7 times, most recently from 47dd603 to c50d21b Compare August 20, 2019 17:08
@jessesleeping jessesleeping self-requested a review August 20, 2019 18:15
@kewang1024 kewang1024 requested a review from highker August 20, 2019 18:22
Copy link
Contributor

@jessesleeping jessesleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first commit LGTM. Some comments other than the inline ones:

  • Change the commit message to be more specific. We not only added a column to the metadata, but also added a new table property to gate the feature.

@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch from c50d21b to b2bec9d Compare August 23, 2019 21:47
@kewang1024
Copy link
Collaborator Author

All issues addressed, thanks for the review😄! Will start next commit.

@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch 4 times, most recently from 9399634 to b52349e Compare August 26, 2019 12:52
@jessesleeping jessesleeping self-requested a review August 26, 2019 20:49
Copy link
Contributor

@jessesleeping jessesleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed "Add deltaDeleteEnabled in RaptorSplit". Let's make deltaDeleteEnabled an non-optional member in RaptorSplit.

Copy link
Contributor

@jessesleeping jessesleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed " Add delta_shard_uuid in index table".

Make sure the code refactoring in CREATE table SQL construction doesn't break the previous behavior.

Also, maybe add a comment explaining why every newly created index table will have a delta_shard_uuid column.

Copy link
Contributor

@jessesleeping jessesleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed "Add deltaShardUuid to Raptor split". The current code LGTM but it's incomplete right now as we are lacking the commit logic, which is fine :). We will get there soon.

  • We might want to add more tests to RaptorSplitManager once we figure out how to commit delta to index table.

@kewang1024
Copy link
Collaborator Author

Reviewed "Add deltaShardUuid to Raptor split". The current code LGTM but it's incomplete right now as we are lacking the commit logic, which is fine :). We will get there soon.

  • We might want to add more tests to RaptorSplitManager once we figure out how to commit delta to index table.

I agree, will add more tests once the whole logic is through

@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch 3 times, most recently from 4da0984 to 518bb55 Compare August 27, 2019 01:41
@highker highker changed the title [WIP]Add delta_delete_enabled column in Raptor metadata Add delta_delete_enabled column in Raptor metadata Nov 2, 2019
@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch 2 times, most recently from 4196973 to 90a769d Compare November 4, 2019 19:50
@kewang1024 kewang1024 changed the title Add delta_delete_enabled column in Raptor metadata Add table_supports_delta_delete property and support delta delete function for Raptor table Nov 4, 2019
Copy link
Contributor

@jessesleeping jessesleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all none test code.

@jessesleeping jessesleeping requested a review from highker November 5, 2019 01:26
@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch 2 times, most recently from 2a85f23 to f954308 Compare November 5, 2019 18:20
Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some early comments. Haven't started compactor or rewriter parts yet

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more comments

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new OrcReader method seems to be used quite often; can we make it a helper in a util?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Use else for the following branch
  • remove return
  • move clearRollback(); to the end

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Nov 9, 2019

CLA Check
The committers are authorized under a signed CLA.

  • ✅ Ke (2923151cf3bc2b938d223c5378acf24c1f8eef86, 3ce349de0a4ea68b662e3c642c57a68543a45eaa, 5d09ecdbd9ec6f2640d90dc9124f4f399ebca9d4, b3dc2a8a47bf69064f68a66fc84795968e3b1f73, 9bdff076b7c5b4b856b1b9df75d2edcb2cedb3cb, ee87c133ed6b393b6c89e46472abd5ecb86057d7, 59b1c88d70e09737f8bd47ca903bf945e04ae6f9, afdb6ac8eb3133ff88df02a2c61e5d79593a2593, 5b5cd4c091c34d3c38519c9f63dae5300c00bfd8, 80818550110ec314f07af7636501fda034493a2d, 5c341611c08db84f51e234d07245004bca832d16)

@kewang1024 kewang1024 force-pushed the raptor_delta_delete branch 3 times, most recently from bce2641 to 8b35637 Compare November 22, 2019 00:17
@kewang1024
Copy link
Collaborator Author

kewang1024 commented Nov 22, 2019

All the issues addressed except for the delta format.

In order to make review easier, most of the newly added code is in new commits, except some bugs and rebase conflicts are merged into old commits.
Currently we already have too many commits and as @jessesleeping suggested, we will be merging them as one or a few bigger commits later. Therefore for current review, pay more attention to code logic instead of commits info

@jessesleeping will commit the data verification logic, so it's not included here yet.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Enable create table with table_supports_delta_delete properties" minor comments

  • The commit message body is too wide. Keep each line within 72 characters.
  • I would suggest sending out different PRs for each commit. The PR is too large to merge at once.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need a config to flag the table property. Just make it false.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need a config to flag the table property. Just make it false.

According to the early discussion, we want to have the ability to control the default behavior of all the table newly created, instead of making it a manual operation.

Besides, this feature is a good to have, then maybe we don't need to remove it

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this config. Remove

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not necessary, remove.

Copy link

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Add tableSupportsDeltaDelete and deltaShardUuid in RaptorSplit": Let's make sure the commit message body is within 72 chars

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this flag if we can use Optional.empty for deltaShardUuid

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this flag if we can use Optional.empty for deltaShardUuid

According to the discussion with Jiexi, the design is that we have a table property indicating if this table supports delta_delete. But in the future, we could have the concept of delta_delete as session property.

Thus having these two parameters give us the flexibility and compatibility to future features.

That's why I didn't fix it for last comment, sorry that I didn't explain last time.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move boolean tableSupportsDeltaDelete to the end

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as above.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following three fields are confusing. I assume we only need a BitSet to denote what rows to delete and a flag tableSupportsDeltaDelete to denote what kind of delete we wanna apply?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private

@kewang1024
Copy link
Collaborator Author

As @highker suggested, this PR is too large, thus splitting to three PRs.
Will link those three PRs to this one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants