feat: support vacuum aggregating index #17231
src/query/storages/fuse/src/table_functions/fuse_vacuum_drop_aggregating_index.rs
Is it a It would be better to store it in
That way each entry has a smaller locking scope, and there is no need to introduce a new container protobuf message to store the list.
52e3f40 to 203e986
All comments have been resolved. PTAL @b41sh @drmingdrmer. Thanks.
The PR description should be updated too.
LGTM. Some derived trait implementations look unnecessary.
Reviewed 11 of 24 files at r1, 13 of 14 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @b41sh, @dantengsky, and @SkyFan2002)
src/meta/app/src/schema/marked_deleted_index_id.rs
line 16 at r5 (raw file):
#[derive(Clone, Debug, Copy, Default, Eq, PartialEq, PartialOrd, Ord)]
pub struct MarkedDeletedIndexId {
Does it need to be Default?
src/meta/app/src/schema/index.rs
line 83 at r5 (raw file):
pub enum MarkedDeletedIndexType {
    #[default]
    AGGREGATING = 1,
It's weird to have a Default implementation for it. In every case the type should be specified explicitly AFAIK.
serde is not necessary either, is it?
src/meta/app/src/schema/index.rs
line 88 at r5 (raw file):
#[derive(serde::Serialize, serde::Deserialize, Clone, Debug, Eq, PartialEq, Default)]
pub struct MarkedDeletedIndexMeta {
Does it really need to be serde? And Default does not seem necessary either.
src/meta/app/src/schema/index.rs
line 189 at r5 (raw file):
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct GetMarkedDeletedIndexesReply {
    pub table_indexes: HashMap<u64, Vec<(u64, MarkedDeletedIndexMeta)>>,
Add doc comment explaining the key and value.
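For example, a doc comment along these lines might do. This is only a sketch: it assumes the map key is the table id and each tuple is (index id, meta), which the `__fd_marked_deleted_index/table_id/index_id` key layout suggests but this thread does not confirm. The `MarkedDeletedIndexMeta` below is a simplified stand-in with `dropped_on` modeled as Unix seconds, not the real definition.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the PR's `MarkedDeletedIndexMeta` (assumption:
/// `dropped_on` is modeled as Unix seconds for illustration only).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MarkedDeletedIndexMeta {
    pub dropped_on: u64,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct GetMarkedDeletedIndexesReply {
    /// Indexes that are marked as deleted, grouped by table.
    ///
    /// - Key (assumed): the id of the table the indexes belonged to.
    /// - Value (assumed): one `(index_id, meta)` pair per marked-deleted
    ///   index of that table.
    pub table_indexes: HashMap<u64, Vec<(u64, MarkedDeletedIndexMeta)>>,
}
```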
src/meta/api/src/schema_api.rs
line 171 at r1 (raw file):
&self, table_id: Option<u64>, tenant: &Tenant,
Always put tenant first, as a convention.
Code quote:
async fn get_marked_deleted_indexes(
&self,
table_id: Option<u64>,
tenant: &Tenant,
Reviewed 6 of 6 files at r6, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @dantengsky and @SkyFan2002)
4f8e98a to 5f203f8
@SkyFan2002 ready to merge?
Yes.
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
The DROP AGGREGATING INDEX statement cleans up the index's metadata but does not clean up the index's data. This PR implements a new table function that cleans up the data of dropped indexes that are outside the retention period.
Implementation
A new key-value pair is added to the meta-service:
When an index is dropped, along with removing the name->id->meta records, the __fd_marked_deleted_index key-value pair is added.
When a vacuum is triggered, the meta-service checks the __fd_marked_deleted_index key and filters out the indexes that are still in the retention period, using MarkedDeletedIndexMeta.dropped_on.
The vacuum deletes the data of indexes that are no longer in the retention period, identifying the index files by index id. After that, the meta-service removes the index meta from the __fd_marked_deleted_index/table_id/index_id key.
Tests
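The vacuum flow described under Implementation can be illustrated with a minimal in-memory sketch. This is not the PR's code: the `MarkedDeleted` map, the `expired_indexes` and `vacuum` functions, and the `delete_index_files` callback are hypothetical names, `dropped_on` is modeled as Unix seconds, and a `BTreeMap` keyed by `(table_id, index_id)` stands in for the `__fd_marked_deleted_index/table_id/index_id` keys in the meta-service.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the PR's `MarkedDeletedIndexMeta`
/// (assumption: `dropped_on` is modeled as Unix seconds).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MarkedDeletedIndexMeta {
    pub dropped_on: u64,
}

/// Hypothetical in-memory model of the marked-deleted keys:
/// (table_id, index_id) -> meta.
pub type MarkedDeleted = BTreeMap<(u64, u64), MarkedDeletedIndexMeta>;

/// Returns the (table_id, index_id) pairs whose retention period has elapsed.
/// Indexes still inside the retention period are filtered out.
pub fn expired_indexes(
    store: &MarkedDeleted,
    now: u64,
    retention_secs: u64,
) -> Vec<(u64, u64)> {
    store
        .iter()
        .filter(|(_, meta)| now.saturating_sub(meta.dropped_on) >= retention_secs)
        .map(|(key, _)| *key)
        .collect()
}

/// Deletes the data of every expired index via `delete_index_files`, and only
/// after a successful delete removes the marker entry from the store.
/// Returns the pairs that were vacuumed.
pub fn vacuum<F>(
    store: &mut MarkedDeleted,
    now: u64,
    retention_secs: u64,
    mut delete_index_files: F,
) -> Vec<(u64, u64)>
where
    F: FnMut(u64, u64) -> bool,
{
    let mut vacuumed = Vec::new();
    for (table_id, index_id) in expired_indexes(store, now, retention_secs) {
        if delete_index_files(table_id, index_id) {
            store.remove(&(table_id, index_id));
            vacuumed.push((table_id, index_id));
        }
    }
    vacuumed
}
```

Removing the marker only after the data delete succeeds means a failed delete leaves the entry in place for the next vacuum run. Whether the PR orders the two steps with the same failure handling is not stated in the description.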
Type of change