-
Notifications
You must be signed in to change notification settings - Fork 2
feat: Implement GC for Column Families. #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: cmake
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 23 of 50 files at r1, 27 of 27 files at r2, all commit messages.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @marete)
column_family.cc line 209 at r2 (raw file):
return Status(); } case google::bigtable::admin::v2::GcRule::kIntersection: {
Shouldn't we extract this body into ApplyGCRuleIntersection() to follow the general layout of the code?
Or maybe to the contrary, we should inline ApplyGCRuleMaxNumVersions and so on?
column_family.cc line 212 at r2 (raw file):
auto const& rules = gc_rule.intersection().rules(); for (auto it = cells_.begin(); it != cells_.end();) { auto delete_it = true;
Sanity check: does Cloud Bigtable also remove all the cells if the gc rule is to intersect no gc rules?
I know that it's correct both mathematically (as in all(empty_set) is true) and logically using the definition from the protobuf ("// Delete cells that would be deleted by every nested rule."), but maybe Cloud Bigtable validates that the set of intersected rules is nonempty, so we could lose an edge case by validating it too?
column_family.cc line 235 at r2 (raw file):
break; } case google::bigtable::admin::v2::GcRule::kUnion: {
Shouldn't we extract this body into ApplyGCRuleUnion() to follow the general layout of the code?
Or maybe to the contrary, we should inline ApplyGCRuleMaxNumVersions and so on?
column_family.cc line 279 at r2 (raw file):
std::advance(it, n); for (; it != cells_.end();) { it = cells_.erase(it);
[nit] I think it can be done with cells_.erase(it, cells_.end())
column_family.cc line 286 at r2 (raw file):
auto now = std::chrono::duration_cast<std::chrono::milliseconds>( std::chrono::system_clock::now().time_since_epoch()); auto not_before = now - std::chrono::duration_cast<std::chrono::milliseconds>(
Maybe let's use protobuf::util::TimeUtil::DurationToMilliseconds(max_age) (it needs #include <google/protobuf/util/time\_util.h>) instead?
column_family.cc line 291 at r2 (raw file):
not_before - std::chrono::duration_cast<std::chrono::milliseconds>( std::chrono::nanoseconds(max_age.nanos())); // Iterate from the back since the oldest cells are at the back
[nit] I think we could do it with something like
auto newest_before = cells_.upper_bound(not_before);
cells_.erase(newest_before, cells_.end());column_family.cc line 312 at r2 (raw file):
auto now = std::chrono::duration_cast<std::chrono::milliseconds>( std::chrono::system_clock::now().time_since_epoch()); auto not_before = now - std::chrono::duration_cast<std::chrono::milliseconds>(
After I discovered protobuf::util::TimeUtil::DurationToMilliseconds(max_age) (it needs #include <google/protobuf/util/time\_util.h>), I think it got simple enough to be inlined.
If you don't want to inline, it's fine by me - just simplify the not_before calculation.
column_family.cc line 358 at r2 (raw file):
} return false; break;
maybe just return TimestampIsOlderThan(it->first, rule.max_age());? break is also redundant then
column_family.cc line 360 at r2 (raw file):
break; } case google::bigtable::admin::v2::GcRule::kMaxNumVersions: {
[my fully skippable musing on the correctness of the GC operation]
It took me a long time to realize that the result of this whole GC operation does not depend on the order of execution on cells, because all the available conditions (versions and age (and combinations of them)) are such that for two cells, A and B, and two conditions, C1 and C2, it's impossible for C1 to be true for A and false for B and simultaneously for C2 to be false for A and true for B.
If it were possible, such a scenario could happen:
- there's a rule that is a union of something (called X) and max_num_versions
- an old cell is checked first: it survives X, but is deleted because of max_num_versions
- a new cell is checked second: it is deleted because of X
Then, our implementation of going from newest cell to the oldest cell could behave significantly differently than the alternative, but - as I noted in the begining - it's impossible, so everything is fine 😃
column_family.cc line 376 at r2 (raw file):
} auto it_2 = cells_.begin();
[nit] a more descriptive name? first_deleted_it?
column_family.cc line 383 at r2 (raw file):
// are in the reverse order of timestamps. // // FIXME: This operation is linear in the number of cells in the
I think we can just pass the "rank" (idx in the non-materialized list of cell_'s keys) of the currently checked cell from the caller.
Then we could just return rank < n.
column_family.cc line 408 at r2 (raw file):
} return true;
Sanity check: again (just like in RunGC()) does it work in Cloud Bigtable like this or do they validate that the rule set is nonempty?
column_family.cc line 435 at r2 (raw file):
// NOLINTEND(misc-no-recursion) void ColumnRow::ApplyGCRuleIntersection(
unused, see the other comment in RunGC()
column_family.cc line 439 at r2 (raw file):
/*rules*/) {} void ColumnRow::ApplyGCRuleUnion(
unused, see the other comment in RunGC()
column_family.cc line 729 at r2 (raw file):
// recursion in the GC thread later on (and therefore protecting // from a stack overflow).
[micronit] extra-looking line
column_family.cc line 731 at r2 (raw file):
std::size_t gc_rule_size = gc_rule.ByteSizeLong(); if (gc_rule_size > 500) {
[nit] maybe this constant should go to bigtable_limits.h?
table.h line 53 at r2 (raw file):
class Table : public std::enable_shared_from_this<Table> { public: ~Table() {
Shouldn't we wake the GC thread to avoid waiting for a whole minute?
According to the docs, we should also hold the mutex when setting stop_flag_ so that it is correctly "published", so probably it should be something like this:
{
std::unique_lock<std::mutex> lock(gc_mtx_);
stop_flag_ = true;
}
cv_.notify_one();table.cc line 91 at r2 (raw file):
void Table::StartGCThreadOnce() { std::call_once(start_gc_once_flag_, &Table::StartGCThread, this);
Why do we need it? Don't we just start it in a single place, the Table's constructor?
table.cc line 155 at r2 (raw file):
} auto cf = ColumnFamily::ConstructColumnFamily(opt_value_type, opt_gc_rule);
[nit] Doesn't it make the no-argument constructor obsoloete? Maybe we can remove it and replace its only use with calls to this function with empty optionals? Or maybe we should make this static function a constructor?
table.cc line 159 at r2 (raw file):
return cf.status(); }
[micronit] redundant-looking line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 21 unresolved discussions (waiting on @marete)
table.cc line 82 at r2 (raw file):
auto status = RunGC(); if (status.ok()) { std::cerr << "RunGC stopped with error: " << status.message()
Should it be cerr or rather some log function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
Reviewable status: 45 of 50 files reviewed, 20 unresolved discussions (waiting on @prawilny)
column_family.cc line 209 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Shouldn't we extract this body into
ApplyGCRuleIntersection()to follow the general layout of the code?Or maybe to the contrary, we should inline
ApplyGCRuleMaxNumVersionsand so on?
Done.
column_family.cc line 212 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Sanity check: does Cloud Bigtable also remove all the cells if the gc rule is to intersect no gc rules?
I know that it's correct both mathematically (as in
all(empty_set)is true) and logically using the definition from the protobuf ("// Delete cells that would be deleted by every nested rule."), but maybe Cloud Bigtable validates that the set of intersected rules is nonempty, so we could lose an edge case by validating it too?
This was a bug. Thanks for noticing. Fixed (I now check if the vector of rules is empty in the respective functions and return immediately if empty.
column_family.cc line 235 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Shouldn't we extract this body into
ApplyGCRuleUnion()to follow the general layout of the code?Or maybe to the contrary, we should inline
ApplyGCRuleMaxNumVersionsand so on?
Done.
column_family.cc line 279 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
[nit] I think it can be done with
cells_.erase(it, cells_.end())
Done.
column_family.cc line 286 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Maybe let's use
protobuf::util::TimeUtil::DurationToMilliseconds(max_age)(it needs#include <google/protobuf/util/time\_util.h>) instead?
Done.
column_family.cc line 291 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
[nit] I think we could do it with something like
auto newest_before = cells_.upper_bound(not_before); cells_.erase(newest_before, cells_.end());
The function is now:
void ColumnRow::ApplyGCRuleMaxAge(protobuf::Duration const& max_age) {
std::chrono::milliseconds max_age_ms(
protobuf::util::TimeUtil::DurationToMilliseconds(max_age));
auto cut_off_ms = std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::steady_clock::now().time_since_epoch()) -
max_age_ms;
auto newest_to_delete = cells_.upper_bound(cut_off_ms);
cells_.erase(newest_to_delete, cells_.end());
}
column_family.cc line 312 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
After I discovered
protobuf::util::TimeUtil::DurationToMilliseconds(max_age)(it needs#include <google/protobuf/util/time\_util.h>), I think it got simple enough to be inlined.If you don't want to inline, it's fine by me - just simplify the not_before calculation.
Done.
column_family.cc line 358 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
maybe just
return TimestampIsOlderThan(it->first, rule.max_age());?breakis also redundant then
Done.
column_family.cc line 376 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
[nit] a more descriptive name?
first_deleted_it?
Done.
column_family.cc line 383 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
I think we can just pass the "rank" (idx in the non-materialized list of
cell_'s keys) of the currently checked cell from the caller.
Then we could just returnrank < n.
Please clarify.
column_family.cc line 408 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Sanity check: again (just like in
RunGC()) does it work in Cloud Bigtable like this or do they validate that the rule set is nonempty?
Thanks for noticing. I am sure that under no circumstances can an empty rules vector cause a cell to be deleted. Now checking everywhere that I shoud.
column_family.cc line 435 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
unused, see the other comment in RunGC()
Done.
column_family.cc line 439 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
unused, see the other comment in RunGC()
Done.
column_family.cc line 729 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
[micronit] extra-looking line
Done.
column_family.cc line 731 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
[nit] maybe this constant should go to
bigtable_limits.h?
Done.
table.h line 53 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Shouldn't we wake the GC thread to avoid waiting for a whole minute?
According to the docs, we should also hold the mutex when setting
stop_flag_so that it is correctly "published", so probably it should be something like this:{ std::unique_lock<std::mutex> lock(gc_mtx_); stop_flag_ = true; } cv_.notify_one();
Fixed the locking. How do you wake the thread?
table.cc line 82 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Should it be
cerror rather some log function?
We are using cerr also in the emulator main binary. If you think it is worth it, I can use absl::log but that will bring in a new dependency (and the absl in Ubuntu 24.04 did not have the logging library for example, but we can still download it instead).
table.cc line 91 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
Why do we need it? Don't we just start it in a single place, the Table's constructor?
I wanted to be defensive since I recall previously constructing a Table by other means other that calling CreateTable, I think in test code.
table.cc line 155 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
[nit] Doesn't it make the no-argument constructor obsoloete? Maybe we can remove it and replace its only use with calls to this function with empty optionals? Or maybe we should make this static function a constructor?
The no argument constructor is used to create simple column families for the test code. On the other hand, this factory function can return an error, if the values passed have problems (e.g. if the gc_rule is too large when serialized).
I vote to keep it that way but also I can change this if you prefer.
table.cc line 159 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
[micronit] redundant-looking line.
Deliberate for readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prawilny reviewed 8 of 9 files at r3, all commit messages.
Reviewable status: 51 of 52 files reviewed, 4 unresolved discussions (waiting on @marete)
column_family.cc line 383 at r2 (raw file):
Previously, marete (Brian Gitonga Marete) wrote…
Please clarify.
Sorry for being too brief!
I meant changing GCRuleEraseVerdict()'s arguments from
(GcRule const& rule, std::map<(...)>::const_iterator it)
to
(GcRule const& rule, std::map<(...)>::const_iterator it, size_t idx)
Then the check would be - as I wrote before - rank < idx.
It seems to me we can do it without complicating the logic too much - only bumping idx when entering if (!delete_it) branches in ApplyGCRuleIntersection() and ApplyGCRuleUnion().
Does it sound good?
table.h line 53 at r2 (raw file):
Previously, marete (Brian Gitonga Marete) wrote…
Fixed the locking. How do you wake the thread?
By calling cv_.notify_one() like you just did :)
From my understanding (I'm not sure of it!), since the GC thread waited on this condition variable, by omitting this call before the change, we forced it to wake up only after the wait timed out, blocking the destructor (on join()). Now everything should be alright.
table.cc line 82 at r2 (raw file):
Previously, marete (Brian Gitonga Marete) wrote…
We are using cerr also in the emulator main binary. If you think it is worth it, I can use absl::log but that will bring in a new dependency (and the absl in Ubuntu 24.04 did not have the logging library for example, but we can still download it instead).
In the main binary, we just use it to print a single error message before exiting.
Here, we print something that might be written many times by a (nearly 😉) well-behaved emulator.
Still, I can live with cerr, but let's ask Marek before merging this. (I just sent the question, will let you know once I receive a response)
table.cc line 91 at r2 (raw file):
Previously, marete (Brian Gitonga Marete) wrote…
I wanted to be defensive since I recall previously constructing a Table by other means other that calling CreateTable, I think in test code.
I didn't manage to find this code quickly, but I can live with it.
table.cc line 155 at r2 (raw file):
Previously, marete (Brian Gitonga Marete) wrote…
The no argument constructor is used to create simple column families for the test code. On the other hand, this factory function can return an error, if the values passed have problems (e.g. if the gc_rule is too large when serialized).
I vote to keep it that way but also I can change this if you prefer.
No strong feelings on this one, we can let it be as it is now. I mostly wanted to bring it to your attention.
column_family.cc line 277 at r3 (raw file):
// Note that since a column family GCRule configuration must serialize // to at most 500 bytes // (https://github.com/googleapis/googleapis/blob/6d3a7f1b08c60a00926f7b15a1db69ec71bf501a/google/bigtable/admin/v2/table.proto#L343)
[nit] Maybe this 500 should point to bigtable_limits.h? It'd be cleaner where the exact value comes from.
(then bigtable_limits.h could point to this proto line)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prawilny reviewed 1 of 9 files at r3.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @marete)
gc_test.cc line 56 at r3 (raw file):
cf_->SetCell("row2", "col1", 100_ms, "v9"); cf_->SetCell("row2", "col1", 200_ms, "v10"); cf_->SetCell("row2", "col2", 300_ms, "v11");
[nit] an extra line before this line for readability?
gc_test.cc line 75 at r3 (raw file):
if (row_it == cf_->end()) return 0; auto col_it = row_it->second.find(col_key); if (col_it == row_it->second.end()) return 0;
[nit] isn't it redundant? I thought std::distance(EMPTY.begin(), EMPTY.end()) equals 0.
gc_test.cc line 109 at r3 (raw file):
// Each column keeps its 3 newest cells (or all if it has ≤3) EXPECT_EQ(3, CountCellsInColumn("row1", "col1")); // was 5, keeps 3 newest
We don't assert that it is the newest cells that remained.
gc_test.cc line 198 at r3 (raw file):
} TEST_F(GCTest, MaxAgeWithNanoseconds) {
I don't feel this test gives us anything over MaxAgeRemovesOldCells - we only learn that our function doesn't fail when nanos field is set.
I think we should either make it useful somehow or remove it.
gc_test.cc line 249 at r3 (raw file):
// max_age(30s): removes all cells (test data is very old, near Unix epoch) // Result: cells kept by max_num_versions are preserved (9 cells) EXPECT_EQ(9, CountTotalCells());
We don't assert that it is the newest cells that remained.
gc_test.cc line 298 at r3 (raw file):
// Old cells removed by max_age, recent cells survive both rules // Recent cells: pass max_age and there's only 1 per column (≤2) EXPECT_EQ(2, CountTotalCells());
We don't assert that it is the newest cells that remained.
gc_test.cc line 349 at r3 (raw file):
// Intersection of unions: recent cell survives because it passes max_age // rules Old cells removed, but recent cell demonstrates selective GC EXPECT_EQ(1, CountTotalCells());
We don't assert that it is the newest cell that remained.
gc_test.cc line 388 at r3 (raw file):
// Union of intersections: recent cells survive because they pass max_age // rules Old cells removed, but recent cells demonstrate selective GC EXPECT_EQ(2, CountTotalCells());
We don't assert that it is the newest cells that remained.
gc_test.cc line 505 at r3 (raw file):
} TEST_F(GCTest, ColumnRowGCDirect) {
Please make a more descriptive name, I don't understand what this test is supposed to verify.
gc_test.cc line 507 at r3 (raw file):
TEST_F(GCTest, ColumnRowGCDirect) { auto row_it = cf_->find("row1"); ASSERT_NE(row_it, cf_->end());
Does it pass? Shouldn't it fail before AddTestData() is called?
gc_test.cc line 574 at r3 (raw file):
} TEST_F(GCTest, MaxAgeZeroDurationRemovesAll) {
[nit] MaxAgeZeroDurationRemovesAll => MaxAgeZeroRemovesAll (extra "duration")?
gc_test.cc line 615 at r3 (raw file):
} TEST_F(GCTest, MaxAgeBoundaryCondition) {
What does this test give us over other max age tests?
gc_test.cc line 643 at r3 (raw file):
// The two recent cells (50min and 10sec old) should remain EXPECT_EQ(2, CountTotalCells());
We don't assert that it's them what remain.
gc_test.cc line 646 at r3 (raw file):
} TEST_F(GCTest, NestedRulesWithErrorInMiddle) {
I wouldn't say that the error is in the middle - it is in one of the leaves of the tree.
I think we should either reword the name, adjust the test, or remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @marete)
table.cc line 82 at r2 (raw file):
Previously, prawilny (Adam Czajkowski) wrote…
In the main binary, we just use it to print a single error message before exiting.
Here, we print something that might be written many times by a (nearly 😉) well-behaved emulator.Still, I can live with cerr, but let's ask Marek before merging this. (I just sent the question, will let you know once I receive a response)
I talked with Marek, he's perfectly fine with logging to stderr.
But he asked a very good question: why can GC fail? I thought that there must've been some fallible operation on the way, but when I now took a look at the code, it seems that we could've caught all the problems when setting the GCRule stored in the table (because all the problems returned are InvalidArgumentError).
So I think we should make this function infallable (unless Bigtable allows to create invalid GC rules too).
Includes a full coverage test suite.
Including a full coverage test suite.
This change is