Integrate snapshotting into core #1344
Conversation
This commit integrates snapshotting into `RelationalDB`. In this initial version, we take a snapshot every 1 million txes written to the commitlog.
Part of this involves making the `committed_state` module `pub(crate)` where previously it was private. In order to preserve the previous visibility of its members, anything that was previously `pub` is now `pub(super)`.
Notable TODOs, some or all of which may be left as future work:
- Mark snapshots newer than the `durable_tx_offset` as invalid when restoring.
- Write a test that exercises snapshot capture and restore. This is challenging because of the large `SNAPSHOT_INTERVAL`; unless/until we make that configurable, such a test would have to commit 1 million txes.
- Make `SNAPSHOT_INTERVAL` configurable.
Durable DBs have a new dependency on Tokio in order to run the snapshot worker. This was causing tests to fail because the std test harness does not install a Tokio runtime. This commit fixes the test failures by installing a Tokio runtime around every test which constructs a durable DB.
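As a rough illustration of that pattern (a sketch only; `open_durable_db()` is a hypothetical stand-in for however the tests actually construct a durable database):

```rust
// Sketch: wrap the test body in an explicitly-created Tokio runtime, since the
// std test harness does not provide one. `open_durable_db` is a placeholder.
fn open_durable_db() { /* construct a durable RelationalDB here */ }

#[test]
fn durable_db_smoke_test() {
    // Build a runtime and enter it for the duration of the test body.
    let rt = tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .expect("failed to build Tokio runtime");
    let _guard = rt.enter();

    // Anything below may now spawn Tokio tasks (e.g. the snapshot worker).
    let _db = open_durable_db();
}
```

As noted further down, the later revision reuses the Tokio runtime that the `TestDB::durable` tests already have instead of building a second one.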
I just ran a test with my module:

```rust
use spacetimedb::{schedule, spacetimedb};

#[spacetimedb(table)]
pub struct Counter {
    #[primarykey]
    id: u64,
    count: u64,
}

#[spacetimedb(init)]
pub fn init() {
    Counter::insert(Counter { id: 0, count: 0 }).unwrap();
}

#[spacetimedb(reducer)]
pub fn count(repeat: bool) {
    let mut current = Counter::filter_by_id(&0).unwrap();
    current.count += 1;
    log::info!("Counted to {}", current.count);
    Counter::update_by_id(&0, current);
    if repeat {
        schedule!("0ms", count(true));
    }
}
```

The test was:
This worked and the host logged appropriately about restoring from a snapshot (which took ~500 µs to capture and < 1 ms to read and restore, since it contained only 8 pages). Curiously, significantly more TXes happened than I was expecting. I made 1989 calls to the reducer. Including connect/disconnect calls, this amounts to ~6000 TXes, but the tx offset of my snapshot was 7000. I'm not worried about this, but I am confused and intrigued.
This could be the extra disconnect tx introduced in #1288
Oh yeah, that'd do it. Thanks Kim!
My previous attempt to fix the tests failed to notice that we already have a Tokio runtime in all the `TestDB::durable` tests. This commit uses that instead of spawning a second one.
crates/core/src/db/relational_db.rs
Outdated
@@ -60,6 +65,7 @@ pub struct RelationalDB {
// TODO(cloutiertyler): This should not be public
pub(crate) inner: Locking,
durability: Option<Arc<dyn Durability<TxData = Txdata>>>,
snapshots: Option<Arc<SnapshotWorker>>,
Suggested change:
-    snapshots: Option<Arc<SnapshotWorker>>,
+    snapshotter: Option<Arc<SnapshotWorker>>,
perhaps?
I wonder if this should be owned by the `RelationalDB`, or if it would just be better to have this be something held by the `Host` or `DatabaseInstanceContext`, and to put `maybe_do_snapshot` in there as well. Although I haven't thought through exactly who should call `maybe_do_snapshot`.
I think it makes the most sense to store this in the `RelationalDB` because:
- It acts a lot like the `Durability`, which is also owned by the `RelationalDB`.
- Opportunities to call `maybe_do_snapshot` become increasingly tenuous as you get more abstracted than this.
- I would prefer not to leak the `CommittedState`, or even the `Locking`, any further than this.
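A self-contained sketch of that ownership and call site; the types below are stubs, and both `request_snapshot` and the exact trigger condition are assumptions for illustration, not the real SpacetimeDB code:

```rust
use std::sync::Arc;

// Stub standing in for the real snapshot worker; `request_snapshot` is a
// hypothetical name for "wake the worker and ask it to capture a snapshot".
struct SnapshotWorker;

impl SnapshotWorker {
    fn request_snapshot(&self, tx_offset: u64) {
        println!("capturing snapshot at tx offset {tx_offset}");
    }
}

const SNAPSHOT_INTERVAL: u64 = 1_000_000;

struct RelationalDB {
    // ...other fields (inner datastore, durability, ...) elided...
    snapshot_worker: Option<Arc<SnapshotWorker>>,
}

impl RelationalDB {
    /// Called on the commit path: if the committed tx offset has crossed a
    /// SNAPSHOT_INTERVAL boundary, ask the worker to capture a snapshot.
    fn maybe_do_snapshot(&self, committed_offset: u64) {
        if let Some(worker) = &self.snapshot_worker {
            if committed_offset % SNAPSHOT_INTERVAL == 0 {
                worker.request_snapshot(committed_offset);
            }
        }
    }
}

fn main() {
    let db = RelationalDB {
        snapshot_worker: Some(Arc::new(SnapshotWorker)),
    };
    db.maybe_do_snapshot(2_000_000); // would trigger a snapshot
}
```

Owning the worker next to the durability keeps the trigger close to where commits happen, which is the point being made above.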
I've renamed all the `snapshots` vars and members to either `snapshot_worker` or `snapshot_repo`, as appropriate.
crates/core/src/db/relational_db.rs
Outdated
Some((durability.clone(), disk_size_fn)),
Some(snapshot_repo),
)?
.apply(durability);
I think we will need to determine the `ConnectedClients` entirely from `st_clients`, as it appears right after this point.
Consider: the database crashed right after a snapshot, so the history suffix is empty -- but there might be connected clients in `st_clients`. Before we had snapshotting, an empty history meant an empty database.
I believe this is the case now that I've "rebased" onto latest master, including `st_clients`. Please verify that I'm doing this correctly.
`apply_history` returns the unmatched `__connect__`s during replay -- but what I meant was: what if the snapshot's tx offset is the last offset in the commitlog (hence the history is empty)?
Arguably an edge case, so fine by me to merge and fix in a later patch (I can submit one if you want).
Oh, I assumed it would read the connected clients out of the table. I thought we had done away with groveling the logs for connected clients. I think a follow-up would be good, and if you're willing to do so, that would be great!
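For illustration, a minimal sketch of building the dangling-client set from the table contents rather than from replay; the `ClientRow` layout and the scan helper are assumptions, not the actual `st_clients` schema or datastore API:

```rust
use std::collections::HashSet;

// Hypothetical row shape: identity and address of a client that was connected
// when the last tx (or snapshot) was written.
#[derive(Clone, PartialEq, Eq, Hash)]
struct ClientRow {
    identity: [u8; 32],
    address: u128,
}

/// Build the set of clients that must be force-disconnected on startup by
/// scanning the table, so the result is correct even when the history suffix
/// after the snapshot is empty.
fn connected_clients(st_clients_rows: &[ClientRow]) -> HashSet<ClientRow> {
    st_clients_rows.iter().cloned().collect()
}

fn main() {
    let rows = vec![ClientRow { identity: [0; 32], address: 1 }];
    let dangling = connected_clients(&rows);
    println!("{} dangling client(s) to disconnect on startup", dangling.len());
}
```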
..
} = *committed_state;

if let Err(e) = snapshot_repo.create_snapshot(tables.values_mut(), blob_store, tx_offset) {
(Just an observation, nothing to do here: incidentally, the order of the tables will be deterministic due to `IntMap<TableId, _>`, but we do not rely on this determinism, and `.reconstruct_tables(..)` will yield a `BTreeMap<TableId, _>`, which would have resolved any non-determinism due to the stored order.)
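A tiny standalone example of that observation, using plain std types rather than the actual `IntMap`/`reconstruct_tables` code: a `BTreeMap` iterates in key order regardless of the order the entries were inserted (or stored) in, which is what erases any non-determinism in the stored order.

```rust
use std::collections::BTreeMap;

fn main() {
    // Insert table ids out of order, as they might come back from disk.
    let mut tables = BTreeMap::new();
    for table_id in [7u32, 2, 5, 1] {
        tables.insert(table_id, format!("table_{table_id}"));
    }
    // Iteration is always sorted by key: 1, 2, 5, 7.
    let order: Vec<u32> = tables.keys().copied().collect();
    assert_eq!(order, vec![1, 2, 5, 7]);
}
```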
I changed this for testing and accidentally committed at some point.
Rebased and performed manual testing. Appears to still work.
Apparently it gets installed in our `thread_spawn_handler` for Rayon. Possibly this changed recently, and it didn't happen when I wrote this change?
Nothing new to add wrt. latest changes :)
@@ -64,6 +69,7 @@ pub struct RelationalDB {
inner: Locking,
durability: Option<Arc<dyn Durability<TxData = Txdata>>>,
snapshot_worker: Option<Arc<SnapshotWorker>>,
I wonder if it would be best to move this outside of the `RelationalDB` (`DatabaseEngine`, future name), to something slightly higher level (e.g. `DatabaseInstanceContext`). That way we maintain that `RelationalDB` remains runtime-independent. I don't feel strongly enough about it to block the review, but it makes me uncomfortable to have the `DatabaseEngine`/`RelationalDB` dependent on Tokio.
To be clear, I think the ability to snapshot itself would remain inside of `RelationalDB`; I'm specifically just referring to the scheduling/triggering system.
The `Durability`, at least as implemented by `Local`, is also dependent on Tokio. As with the durability, we can avoid depending on Tokio by not supplying a `SnapshotRepository` when constructing the `RelationalDB`.
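A sketch of that construction-time choice; the constructor shape and field names are illustrative, not the actual `RelationalDB` constructor:

```rust
// Illustrative stubs: only when a snapshot repository (or durability) is
// supplied would a background worker, and hence a Tokio runtime, be involved.
struct Durability;
struct SnapshotRepository;

#[allow(dead_code)]
struct RelationalDB {
    durability: Option<Durability>,
    snapshot_repo: Option<SnapshotRepository>,
}

impl RelationalDB {
    fn open(durability: Option<Durability>, snapshot_repo: Option<SnapshotRepository>) -> Self {
        // Passing `None` for both keeps the database purely in-memory and
        // free of any Tokio dependency, since no worker is ever spawned.
        RelationalDB { durability, snapshot_repo }
    }
}

fn main() {
    // In-memory / test configuration: no durability, no snapshots, no Tokio.
    let _db = RelationalDB::open(None, None);
}
```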
}
}

fn take_snapshot(committed_state: &RwLock<CommittedState>, snapshot_repo: &SnapshotRepository) {
In reference to the scheduling comment above, I would imagine that this function would be added to the `RelationalDB` type.
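A sketch of what that could look like as a method; all types here are stubs, and the real `take_snapshot` hands over tables and the blob store rather than just a tx offset:

```rust
use std::sync::RwLock;

// Stubs for illustration only.
struct CommittedState { /* tables, blob store, ... elided */ tx_offset: u64 }
struct SnapshotRepository;

impl SnapshotRepository {
    fn create_snapshot(&self, tx_offset: u64) {
        println!("snapshot captured at tx offset {tx_offset}");
    }
}

struct RelationalDB {
    committed_state: RwLock<CommittedState>,
    snapshot_repo: Option<SnapshotRepository>,
}

impl RelationalDB {
    /// Method form of the free `take_snapshot` above: lock the committed
    /// state and hand it to the snapshot repository, if one was configured.
    fn take_snapshot(&self) {
        if let Some(repo) = &self.snapshot_repo {
            let state = self.committed_state.write().expect("poisoned lock");
            repo.create_snapshot(state.tx_offset);
        }
    }
}

fn main() {
    let db = RelationalDB {
        committed_state: RwLock::new(CommittedState { tx_offset: 1_000_000 }),
        snapshot_repo: Some(SnapshotRepository),
    };
    db.take_snapshot();
}
```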
.into());
}
let start = std::time::Instant::now();
let res = Locking::restore_from_snapshot(snapshot);
This will eventually need to be a datastore trait function.
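A hypothetical shape for such a trait function; the trait name, associated types, and signature are guesses for illustration, not the actual datastore traits:

```rust
// Sketch: restoring from a snapshot as a trait method, so callers need not
// name `Locking` directly. All names here are illustrative stand-ins.
trait Datastore: Sized {
    type Snapshot;
    type Error;

    /// Reconstruct the datastore from an on-disk snapshot instead of
    /// replaying the full commitlog.
    fn restore_from_snapshot(snapshot: Self::Snapshot) -> Result<Self, Self::Error>;
}

// Toy implementation showing how a concrete datastore would plug in.
struct Locking;
struct Snapshot;

impl Datastore for Locking {
    type Snapshot = Snapshot;
    type Error = String;

    fn restore_from_snapshot(_snapshot: Snapshot) -> Result<Self, String> {
        Ok(Locking)
    }
}

fn main() {
    let _db = Locking::restore_from_snapshot(Snapshot).expect("restore failed");
}
```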
This now looks good to me. 🙏
Checked the `table_size`- and `row_count`-related logic. Looks good.
When opening a `RelationalDB`, determine the set of dangling clients (`ConnectedClients`) by scanning the `st_clients` table instead of tracking unmatched `__identity_connected__` calls during history replay. We left the replay tracking in place in #1288, treating the commitlog as the sole source of truth. With #1344 (snapshotting), this is no longer correct: the snapshot may contain rows in `st_clients`, but leave no history suffix for replay.
Description of Changes

This PR integrates snapshotting into `RelationalDB`. In this initial version, we take a snapshot every 1 million txes written to the commitlog.

Part of this involves making the `committed_state` module `pub(crate)` where previously it was private. In order to preserve the previous visibility of its members, anything that was previously `pub` is now `pub(super)`.

Notable TODOs, some or all of which may be left as future work:
- Mark snapshots newer than the `durable_tx_offset` as invalid when restoring.
- Write a test that exercises snapshot capture and restore. This is challenging because of the large `SNAPSHOT_INTERVAL`; unless/until we make that configurable, such a test would have to commit 1 million txes.
- Make `SNAPSHOT_INTERVAL` configurable.

API and ABI breaking changes

Nothing broken, but we expose a new API and on-disk format which we will have to maintain in the future.

Expected complexity level and risk

4 - bootstrapping and replaying a database is complex and error-prone, and this PR adds significantly more surface area to that codepath which can break.

Testing
- Set `SNAPSHOT_INTERVAL` to a small number, possibly 1000, then publish your favorite module, run it to capture several snapshots, then restart, observe the host logs about the snapshot being restored, and verify the state is correct.
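If it helps with that kind of manual testing, here is one purely hypothetical way the interval could be made overridable, reading an environment variable once and falling back to the hard-coded default; the real constant is currently not configurable at all:

```rust
use std::sync::OnceLock;

/// Default production interval: one snapshot per million committed txes.
const DEFAULT_SNAPSHOT_INTERVAL: u64 = 1_000_000;

/// Hypothetical override hook for testing: read `SNAPSHOT_INTERVAL` from the
/// environment once, falling back to the default when unset or unparsable.
fn snapshot_interval() -> u64 {
    static INTERVAL: OnceLock<u64> = OnceLock::new();
    *INTERVAL.get_or_init(|| {
        std::env::var("SNAPSHOT_INTERVAL")
            .ok()
            .and_then(|v| v.parse().ok())
            .unwrap_or(DEFAULT_SNAPSHOT_INTERVAL)
    })
}

fn main() {
    println!("snapshot every {} txes", snapshot_interval());
}
```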