Conversation

@martintomazic (Contributor) commented Oct 10, 2025

Closes #6352.

Considerations - #6355 (comment)

netlify bot commented Oct 10, 2025

Deploy Preview for oasisprotocol-oasis-core canceled.

🔨 Latest commit: 51407eb
🔍 Latest deploy log: https://app.netlify.com/projects/oasisprotocol-oasis-core/deploys/6916f9705e6c0900073ea9f1

@martintomazic force-pushed the martin/buxfix/runtime-pruning branch from b63f4ef to 26e3fd9 on October 10, 2025 09:57
@martintomazic (Contributor, Author) commented Oct 10, 2025

This PR tries to fix the existing problem simply, by avoiding being "perfect".

However, for an ideal solution, we should first answer the question below, which also serves as complementary context for what we are trying to solve here:

Who should be responsible for pruning of the runtime state?

Context

A seemingly trivial refactor clashes heavily with our current design/code organization.

Since this is not the first time we are clashing with these specific abstractions, I want this change to also be a step towards making our code more maintainable instead of complicating it further.

How things work currently

There are two databases used for maintaining runtime state:

  1. Light history (history.History) - stores runtime block headers.
  2. State DB (NodeDB) - stores runtime state.

Who populates state instances?

Light history is populated by the indexer, which continuously updates it by (re)indexing consensus blocks.

State DB is populated by the storage committee worker by subscribing to incoming block headers.

The incoming block headers are populated by the common committee worker, which the storage committee worker subscribes to via a NodeHook implementation. The common committee worker gets block headers by subscribing to History.WatchCommittedBlocks. Moreover, the storage committee worker has direct access to the light history, to fall back on in case of missing block headers. This raises the question of why the storage committee worker does not watch the light history directly, which would make the node hook redundant... (out of scope).
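
To make this flow easier to follow, here is a purely illustrative sketch; every type and method name in it (annotatedBlock, lightHistory, nodeHook, HandleNewBlock, runCommonWorker) is a hypothetical stand-in, not the actual oasis-core API:

    package example

    // annotatedBlock stands in for a committed runtime block header.
    type annotatedBlock struct {
        Round uint64
    }

    // lightHistory stands in for the light history populated by the indexer.
    type lightHistory interface {
        // WatchCommittedBlocks streams headers as the indexer commits them.
        WatchCommittedBlocks() <-chan *annotatedBlock
    }

    // nodeHook stands in for the hook the storage committee worker registers
    // with the common committee worker.
    type nodeHook interface {
        HandleNewBlock(blk *annotatedBlock)
    }

    // runCommonWorker models the common committee worker: it watches committed
    // blocks from the light history and fans them out to the registered hooks,
    // which in turn populate the State DB.
    func runCommonWorker(hist lightHistory, hooks []nodeHook) {
        for blk := range hist.WatchCommittedBlocks() {
            for _, hook := range hooks {
                hook.HandleNewBlock(blk)
            }
        }
    }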

Who handles the lifetime of the state instances?

The lifetime of the light history and the corresponding indexer is currently owned by the runtime registry (see).

State DB, on the other hand, is created by the storage worker (see), yet closed by the runtime registry.

Who is responsible for pruning?

Currently, pruning of the runtime state (both History and NodeDB) is the light history's responsibility. This is the main reason for:

  1. The History API having outgrown its scope.
  2. The #6352 bug (go/runtime: NodeDB pruning nested in the light history prune transaction slows down the node).
  3. The workaround where the storage worker prune handler also has the side effect of actually pruning (causing another issue: #6352 (comment)).

Steps towards cleaner design

We should get there incrementally:

  1. Light history shouldn't know anything about state syncing nor about pruning of the runtime state.
    • A thin interface over the underlying DB, possibly implementing minimal business logic/caching (see the sketch after this list).
    • StorageSyncCheckpoint, LastStorageSyncedRound, WaitRoundSynced, GetBlock, GetAnnotatedBlock, WatchBlocks, Prune and Pruner are out of scope.
    • Factories are anti-patterns.
  2. Pruning of the runtime state is independent process.
  3. Transparent lifetime of resources and processes:
    • A process opening a resource or starting a sub-process should also be responsible for closing it.
    • E.g. opening and closing of NodeDB via two different workers is problematic.
  4. State syncing should be independent worker.
    • Accept a history.History interface to populate the NodeDB by fetching storage diffs.
  5. Storage committee worker should orchestrate many smaller workers (#6307).
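
As a rough illustration of point 1, here is a minimal sketch of what a "thin" light history could look like. The method set follows the discussion above, but the exact signatures are assumptions:

    package history

    import (
        "context"

        "github.com/oasisprotocol/oasis-core/go/roothash/api/block"
    )

    // History is a sketch of a light history reduced to storing and serving
    // committed runtime block headers. Pruning and storage-sync related methods
    // (StorageSyncCheckpoint, Prune, Pruner, ...) are intentionally absent.
    type History interface {
        // Commit persists a committed runtime block header.
        Commit(blk *block.Block) error

        // GetCommittedBlock returns the committed block header for a round.
        GetCommittedBlock(ctx context.Context, round uint64) (*block.Block, error)

        // WatchCommittedBlocks streams headers as they are committed.
        WatchCommittedBlocks() (<-chan *block.Block, error)
    }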

So who should be responsible for runtime state pruning?

Looking at how closely related the light history and the State DB are, especially since there are cases where you need both, I believe there should be a single process responsible for instantiating and closing both. Other processes can receive them as parameters.

Storage committee worker

Observe that the storage committee worker is only started when you have local storage and configured runtimes - see. So the first question is: do we have, or do we envision, a scenario where the light history is being indexed but the storage worker is disabled? Even if this is the case, we could work around it. Update: we do have such a scenario: #6385 (e.g. a stateless client with a configured paratime).

I am inclined towards making the storage worker responsible for the light history and State DB lifetime, populating and pruning them, especially if we refactor the storage worker into many smaller ones. Note that this would also mean moving the indexer into it.

Finally, I believe this worker should export methods like LastStorageSyncedRound, WaitRoundSynced, GetBlock, GetAnnotatedBlock and WatchBlocks, given that it would have direct access to the underlying resources and to the actual workers that populate and prune them. Observe that methods like History.StorageSyncCheckpoint become unneeded in this setup.

Other workers that rely on these methods should define an interface and accept the storage committee worker as a parameter (therefore casting it to the subset of interface methods they need).
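
For illustration, a consumer could declare only the narrow interface it needs and have the storage committee worker passed in. In the sketch below, only the method names come from the list above; the signatures, blockSource and Consumer are assumptions:

    package example

    import (
        "context"

        roothash "github.com/oasisprotocol/oasis-core/go/roothash/api"
    )

    // blockSource is the narrow view a hypothetical consumer declares for itself.
    type blockSource interface {
        WaitRoundSynced(ctx context.Context, round uint64) (uint64, error)
        GetAnnotatedBlock(ctx context.Context, round uint64) (*roothash.AnnotatedBlock, error)
    }

    // Consumer depends only on blockSource, so the storage committee worker
    // (or anything else satisfying the interface) can be passed in.
    type Consumer struct {
        blocks blockSource
    }

    // NewConsumer accepts any implementation of the narrow interface.
    func NewConsumer(blocks blockSource) *Consumer {
        return &Consumer{blocks: blocks}
    }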

Runtime registry

This is an alternative to the above. Currently we don't have direct access to the NodeDB in the runtime registry unless we cast the Storage backend, which is actually a LocalBackend, as can be seen from its registration in the storage worker.

Furthermore, I am not really convinced by the idea of the registry being responsible for pruning.

Conclusion

None of the solutions will be trivial to refactor towards, so I would appreciate any input here on how to get there incrementally.

@martintomazic (Contributor, Author)

Tried to solve both problems described in #6352.

Opened a simple solution that also involves some compromises, described commit by commit. For a clean solution we probably have to solve the problems described in the mini design doc above first :/.

Prior to making this PR ready for review, writing tests, and ensuring it works as expected (CI is green but I haven't tested), I would appreciate it if we could first discuss the high-level direction.

codecov bot commented Oct 31, 2025

Codecov Report

❌ Patch coverage is 66.07143% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.67%. Comparing base (e6074e5) to head (51407eb).
⚠️ Report is 3 commits behind head on master.

Files with missing lines                  Patch %   Lines
go/consensus/cometbft/abci/prune.go       23.07%    8 Missing and 2 partials ⚠️
go/worker/storage/committee/prune.go      80.00%    4 Missing and 2 partials ⚠️
go/worker/storage/committee/worker.go     60.00%    1 Missing and 1 partial ⚠️
go/runtime/history/history.go             75.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6355      +/-   ##
==========================================
+ Coverage   64.25%   64.67%   +0.42%     
==========================================
  Files         698      698              
  Lines       68047    68070      +23     
==========================================
+ Hits        43721    44022     +301     
+ Misses      19321    19013     -308     
- Partials     5005     5035      +30     

@martintomazic force-pushed the martin/buxfix/runtime-pruning branch from bab14de to 4b56de4 on November 1, 2025 21:36
@martintomazic (Contributor, Author)

Freshly rebased on top of #6384, which also briefly touched this issue.

@peternose (Contributor) left a comment:

I think the first 5 commits look good, even if we change them later. So we could merge and continue in another PR.

// CanPruneConsensus is called to check if the specified height can be pruned.
//
// If an error is returned, pruning is aborted and the height is
// not pruned from history.
Contributor

Why not just CanPrune? From the consensus point of view, adding Consensus is just redundant.

Contributor Author

True, although it is easier to make sense of when implementing it inside go/runtime/history/history.go, and it is also "symmetric" with CanPruneRuntime, so either or...

My suggestion would be to also get rid of

    // Note that this can be called for the same height multiple
    // times (e.g., if one of the handlers fails but others succeed
    // and pruning is later retried).

for both, given that func Can*() error implies this should be safe...

Not pruning because the prune handler does not allow it is
not an error and should not be logged as such.

Similar to #6161
that describes excessive usage of Error logging.
This rename is important to communicate clearly that the handler
only checks whether pruning can happen.

Possibly this method could return bool instead of error.
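
For context, here is a sketch of the check-only handler shape under discussion. The names CanPruneConsensus/CanPruneRuntime and the single-height vs. slice-of-rounds distinction come from this thread; everything else (the interface name, exact signatures) is an assumption:

    package example

    // PruneHandler is a hypothetical sketch, not the actual oasis-core interface.
    type PruneHandler interface {
        // CanPruneConsensus checks whether the given consensus height may be
        // pruned. It must not prune anything itself; returning an error only
        // vetoes pruning and should not be logged as an error by the caller.
        CanPruneConsensus(height int64) error

        // CanPruneRuntime checks whether the given runtime rounds may be pruned.
        CanPruneRuntime(rounds []uint64) error
    }
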
By moving the pruning of the state db out of the prune handler,
we solve the issue of nesting NodeDB.Prune (a possibly long
operation) in the BadgerDB transaction created during the pruning
of the light history.

Furthermore, this also ensures that the light history is always
pruned before the actual state. This is desired, as previously
light history pruning had the side effect of deleting the state.
If there was an error during pruning, the runtime history would
get reverted, but the state wouldn't. Looking at `history.History`
`resolveRound`, this unlikely scenario was not anticipated, which
could result in `GetAnnotatedBlock` breaking its contract.
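
To make the new ordering concrete, here is a rough sketch; historyPruner, statePruner and pruneRound are hypothetical stand-ins (not the actual history.History/NodeDB APIs), and the only assumption is that each store exposes a per-round prune operation:

    package example

    import "fmt"

    // historyPruner and statePruner stand in for the light history and the
    // NodeDB respectively.
    type historyPruner interface {
        Prune(round uint64) error
    }

    type statePruner interface {
        Prune(round uint64) error
    }

    // pruneRound prunes the light history first, in its own transaction, and
    // only afterwards prunes the (potentially slow) state version, so NodeDB
    // pruning is never nested inside the history transaction and a round's
    // header is always removed before its state, never the other way around.
    func pruneRound(hist historyPruner, state statePruner, round uint64) error {
        if err := hist.Prune(round); err != nil {
            return fmt.Errorf("prune light history for round %d: %w", round, err)
        }
        if err := state.Prune(round); err != nil {
            // The header is already gone; state pruning can simply be retried.
            return fmt.Errorf("prune state for round %d: %w", round, err)
        }
        return nil
    }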

There is an additional change of semantics: previously, if your
state had versions older than the earliest version in the light
history, pruning would be stuck, as it would try to prune a
non-earliest version from the NodeDB. Now this is fixed, but as a
consequence this could silently start a very hot loop of pruning
many versions.

This was done to be consistent with the consensus prune handlers
terminology. Notice that consensus accepts a single round whereas
here we take a slice of rounds.

In my opinion, pruning the light history in batches was a premature
optimization, as the time to prune a light block << the time to
prune a NodeDB version.

For the follow-up we may want to unify this; possibly the prune
handlers can be consulted only for every n-th version instead of
for every version (assuming that if pruning is valid for N, then
it should also be valid for N-1). See the sketch below.
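
A hypothetical sketch of that follow-up, relying only on the assumption stated above (pruning being valid for N implies it is valid for every version below N); canPrune and prune are stand-ins for the handler check and the actual per-version pruning:

    package example

    // pruneRange prunes versions [from, to] but consults the (potentially
    // expensive) prune check only once per chunk of n versions, on the chunk's
    // highest version, which under the stated assumption also covers all lower
    // versions in the chunk.
    func pruneRange(canPrune, prune func(version uint64) error, from, to, n uint64) error {
        if n == 0 {
            n = 1
        }
        for start := from; start <= to; start += n {
            end := start + n - 1
            if end > to {
                end = to
            }
            if err := canPrune(end); err != nil {
                // Not necessarily fatal; these versions may become prunable later.
                return err
            }
            for v := start; v <= end; v++ {
                if err := prune(v); err != nil {
                    return err
                }
            }
        }
        return nil
    }
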
@martintomazic force-pushed the martin/buxfix/runtime-pruning branch from 20bf6c0 to 51407eb on November 14, 2025 09:42
@martintomazic (Contributor, Author)

I think the first 5 commits look good

Agree and kept only those.

Sanity checking the semantic change (thinking aloud):

Previously it could happen that the state was pruned before the runtime light history (or was possibly missing, e.g. when doing runtime state sync from a checkpoint earlier than the last retained light header block - see). However, it never happened the other way around.

This change makes this less likely (except for the late checkpoint scenario), as the light history is now always pruned before the State DB, which changes assumptions the old code may have relied on. Therefore we should double check that none of the existing code first queries the state / last retained version and then automatically assumes the light history has the corresponding light block for the queried height. I did not find any such place, nor would it be a sign of robust code had I found it.

Double check appreciated.

Follow-up (won't be actively worked on)

Opened #6400 and also referenced it in the code. As stated in the issue, I believe the answer to "So who should be responsible for runtime state pruning?" is clear: the storage committee worker.

@martintomazic (Contributor, Author) commented Nov 14, 2025

Previously it could happen that state was pruned before the runtime light history... However, it never happened the other way. ... Therefore we should double check that other parts of the code do not rely on this somewhere.
I did not find any such place, nor would it be a sign of robust code had I found it.

#6403 made me think I have to check the storage committee worker once more:

  • The following check

        if version, dbNonEmpty := w.localStorage.NodeDB().GetLatestVersion(); dbNonEmpty {
            var blk *block.Block
            blk, err = w.commonNode.Runtime.History().GetCommittedBlock(ctx, version)
            switch err {
            case nil:
                // Set last synced version to last finalized storage version.
                if _, err = w.flushSyncedState(summaryFromBlock(blk)); err != nil {
                    return fmt.Errorf("failed to flush synced state: %w", err)
                }
            default:
                // Failed to fetch historic block. This is fine when the network just went through a
                // dump/restore upgrade and we don't have any information before genesis. We treat the
                // database as unsynced and will proceed to either use checkpoints or sync iteratively.
                w.logger.Warn("failed to fetch historic block",
                    "err", err,
                    "round", version,
                )
            }
        }

    is not promising, as it seems that in a corner case we could enter initGenesis:

        // Initialize genesis from the runtime descriptor.
        isInitialStartup := (cachedLastRound == w.undefinedRound)
        if isInitialStartup {
            w.statusLock.Lock()
            w.status = api.StatusInitializingGenesis
            w.statusLock.Unlock()

            var rt *registryApi.Runtime
            rt, err = w.commonNode.Runtime.ActiveDescriptor(ctx)
            if err != nil {
                return fmt.Errorf("failed to retrieve runtime registry descriptor: %w", err)
            }

            if err = w.initGenesis(ctx, rt, genesisBlock); err != nil {
                return fmt.Errorf("failed to initialize storage at genesis: %w", err)
            }
        }

Update: false alarm

@martintomazic (Contributor, Author)

^^ Tested / sanity checked everything yesterday, and I believe that, if anything, the code is more robust now.

Ready for the final review.
