feat: pipeline builder #1017
.push(StorageHashingStage { clean_threshold: 500_000, commit_threshold: 100_000 })
// This merkle stage is used only for execute
.push(MerkleStage { is_execute: true });
.add_stages(
Uses the new stage sets: as mentioned, I want to make either the stage sets or the stages optionally ser/de to replace `StagesConfig`. This would make this a lot easier. Primary hurdles are:
- The downloaders, but it is possible to make them serializable too.
- Consensus. This should be determined by the chain spec, so should end up being ser/de too.
Love this abstraction.
    /// # Panics
    ///
    /// Panics if the [`Stage`] is not in this set.
    fn set<S: Stage<DB> + 'static>(self, stage: S) -> StageSetBuilder<DB> {
This is just a convenience to avoid the `.build` call if you are only going to override a stage.
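A minimal sketch of that convenience, using toy types rather than the real `StageSet` and `StageSetBuilder` definitions. Everything prefixed with `Toy`, the `builder` and `add_stage` method names, and the stage ids are assumptions made for illustration; only the shape of `set` (consume the set, return a builder, panic if the stage is missing) follows the snippet under review.

```rust
// Toy model: stages are identified by a static name. All names here are
// hypothetical stand-ins, not the crate's real types.
#[derive(Debug, Clone, PartialEq)]
struct ToyStage {
    id: &'static str,
    commit_threshold: u64,
}

#[derive(Debug, Default)]
struct ToySetBuilder {
    stages: Vec<ToyStage>,
}

impl ToySetBuilder {
    /// Appends a stage to the end of the set.
    fn add_stage(mut self, stage: ToyStage) -> Self {
        self.stages.push(stage);
        self
    }

    /// Replaces the stage with the same id, keeping its position.
    ///
    /// # Panics
    ///
    /// Panics if no stage with that id is in this set.
    fn set(mut self, stage: ToyStage) -> Self {
        let slot = self
            .stages
            .iter_mut()
            .find(|s| s.id == stage.id)
            .expect("stage is not in this set");
        *slot = stage;
        self
    }

    /// Finishes the builder and returns the stages in order.
    fn build(self) -> Vec<ToyStage> {
        self.stages
    }
}

/// A toy "stage set". The provided `set` method is only shorthand for
/// asking the set for its builder and overriding one stage on it.
trait ToyStageSet: Sized {
    fn builder(self) -> ToySetBuilder;

    fn set(self, stage: ToyStage) -> ToySetBuilder {
        self.builder().set(stage)
    }
}

struct ToyHashingStages;

impl ToyStageSet for ToyHashingStages {
    fn builder(self) -> ToySetBuilder {
        ToySetBuilder::default()
            .add_stage(ToyStage { id: "AccountHashing", commit_threshold: 100_000 })
            .add_stage(ToyStage { id: "StorageHashing", commit_threshold: 100_000 })
    }
}
```

With these toy types, `ToyHashingStages.set(ToyStage { id: "StorageHashing", commit_threshold: 5 })` overrides one stage in place without first asking the set for its builder.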
crates/stages/src/stages/merkle.rs
Outdated
///
/// This stage should be run with the above two stages, otherwise it is a no-op.
///
/// This stage is split in two: one for calculating hashes and one for unwinding. TODO: Why?
(cc @rakita?) I am not sure why this is split into an execution and an unwind part, but it should be in the docs
Where is the best place to put the docs? The Merkle stage depends on the hashing stages: it needs to come before the hashing stages for both the execution and unwind paths (unwind runs backwards). This solves the unwind ordering problem that is present in Erigon.
It should be in this rustdoc, my question is more precisely: Why does it need to unwind before the hashing stages?
Because of the hashing stages. Let's look at an example where plain state `Acc1` has `balance1` and it gets updated to `balance2`.
In the execution path this would be:
- HashedState calculates `hash2` of `Acc1` with `balance2` (`balance2` as it is the new state)
- MerkleState uses `hash2` and removes `hash1` (the previous hash).
Now for the unwind order, `MerkleStage` comes first, but it has only `hash2` and it needs `hash1`. That is why the order needs to be:
- HashingState reverts `hash2` to `hash1`
- MerkleState uses `hash1` and removes `hash2` (the new hash that needs to be unwound)
It is called the Unwind `MerkleStage` because it is triggered only on unwind; on execution it does nothing.
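The ordering argument can be checked with a toy model. This is only a sketch: the `ToyStage` enum, the `ORDER` constant and both trace functions are made up for illustration, and the pipeline order (unwind half of the Merkle stage first, hashing in the middle, execution half last) is an assumption based on this thread, not a copy of the real stage list.

```rust
// Toy model of the ordering argument; names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyStage {
    /// Merkle half that does work only when unwinding.
    MerkleUnwind,
    /// Hashing, which maintains the hashed state in both directions.
    Hashing,
    /// Merkle half that does work only when executing.
    MerkleExecution,
}

/// Assumed pipeline order: unwind half before hashing, execution half after.
const ORDER: [ToyStage; 3] =
    [ToyStage::MerkleUnwind, ToyStage::Hashing, ToyStage::MerkleExecution];

/// Stages that do real work on execution: pipeline order, front to back.
/// The unwind half is a no-op here, so it is filtered out.
fn execution_trace() -> Vec<ToyStage> {
    ORDER
        .iter()
        .copied()
        .filter(|s| *s != ToyStage::MerkleUnwind)
        .collect()
}

/// Stages that do real work on unwind: pipeline order, back to front.
/// The execution half is a no-op here, so it is filtered out.
fn unwind_trace() -> Vec<ToyStage> {
    ORDER
        .iter()
        .rev()
        .copied()
        .filter(|s| *s != ToyStage::MerkleExecution)
        .collect()
}
```

In both traces `Hashing` comes before the Merkle half that does the work, so the Merkle step always sees the hashes it needs (`hash2` when executing, `hash1` when unwinding).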
This is really cool, I like the `StageSet` abstraction and all the built-in `StageSet`s. Along with the `PipelineBuilder` it should make configuring and running the pipeline very easy, especially in #623
    }

    /// Run the pipeline in an infinite loop. Will terminate early if the user has specified
    /// a `max_block` in the pipeline.
    pub async fn run(&mut self, db: Arc<DB>) -> Result<(), PipelineError> {
        loop {
            let mut state = PipelineState {
                events_sender: self.events_sender.clone(),
                listeners: self.listeners.clone(),
This looks a bit weird, but there is nothing we can do about it in this PR.
Looking at `self.run_loop(&mut state,` it seems unnecessary to clone the listeners into the state, at least here; perhaps there are issues when it comes to the stages.
52eba2a to 8973d57
This stage was introduced in #972
lgtm,
the builder abstraction is very useful indeed
These were introduced in #978
Co-authored-by: Georgios Konstantopoulos <[email protected]>
Love it
        self
    }

    /// Disables the given stage.
smart! when do you think one would use this?
For us, probably to disable specific indexing stages when in non-archive mode down the line.
For crate consumers, possibly to replace some stages with their own implementation
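As a rough illustration of the first use case, here is a toy builder where a disabled stage stays registered but is skipped when the final list is built. `ToyBuilder`, its methods and the stage ids (`"Headers"`, `"Bodies"`, `"HistoryIndex"`) are hypothetical; the real builder may track disabled stages differently.

```rust
use std::collections::HashSet;

/// Toy builder: stage ids in pipeline order plus the set of disabled ids.
#[derive(Default)]
struct ToyBuilder {
    order: Vec<&'static str>,
    disabled: HashSet<&'static str>,
}

impl ToyBuilder {
    /// Registers a stage at the end of the pipeline order.
    fn add_stage(mut self, id: &'static str) -> Self {
        self.order.push(id);
        self
    }

    /// Disables the given stage; its slot is kept but `build` skips it.
    fn disable(mut self, id: &'static str) -> Self {
        self.disabled.insert(id);
        self
    }

    /// Returns the enabled stages in their original order.
    fn build(self) -> Vec<&'static str> {
        let ToyBuilder { order, disabled } = self;
        order.into_iter().filter(|id| !disabled.contains(id)).collect()
    }
}
```

A non-archive configuration in this toy model would be `ToyBuilder::default().add_stage("Headers").add_stage("Bodies").add_stage("HistoryIndex").disable("HistoryIndex").build()`, which leaves only the first two stages.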
Codecov Report
@@ Coverage Diff @@
## main #1017 +/- ##
==========================================
- Coverage 75.14% 74.78% -0.37%
==========================================
Files 313 317 +4
Lines 34299 34592 +293
==========================================
+ Hits 25775 25869 +94
- Misses 8524 8723 +199
Supersedes #963 with a dedicated `PipelineBuilder`. I will make a follow-up PR to clean up the config we use in the CLI and will use in #623. The most unwieldy parts are now:
- `StagesConfig` to override values: this is mostly for the CLI. #623 (feat(net): test syncing from geth) should use the defaults in the stages set. (See last point in Changed)
Example usage can be found in `crates/stages/src/lib.rs` and `bin/reth/src/node/mod.rs`.
Added
- `Pipeline` using a builder (`PipelineBuilder`)
- `StageSet`s that are logical containers of a group of stages
- `StageSet`s: `DefaultStages` (all), `OnlineStages`, `OfflineStages`, `ExecutionStages` and `HashingStages`
Changed
- `LinearDownloaderBuilder::build` out to their own methods since we would always pass `Default::default()`
- `StagesConfig` later on
- `MerkleStage` to an enum, I didn't find `is_execute: bool` to be very clear
Removed
- `StatusUpdater` from the headers stage - this was never meant to be in here and currently does nothing for us; removing it makes it easier to configure the header stage
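To show why an enum reads more clearly than `is_execute: bool`, here is a small sketch. The type names `MerkleStageFlag` and `MerkleStageKind` and the variant names `Execution` and `Unwind` are assumptions made for this example, not necessarily what the PR uses.

```rust
/// Before: a flag. `MerkleStageFlag { is_execute: false }` does not say
/// what the stage does instead of executing.
struct MerkleStageFlag {
    is_execute: bool,
}

/// After: each mode has a name (variant names are hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum MerkleStageKind {
    /// Does its work on the execution path; no-op on unwind.
    Execution,
    /// Does its work on the unwind path; no-op on execution.
    Unwind,
}

impl From<MerkleStageFlag> for MerkleStageKind {
    fn from(flag: MerkleStageFlag) -> Self {
        if flag.is_execute {
            MerkleStageKind::Execution
        } else {
            MerkleStageKind::Unwind
        }
    }
}

impl MerkleStageKind {
    /// True if this half of the stage does work when executing.
    fn runs_on_execute(self) -> bool {
        matches!(self, MerkleStageKind::Execution)
    }

    /// True if this half of the stage does work when unwinding.
    fn runs_on_unwind(self) -> bool {
        matches!(self, MerkleStageKind::Unwind)
    }
}
```

At a call site, `MerkleStageKind::Unwind` documents itself, whereas `is_execute: false` needs a comment like the one in the old code.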