feat: pipeline builder #1017

onbjerg · 2023-01-24T18:20:11Z

Supersedes #963 with a dedicated PipelineBuilder. I will make a follow-up PR to clean up the config that we use in the CLI and will use in #623. The most unwieldy parts are now:

Constructing downloaders
Easily loading StagesConfig to override values: this is mostly for the CLI. #623 (feat(net): test syncing from geth) should use the defaults in the stage sets. (See the point about stage defaults in Changed)

Example usage can be found in crates/stages/src/lib.rs and bin/reth/src/node/mod.rs.

Added

Adds a way to construct a Pipeline using a builder (PipelineBuilder)
Adds StageSets that are logical containers of a group of stages
Adds a few default StageSets: DefaultStages (all), OnlineStages, OfflineStages, ExecutionStages and HashingStages
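
A minimal sketch of how a builder and stage sets can fit together. The `Stage` trait, the stage structs, and `with_max_block` are simplified stand-ins assumed for illustration; only `add_stages`, the set names, and `max_block` appear in this PR, and the real signatures in crates/stages may differ.

```rust
// Hypothetical, simplified stand-ins; not the real reth-stages types.
trait Stage {
    fn id(&self) -> &'static str;
}

struct Headers;
struct Bodies;
struct Execution;

impl Stage for Headers {
    fn id(&self) -> &'static str { "Headers" }
}
impl Stage for Bodies {
    fn id(&self) -> &'static str { "Bodies" }
}
impl Stage for Execution {
    fn id(&self) -> &'static str { "Execution" }
}

/// A logical group of stages that can be added to a pipeline in one call.
trait StageSet {
    fn stages(self) -> Vec<Box<dyn Stage>>;
}

struct OnlineStages;
struct OfflineStages;

impl StageSet for OnlineStages {
    fn stages(self) -> Vec<Box<dyn Stage>> {
        vec![Box::new(Headers), Box::new(Bodies)]
    }
}
impl StageSet for OfflineStages {
    fn stages(self) -> Vec<Box<dyn Stage>> {
        vec![Box::new(Execution)]
    }
}

struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
    max_block: Option<u64>,
}

impl Pipeline {
    fn stage_ids(&self) -> Vec<&'static str> {
        self.stages.iter().map(|s| s.id()).collect()
    }
}

#[derive(Default)]
struct PipelineBuilder {
    stages: Vec<Box<dyn Stage>>,
    max_block: Option<u64>,
}

impl PipelineBuilder {
    /// Appends a single stage.
    fn add_stage<S: Stage + 'static>(mut self, stage: S) -> Self {
        self.stages.push(Box::new(stage));
        self
    }
    /// Appends every stage of a set, preserving the set's order.
    fn add_stages<S: StageSet>(mut self, set: S) -> Self {
        self.stages.extend(set.stages());
        self
    }
    /// Hypothetical setter for the block at which the pipeline stops.
    fn with_max_block(mut self, block: u64) -> Self {
        self.max_block = Some(block);
        self
    }
    fn build(self) -> Pipeline {
        Pipeline { stages: self.stages, max_block: self.max_block }
    }
}

fn main() {
    let pipeline = PipelineBuilder::default()
        .add_stages(OnlineStages)
        .add_stage(Execution)
        .with_max_block(100)
        .build();
    println!("{:?} (max block: {:?})", pipeline.stage_ids(), pipeline.max_block);
}
```

The point of the set abstraction in this sketch is that callers name a group once instead of pushing each stage with its own config.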

Changed

Replaces the pipeline event channel with a multi-listener approach (closes #968, "PipelineEvent channel should be unbounded")
Moves some arguments in LinearDownloaderBuilder::build out to their own methods since we would always pass Default::default()
Adds defaults to stages where possible - this is intended to supersede the duplicated config in StagesConfig later on
Refactors MerkleStage into an enum; I didn't find is_execute: bool to be very clear
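
A rough sketch of the multi-listener idea, assuming one unbounded channel per listener. It uses `std::sync::mpsc` as a stand-in for whatever channel type the pipeline actually uses, and the `Listeners` type, its methods, and the event variants are hypothetical names chosen for illustration.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

// Hypothetical event type; the real PipelineEvent has its own variants.
#[derive(Clone, Debug, PartialEq)]
enum PipelineEvent {
    Running { stage: &'static str },
    Ran { stage: &'static str },
}

/// Fans every event out to all registered listeners.
#[derive(Default)]
struct Listeners {
    senders: Vec<Sender<PipelineEvent>>,
}

impl Listeners {
    /// Registers a new listener and returns its receiving end.
    fn new_listener(&mut self) -> Receiver<PipelineEvent> {
        // `channel()` is unbounded, so sending never blocks the pipeline.
        let (tx, rx) = channel();
        self.senders.push(tx);
        rx
    }

    /// Sends the event to every listener and forgets listeners whose
    /// receiver has been dropped.
    fn notify(&mut self, event: PipelineEvent) {
        self.senders.retain(|tx| tx.send(event.clone()).is_ok());
    }
}

fn main() {
    let mut listeners = Listeners::default();
    let a = listeners.new_listener();
    let b = listeners.new_listener();

    listeners.notify(PipelineEvent::Running { stage: "Headers" });
    drop(b); // this listener goes away mid-run
    listeners.notify(PipelineEvent::Ran { stage: "Headers" });

    println!("listener a saw: {:?}", a.try_iter().collect::<Vec<_>>());
    println!("listeners still registered: {}", listeners.senders.len());
}
```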

Removed

Removes StatusUpdater from the headers stage - it was never meant to be in here, it currently does nothing for us, and removing it makes the headers stage easier to configure

onbjerg · 2023-01-24T18:23:30Z

bin/reth/src/node/mod.rs

-            .push(StorageHashingStage { clean_threshold: 500_000, commit_threshold: 100_000 })
-            // This merkle stage is used only for execute
-            .push(MerkleStage { is_execute: true });
+            .add_stages(


Uses the new stage sets. As mentioned, I want to make either the stage sets or the stages optionally ser/de so they can replace StagesConfig, which would make this a lot easier. The primary hurdles are:

The downloaders, but it is possible to make them serializable too.

Consensus. This should be determined by the chain spec, so should end up being ser/de too.

Love this abstraction.

crates/net/downloaders/src/headers/linear.rs

onbjerg · 2023-01-24T18:25:47Z

crates/stages/src/pipeline/set.rs

+    /// # Panics
+    ///
+    /// Panics if the [`Stage`] is not in this set.
+    fn set<S: Stage<DB> + 'static>(self, stage: S) -> StageSetBuilder<DB> {


This is just a convenience to avoid the .build call if you are only going to override a stage.
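
A small sketch of that convenience, assuming a default method on the set trait that forwards to the builder. The `builder()` method here is a hypothetical name standing in for the call mentioned above, and the stages, the `commit_threshold` accessor, and the threshold values are made up for illustration.

```rust
// Hypothetical, simplified stand-ins for the reth-stages types.
trait Stage {
    fn id(&self) -> &'static str;
    fn commit_threshold(&self) -> u64;
}

struct Execution { commit_threshold: u64 }
struct AccountHashing { commit_threshold: u64 }

impl Stage for Execution {
    fn id(&self) -> &'static str { "Execution" }
    fn commit_threshold(&self) -> u64 { self.commit_threshold }
}
impl Stage for AccountHashing {
    fn id(&self) -> &'static str { "AccountHashing" }
    fn commit_threshold(&self) -> u64 { self.commit_threshold }
}

struct StageSetBuilder {
    stages: Vec<Box<dyn Stage>>,
}

impl StageSetBuilder {
    /// Replaces the stage that has the same ID as `stage`.
    ///
    /// Panics if the stage is not in this set.
    fn set<S: Stage + 'static>(mut self, stage: S) -> Self {
        let id = stage.id();
        let slot = self
            .stages
            .iter_mut()
            .find(|s| s.id() == id)
            .expect("stage is not in this set");
        *slot = Box::new(stage);
        self
    }
}

trait StageSet: Sized {
    /// Turns the set into a builder that can be customized.
    fn builder(self) -> StageSetBuilder;

    /// Convenience: override one stage without asking for the builder first.
    fn set<S: Stage + 'static>(self, stage: S) -> StageSetBuilder {
        self.builder().set(stage)
    }
}

struct ExecutionStages;

impl StageSet for ExecutionStages {
    fn builder(self) -> StageSetBuilder {
        StageSetBuilder {
            stages: vec![
                Box::new(Execution { commit_threshold: 1_000 }),
                Box::new(AccountHashing { commit_threshold: 100_000 }),
            ],
        }
    }
}

fn main() {
    // Override a single stage directly on the set.
    let set = ExecutionStages.set(Execution { commit_threshold: 5_000 });
    for stage in &set.stages {
        println!("{}: {}", stage.id(), stage.commit_threshold());
    }
}
```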

crates/stages/src/stages/headers.rs

crates/stages/src/stages/merkle.rs

onbjerg · 2023-01-24T18:27:56Z

crates/stages/src/stages/merkle.rs

+///
+/// This stage should be run with the above two stages, otherwise it is a no-op.
+///
+/// This stage is split in two: one for calculating hashes and one for unwinding. TODO: Why?


(cc @rakita?) I am not sure why this is split into an execution and an unwind part, but it should be in the docs

Where is the best place to put the docs? The Merkle stage depends on the hashing stages, so hashing needs to come before it on both the execution and unwind paths (unwind goes backwards). This solves the unwind ordering issue that is present in Erigon.

It should be in this rustdoc, my question is more precisely: Why does it need to unwind before the hashing stages?

Because of the hashing stages. Let's look at an example where, in plain state, Acc1 has balance1 and it gets updated to balance2.

On the execution path this would be:
The hashing stage calculates hash2 of Acc1 with balance2 (balance2, as it is the new state)
MerkleStage uses hash2 and removes hash1 (the previous hash).

Now for the unwind order: if MerkleStage came first, it would only have hash2, but it needs hash1. That is why the order needs to be:
The hashing stage reverts hash2 to hash1
MerkleStage uses hash1 and removes hash2 (the new hash that needs to be unwound)

It is called the Unwind MerkleStage because it is triggered only on unwind; on execution it does nothing.
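
The ordering argument above can be sketched as a toy pipeline. This assumes the unwind half sits before the hashing stage and the execution half after it, and that unwinding walks the stage list in reverse. The variant names and the string descriptions are illustrative assumptions, not the actual reth definitions.

```rust
// Illustrative only: variant names are assumed, not taken from the PR diff.
#[derive(Clone, Copy, Debug, PartialEq)]
enum MerkleStage {
    Execution,
    Unwind,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Stage {
    Hashing,
    Merkle(MerkleStage),
}

impl Stage {
    /// Work done on the execution path, or `None` if the stage is a no-op there.
    fn execute(self) -> Option<&'static str> {
        match self {
            Stage::Hashing => Some("hashing: compute hash2"),
            Stage::Merkle(MerkleStage::Execution) => Some("merkle: use hash2, remove hash1"),
            Stage::Merkle(MerkleStage::Unwind) => None,
        }
    }

    /// Work done on the unwind path, or `None` if the stage is a no-op there.
    fn unwind(self) -> Option<&'static str> {
        match self {
            Stage::Hashing => Some("hashing: revert hash2 to hash1"),
            Stage::Merkle(MerkleStage::Unwind) => Some("merkle: use hash1, remove hash2"),
            Stage::Merkle(MerkleStage::Execution) => None,
        }
    }
}

// Assumed pipeline order: unwind half, hashing, execution half.
const PIPELINE: [Stage; 3] = [
    Stage::Merkle(MerkleStage::Unwind),
    Stage::Hashing,
    Stage::Merkle(MerkleStage::Execution),
];

/// Stages run front to back on execution.
fn execution_path() -> Vec<&'static str> {
    PIPELINE.iter().copied().filter_map(Stage::execute).collect()
}

/// Stages run back to front on unwind.
fn unwind_path() -> Vec<&'static str> {
    PIPELINE.iter().rev().copied().filter_map(Stage::unwind).collect()
}

fn main() {
    println!("execute: {:?}", execution_path());
    println!("unwind:  {:?}", unwind_path());
}
```

In both directions the hashing work lands before the Merkle work, which is the property the split is meant to guarantee in this sketch.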

crates/stages/src/stages/mod.rs

Rjected

This is really cool, I like the StageSet abstraction and all the built-in StageSets. Along with the PipelineBuilder it should make configuring and running the pipeline very easy, especially in #623

crates/net/downloaders/src/headers/linear.rs

crates/stages/src/pipeline/event.rs

mattsse · 2023-01-25T10:25:22Z

crates/stages/src/pipeline/mod.rs

    }

    /// Run the pipeline in an infinite loop. Will terminate early if the user has specified
    /// a `max_block` in the pipeline.
    pub async fn run(&mut self, db: Arc<DB>) -> Result<(), PipelineError> {
        loop {
            let mut state = PipelineState {
-                events_sender: self.events_sender.clone(),
+                listeners: self.listeners.clone(),


This looks a bit weird, but there is nothing we can do about it in this PR.

Looking at self.run_loop(&mut state, it seems unnecessary to clone the listeners into the state, at least here; perhaps there are issues when it comes to the stages.

crates/stages/src/pipeline/set.rs

crates/stages/src/sets.rs

This stage was introduced in #972

mattsse

lgtm,
the builder abstraction is very useful indeed

crates/stages/src/sets.rs

These were introduced in #978

crates/stages/src/lib.rs

crates/stages/src/stages/merkle.rs

Co-authored-by: Georgios Konstantopoulos <[email protected]>

gakonst

Love it

crates/stages/src/sets.rs

gakonst · 2023-01-27T16:06:21Z

crates/stages/src/pipeline/set.rs

+        self
+    }
+
+    /// Disables the given stage.


smart! when do you think one would use this?

For us, probably to disable specific indexing stages when in non-archive mode down the line.

For crate consumers, possibly to replace some stages with their own implementation
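
A sketch of what disabling could look like, assuming the builder keeps an enabled flag next to each stage and skips disabled ones when it is built. The struct layout, the string stage IDs, and the method bodies are assumptions for illustration; the real builder stores actual stages rather than names.

```rust
// Hypothetical, simplified builder that tracks stages by ID only.
#[derive(Default)]
struct StageSetBuilder {
    // (stage id, enabled) pairs in pipeline order
    stages: Vec<(&'static str, bool)>,
}

impl StageSetBuilder {
    /// Appends a stage, enabled by default.
    fn add_stage(mut self, id: &'static str) -> Self {
        self.stages.push((id, true));
        self
    }

    /// Disables the given stage while keeping its position in the set.
    ///
    /// Panics if the stage is not in this set.
    fn disable(mut self, id: &'static str) -> Self {
        let entry = self
            .stages
            .iter_mut()
            .find(|entry| entry.0 == id)
            .expect("stage is not in this set");
        entry.1 = false;
        self
    }

    /// Returns the IDs of the enabled stages, in order.
    fn build(self) -> Vec<&'static str> {
        self.stages
            .into_iter()
            .filter(|entry| entry.1)
            .map(|entry| entry.0)
            .collect()
    }
}

fn main() {
    // For example, a non-archive node might skip an indexing stage.
    let stages = StageSetBuilder::default()
        .add_stage("Execution")
        .add_stage("TransactionLookup")
        .add_stage("IndexAccountHistory")
        .disable("IndexAccountHistory")
        .build();
    println!("{:?}", stages);
}
```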

crates/stages/src/pipeline/set.rs

codecov-commenter · 2023-01-27T16:40:00Z

Codecov Report

Merging #1017 (6a80568) into main (8cfe240) will decrease coverage by 0.37%.
The diff coverage is 38.48%.

@@            Coverage Diff             @@
##             main    #1017      +/-   ##
==========================================
- Coverage   75.14%   74.78%   -0.37%     
==========================================
  Files         313      317       +4     
  Lines       34299    34592     +293     
==========================================
+ Hits        25775    25869      +94     
- Misses       8524     8723     +199

| Flag | Coverage Δ | |
|---|---|---|
| unit-tests | `74.78% <38.48%> (-0.37%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| bin/reth/src/node/mod.rs | `0.00% <0.00%> (ø)` | |
| bin/reth/src/stage/mod.rs | `0.00% <ø> (ø)` | |
| bin/reth/src/test_eth_chain/runner.rs | `0.00% <ø> (ø)` | |
| crates/stages/src/lib.rs | `100.00% <ø> (ø)` | |
| crates/stages/src/pipeline/set.rs | `0.00% <0.00%> (ø)` | |
| crates/stages/src/sets.rs | `0.00% <0.00%> (ø)` | |
| crates/stages/src/stages/bodies.rs | `91.42% <ø> (-1.54%)` | ⬇️ |
| crates/stages/src/stages/execution.rs | `92.32% <ø> (ø)` | |
| crates/stages/src/stages/hashing_account.rs | `93.67% <0.00%> (-1.21%)` | ⬇️ |
| crates/stages/src/stages/hashing_storage.rs | `95.23% <0.00%> (-0.86%)` | ⬇️ |

... and 31 more

onbjerg added C-enhancement New feature or request A-staged-sync Related to staged sync (pipelines and stages) labels Jan 24, 2023

onbjerg requested review from rkrasiuk, mattsse, Rjected and gakonst as code owners January 24, 2023 18:20

onbjerg commented Jan 24, 2023

crates/net/downloaders/src/headers/linear.rs (resolved)

onbjerg commented Jan 24, 2023

crates/stages/src/stages/headers.rs (resolved)

onbjerg commented Jan 24, 2023

crates/stages/src/stages/merkle.rs (resolved)

onbjerg commented Jan 24, 2023

crates/stages/src/stages/mod.rs (resolved)

Rjected approved these changes Jan 25, 2023

mattsse reviewed Jan 25, 2023

mattsse mentioned this pull request Jan 27, 2023

Integrate TaskManager in reth main #1036

Closed

onbjerg added 4 commits January 27, 2023 15:58

fix: honor serde feature in reth-network

703a70b

feat: PipelineBuilder

3ed30b9

refactor: smol nit

2d6944b

test: fix port of events in pipeline tests

8973d57

onbjerg force-pushed the onbjerg/pipeline-builder branch from 52eba2a to 8973d57 Compare January 27, 2023 15:15

onbjerg added 2 commits January 27, 2023 16:19

refactor: return PipelineEvent event stream

da5c750

docs: rephrasing

b5306b9

onbjerg requested a review from mattsse January 27, 2023 15:22

onbjerg added 2 commits January 27, 2023 16:25

test: fix doctest

9f5e4aa

chore: add TransactionLookupStage

5f2a7cd

This stage was introduced in #972

mattsse approved these changes Jan 27, 2023

crates/stages/src/sets.rs (outdated, resolved)

chore: add history indexing stages

9135943

These were introduced in #978

gakonst reviewed Jan 27, 2023

crates/stages/src/lib.rs (resolved)

crates/stages/src/stages/merkle.rs (outdated, resolved)

docs: explain MerkleStage split

50a9d1c

Co-authored-by: Georgios Konstantopoulos <[email protected]>

gakonst approved these changes Jan 27, 2023

onbjerg added 2 commits January 27, 2023 17:09

chore: add reth-stages prelude module

0429786

chore: naming

6a80568

onbjerg merged commit ba44c15 into main Jan 27, 2023

onbjerg deleted the onbjerg/pipeline-builder branch January 27, 2023 17:21

onbjerg mentioned this pull request Jan 27, 2023

feat: RethBuilder #963

Closed

Rjected mentioned this pull request Jan 27, 2023

feat(net): test syncing from geth #623

Merged

18 tasks

literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 5, 2023

feat: pipeline builder (paradigmxyz#1017)

d5aac1a

Co-authored-by: Georgios Konstantopoulos <[email protected]>

literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 5, 2023

feat: pipeline builder (paradigmxyz#1017)

b8b0df1

Co-authored-by: Georgios Konstantopoulos <[email protected]>

literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 5, 2023

feat: pipeline builder (paradigmxyz#1017)

aa2bd8b

Co-authored-by: Georgios Konstantopoulos <[email protected]>

literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 5, 2023

feat: pipeline builder (paradigmxyz#1017)

b908ac0

Co-authored-by: Georgios Konstantopoulos <[email protected]>

literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 6, 2023

feat: pipeline builder (paradigmxyz#1017)

234df66

Co-authored-by: Georgios Konstantopoulos <[email protected]>

literallymarvellous pushed a commit to literallymarvellous/reth that referenced this pull request Feb 6, 2023

feat: pipeline builder (paradigmxyz#1017)

c50f4fd

Co-authored-by: Georgios Konstantopoulos <[email protected]>
