
op-node: Switch channel bank and channel in reader to a pull based stage#3596

Merged
mslipper merged 5 commits into jg/l1retrieval from jg/channel_in_reader
Sep 28, 2022
Conversation

@trianglesphere
Contributor

Description

This PR is two commits because the intermediate commit was failing a test and the second commit fixes it.

  • op-node: Switch channel bank to be pull based

    This again requires a fair amount of changes to channel_in_reader.go
    for the channel in reader to maintain its progress state.

  • op-node: Switch channel in reader to a pull based stage
    Like the rest of the changes, this also required modifications to
    the next stage, the batch queue, so that it manages the progress
    API. This commit required even more changes than usual. I changed
    the pipeline to reset to a common starting point, and now use the
    L2SafeHead block to filter out adding batches and L1 blocks to
    the batch queue.

Metadata

  • Fixes ENG-2749

@changeset-bot

changeset-bot bot commented Sep 28, 2022

⚠️ No Changeset found

Latest commit: 32f09d1

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@mergify
Contributor

mergify bot commented Sep 28, 2022

This PR changes implementation code, but doesn't include a changeset. Did you forget to add one?

@trianglesphere trianglesphere marked this pull request as ready for review September 28, 2022 21:12
Contributor

@protolambda protolambda left a comment


LGTM. Left some comments about the interesting bits (to me), but nothing blocking merge, nice work 👍

Comment on lines +443 to +446
} else if err != nil && errors.Is(err, derive.NotEnoughData) {
stepAttempts = 0 // don't do a backoff for this error
reqStep()
continue
Copy link
Contributor


This is quite subtle, but it preserves one of the best things from the old design, which I think was the most undervalued part of the redesign work.

While the pipeline is deriving, a stage can choose to stop and return this error, handing back control to the caller (driver). This keeps the derivation pipeline nice and synchronous for Cannon usage, while handling cases where there's a slow background fetching thing or something requiring repeated smaller updates (like here with the channels). And thus we can put a short and strict timeout on the pipeline work, while keeping the whole thing snappy and simple.

The alternative would be to make the pipeline blocking and essentially halt until we pull either a batch or an EOF, but then it would hold the Engine control captive unless we decouple that further (and deal with queuing and async stuff there without breaking Cannon compatibility), which I think isn't worth it (but maybe with a good design can be done eventually at a later point, since it's not a breaking protocol change).

cc @norswap this is related to the thing we started to discuss.

Contributor Author


This error was mainly introduced to simplify some of the code (so it doesn't need to loop inside a function). Honestly, I'm really skeptical about the benefit of yielding back to the caller: it tends to introduce a lot of extra internal state to be able to resume, and I'm not sure what the benefit is. Having a blocking pipeline is really simple and easy to understand. In addition, because the fetcher is just an interface, it's really easy to sub it out for Cannon.

Contributor


If the pipeline blocks, we can't interact with the engine from the driver without knowing whether we'd interfere with the engine state, or whether things change while we use the engine. Handing control back to the caller so we don't lock the engine is important IMHO. Agreed that blocking is easier, but I don't believe we should block for too long on something that wraps a resource as important as the engine.

trianglesphere and others added 2 commits September 28, 2022 22:52

  • op-node: Switch batch queue to be pull based

    The attributes queue actually required fairly few modifications to
    work with the progress API. Switching the batch queue over was a bit
    more complex because the batch queue is very stateful, but still not
    the worst. The progress API is very nearly removed from the engine
    queue stage.
@mslipper mslipper merged commit c77bf2b into jg/l1retrieval Sep 28, 2022
@mslipper mslipper deleted the jg/channel_in_reader branch September 28, 2022 22:54