Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: handle a race condition between the signer and the /v2/pox endpoint #4738

Merged
merged 6 commits into from
May 3, 2024

Conversation

obycode
Copy link
Contributor

@obycode obycode commented May 1, 2024

When the signer is processing a new burn block, it may hit the /v2/pox endpoint before it has been updated with the latest block. This change will check for this case and retry until it receives the expected cycle.

This also fixes a problem with refreshing the signer too early. It should only attempt to refresh the signer during the next prepare phase. This handles the case where for the first block in a cycle (height % cycle_length == 0), it will report that it is in cycle N, but it will also report that it is in the prepare phase. This was resulting in refreshing the signer config too early. For example, with a cycle length of 20, at block 160, we would see a log:

Received a new burnchain block height (160) in the prepare phase of the next reward cycle (9). Checking for signer registration...

This is incorrect, because block 160 is not in the prepare phase for cycle 9.

When the signer is processing a new burn block, it may hit the /v2/pox
endpoint before it has been updated with the latest block. This change
will check for this case and retry until it receives the expected cycle.
This handles the case where for the first block in a cycle (height %
cycle_length == 0), it will report that it is in cycle N, but it will
also report that it is in the prepare phase. This was resulting in
refreshing the signer config too early. For example, with a cycle length
of 20, at block 160, we would see a log:

```
Received a new burnchain block height (160) in the prepare phase of the
next reward cycle (9). Checking for signer registration...
```

This is incorrect, because block 160 is not in the prepare phase for
cycle 9.
@obycode obycode requested review from jferrant and hstove May 1, 2024 14:16
@obycode
Copy link
Contributor Author

obycode commented May 1, 2024

I got a little excited and pushed too soon. Will fix and post here when it is ready for review.

@obycode obycode force-pushed the fix/signer-cycle-transition branch from faae28d to 40b1603 Compare May 1, 2024 15:09
jferrant
jferrant previously approved these changes May 1, 2024
@obycode
Copy link
Contributor Author

obycode commented May 1, 2024

Ok, I think the last change is good. I am testing now locally.

@obycode
Copy link
Contributor Author

obycode commented May 1, 2024

Something is not working still with this yet. The signers are never recognizing that they are registered. Investigating now.

@obycode obycode marked this pull request as draft May 1, 2024 17:45
@obycode obycode marked this pull request as ready for review May 1, 2024 21:07
@obycode obycode requested a review from jferrant May 1, 2024 21:07
@obycode
Copy link
Contributor Author

obycode commented May 1, 2024

Ok, ready for review again. 🙏

Copy link
Contributor

@hstove hstove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@saralab saralab requested a review from kantai May 3, 2024 13:50
Copy link
Member

@kantai kantai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- can you create an issue for testing this scenario? I think that we want to make sure in a realistic testing environment that signers are able to sign new blocks during the 0th tenure of a reward cycle. If there's a race condition in the /v2/pox endpoint which prevents it from updating on time, its possible it could trigger "off by one" like behavior for the 0th tenure. I think this patch fixes that, but its something we want to be sure to confirm.

@obycode obycode added this pull request to the merge queue May 3, 2024
Merged via the queue into develop with commit 54ae125 May 3, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants