-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: remove go routines for RecheckTx #1553
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow totally did not realize v1 was doing this
@@ -713,34 +708,23 @@ func (txmp *TxMempool) recheckTransactions() { | |||
|
|||
// Issue CheckTx calls for each remaining transaction, and when all the | |||
// rechecks are complete signal watchers that transactions may be available. | |||
go func() { | |||
g, start := taskgroup.New(nil).Limit(2 * runtime.NumCPU()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think it would be possible to avoid this issue without blocking by simply not using a taskgroup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be watertight it should be blocking over the period that state is updating else as this is still a go routine, transactions may be submitted before updating to the latest nonce with the checktxstate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the mempool locked while the goroutine is still running?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no it's not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could happen in practice that some of these CheckTx transactions happen while the mempool is locked but it's not guaranteed
return nil | ||
}) | ||
for _, wtx := range wtxs { | ||
wtx := wtx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not necessary any more b/c this repo uses Go 1.23 and Go 1.22 included:
Previously, the variables declared by a “for” loop were created once and updated by each iteration. In Go 1.22, each iteration of the loop creates new variables, to avoid accidental sharing bugs. The transition support tooling described in the proposal continues to work in the same way it did in Go 1.21.
Ref: https://tip.golang.org/doc/go1.22
wtx := wtx |
oops disregard, I see this was just how it was previously implemented
Upgrade to https://github.com/celestiaorg/celestia-core/releases/tag/v1.44.1-tm-v0.34.35 to pick up this fix celestiaorg/celestia-core#1553 (cherry picked from commit 9f3b196)
This is residual of #1553 The problem is now even more subtle. Because the mempool mutexes weren't over both CheckTx and the actual adding of the transaction to the mempool we occasionally hit a situation as follows: - CheckTx a tx with nonce 2 - Before saving it to the store, collect all transactions and recheck tx. This excludes the last tx with nonce 2, thus the nonce in the state machine is still 1 - Save the tx to the pool - New tx comes in with nonce 3. The application is at 1 so rejects it expecting the next to be 2. This PR fixes this pattern, however this won't be watertight for the CAT pool until we can order both on gas fee (priority) and nonce. --------- Co-authored-by: Rootul Patel <[email protected]>
This PR makes two small tweaks: - Fixes `TestConcurrentTxSubmission` by adding a capacity of 1 to the errCh. Currently errors were being ignored because the wait group meant that there wasn't a process to read to the channel as it was being written to. This fixes this - Catches the case where a user cancels the context when calling `ConfirmTx` **This test is broken until celestiaorg/celestia-core#1553 is resolved** --------- Co-authored-by: nina / ნინა <[email protected]>
This PR makes two small tweaks: - Fixes `TestConcurrentTxSubmission` by adding a capacity of 1 to the errCh. Currently errors were being ignored because the wait group meant that there wasn't a process to read to the channel as it was being written to. This fixes this - Catches the case where a user cancels the context when calling `ConfirmTx` **This test is broken until celestiaorg/celestia-core#1553 is resolved** --------- Co-authored-by: nina / ნინა <[email protected]> (cherry picked from commit f21716b)
This PR makes two small tweaks: - Fixes `TestConcurrentTxSubmission` by adding a capacity of 1 to the errCh. Currently errors were being ignored because the wait group meant that there wasn't a process to read to the channel as it was being written to. This fixes this - Catches the case where a user cancels the context when calling `ConfirmTx` **This test is broken until celestiaorg/celestia-core#1553 is resolved**<hr>This is an automatic backport of pull request #4104 done by [Mergify](https://mergify.com). Co-authored-by: Callum Waters <[email protected]>
Closes: #1552