-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve deterministic transaction sorting in TxQ #4077
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. The approach looks good. I appreciate the time you took to try out different ways of re-sorting the queued transactions.
I left a few comments for your consideration. I'll be interested to hear your take on them.
@@ -1740,7 +1775,7 @@ TxQ::getTxRequiredFeeAndSeq( | |||
} | |||
|
|||
std::vector<TxQ::TxDetails> | |||
TxQ::getAccountTxs(AccountID const& account, ReadView const& view) const | |||
TxQ::getAccountTxs(AccountID const& account) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks for removing the unused argument.
@@ -1761,7 +1796,7 @@ TxQ::getAccountTxs(AccountID const& account, ReadView const& view) const | |||
} | |||
|
|||
std::vector<TxQ::TxDetails> | |||
TxQ::getTxs(ReadView const& view) const | |||
TxQ::getTxs() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good spotting! Thanks for removing the unused argument.
src/ripple/app/misc/impl/TxQ.cpp
Outdated
// std::cerr << "Changing parent hash for ledger " << | ||
// view.info().seq << " from " << parentHash_ << | ||
// " to " << parentHash << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe remove the commented out writes to cerr
or turn it into a JLOG
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh geez. I completely forgot to take those out. Whoops. 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
return (lhs.txID ^ MaybeTx::parentHashComp) < | ||
(rhs.txID ^ MaybeTx::parentHashComp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The doxygen comment preceding this change probably needs to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
// In the absence of a "re-sort the list in place" function, this | ||
// was the fastest method tried to repopulate the list. | ||
// Other methods included: create a new list and moving items over one at a | ||
// time, create a new list and merge the old list into it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate you trying the various techniques and leaving a comment that you did so. That leaves me less inclined to do my own experiments.
src/test/app/TxQ_test.cpp
Outdated
BEAST_EXPECT(metrics.minProcessingFeeLevel == expectedMin); | ||
BEAST_EXPECT(metrics.medFeeLevel == expectedMed); | ||
using namespace std::string_literals; | ||
BEAST_EXPECTS( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great change. It's really useful for the error message to display the expected vs actual values.
Since we're in here improving error reporting, I'd like to suggest we take it a step further. We could also report the line that the error comes from. Line 55 (this actual line) is not very useful since there are dozens of lines in the tests that call this error checking facility. We'd like to know which calling line had the error.
That can be accomplished by having the caller pass in their line number. It might look something like...
checkMetrics(
int line,
jtx::Env& env,
std::size_t expectedCount,
std::optional<std::size_t> expectedMaxCount,
std::size_t expectedInLedger,
std::size_t expectedPerLedger,
std::uint64_t expectedMinFeeLevel,
std::uint64_t expectedMedFeeLevel = 256 * 500)
Then we explicitly add the line information to the fail()
call, like:
metrics.txCount == expectedCount
? pass()
: fail(
"txCount: "s + std::to_string(metrics.txCount) + "/" +
std::to_string(expectedCount),
__FILE__,
line);
Then callers need to pass the line number in, like...
checkMetrics(__LINE__, env, 0, flagMaxQueue, 0, flagPerLedger, 256);
You can see what I have in mind in the top-most commit here: https://github.com/scottschurr/rippled/commits/ed-txq-order-lcl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great idea, and I'm totally stealing it.
src/test/app/TxQ_test.cpp
Outdated
/* | ||
std::cerr << "Acct: " << tx.txn->at(sfAccount) << | ||
", Seq: " << tx.txn->at(sfSequence) << std::endl; | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why this was put in. But I don't think we want to leave the write to cerr
in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/test/app/TxQ_test.cpp
Outdated
/* | ||
std::cerr << "Queued:" << std::endl; | ||
for (auto const& pair : qTxCount1) | ||
{ | ||
std::cerr << "\t( " << pair.first << ", " << pair.second << | ||
" )" << std::endl; | ||
} | ||
std::cerr << std::endl; | ||
std::cerr << "alice: " << alice.id() << ", seq: " << aliceSeq << | ||
", q: " << qTxCount1[alice.id()] << std::endl; | ||
std::cerr << "bob: " << bob.id() << ", seq: " << bobSeq << | ||
", q: " << qTxCount1[bob.id()] << std::endl; | ||
std::cerr << "charlie: " << charlie.id() << ", seq: " << charlieSeq << | ||
", q: " << qTxCount1[charlie.id()] << std::endl; | ||
std::cerr << "daria: " << daria.id() << ", seq: " << dariaSeq << | ||
", q: " << qTxCount1[daria.id()] << std::endl; | ||
std::cerr << "elmo: " << elmo.id() << ", seq: " << elmoSeq << | ||
", q: " << qTxCount1[elmo.id()] << std::endl; | ||
std::cerr << "fred: " << fred.id() << ", seq: " << fredSeq << | ||
", q: " << qTxCount1[fred.id()] << std::endl; | ||
std::cerr << "gwen: " << gwen.id() << ", seq: " << gwenSeq << | ||
", q: " << qTxCount1[gwen.id()] << std::endl; | ||
std::cerr << "hank: " << hank.id() << ", seq: " << hankSeq << | ||
", q: " << qTxCount1[hank.id()] << std::endl; | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to remove the commented out writes to cerr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/test/app/TxQ_test.cpp
Outdated
/* | ||
std::cerr << "Queued:" << std::endl; | ||
for (auto const& pair : qTxCount1) | ||
{ | ||
std::cerr << "\t( " << pair.first << ", " << pair.second << | ||
" )" << std::endl; | ||
} | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to remove the commented out writes to cerr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/test/app/TxQ_test.cpp
Outdated
/* | ||
std::cerr << "Queued:" << std::endl; | ||
for (auto const& pair : qTxCount2) | ||
{ | ||
std::cerr << "\t( " << pair.first << ", " << pair.second << | ||
" )" << std::endl; | ||
} | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to remove the commented out writes to cerr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks great! Thanks for doing this.
01f56e2
to
a491a84
Compare
Rebased on to 1.8.5 and squashed. |
d63702d
to
69b40a1
Compare
* Txs with the same fee level will sort by TxID XORed with the parent ledger hash. * The TxQ is re-sorted after every ledger. * Attempt to future-proof the TxQ tie breaking test
High Level Overview of Change
This is a follow-up to #4022. That PR added functionality to order the transaction queue by fee descending, then by transaction ID ascending. This PR makes the transaction ordering less predictable by XORing the transaction ID with the parent ledger hash. This is similar to the mechanism used in
CanonicalTXSet
to randomize the order of accounts, so no account has an advantage. This change randomizes the order of transactions paying the same fee, so no transaction has an advantage.After weighing several options, I decided that the simplest solution was to leave the data structure intact and only change the comparison function - basically the same type of change made to transaction queue ordering in #4022. The trick to making it work is to reorder the queue whenever the parent ledger hash changes, which happens in
TxQ::accept
. Again, to keep things simple, the queue data structure (byFee_
) is emptied and rebuilt. That structure is intrusive - the objects are actually scoped in thebyAccount_
structure - so there is zero risk of anything getting dropped. It turns out this method is fast, too.As validators upgrade to the version with this change, if the network is under high load, such that there are a lot of transactions in the queues, their initial proposals will diverge, but that won't be a problem because the issues around transactions not being properly deferred have been fixed.
No amendment is needed for this change.
Type of Change
Test Plan