-
Notifications
You must be signed in to change notification settings - Fork 349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: ABCI methods benchmarks #3904
Conversation
Note Reviews pausedUse the following commands to manage reviews:
📝 Walkthrough📝 WalkthroughWalkthroughThis pull request introduces a suite of benchmark tests for transaction processing functionalities in the Celestia application, focusing on Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (3)
- app/benchmark_test.go (1 hunks)
- pkg/appconsts/initial_consts.go (1 hunks)
- test/util/test_app.go (1 hunks)
🔇 Additional comments not posted (2)
pkg/appconsts/initial_consts.go (1)
14-14
: Verify the implications of doublingDefaultGovMaxSquareSize
.The
DefaultGovMaxSquareSize
has been increased from 64 to 128, which effectively quadruples theDefaultMaxBytes
(since it's calculated as the square ofDefaultGovMaxSquareSize
). This significant increase may have substantial impacts on network performance and resource requirements.Consider the following:
- Does this change align with the PR objectives of adding benchmarks for 8MB message sending?
- Have performance tests been conducted to ensure the network can handle this increased load?
- Are there any downstream effects on other parts of the system that rely on these constants?
To assess the impact, let's check for usages of this constant and related calculations:
Consider adding comments explaining the rationale behind this change and any performance considerations. This will help future maintainers understand the decision-making process.
test/util/test_app.go (1)
191-191
: Ensure this change aligns with benchmark requirementsThe modification to use
cparams.Block.MaxBytes
instead of a constant value forMaxBytes
allows for more flexible configuration of the maximum block size in test environments. This change seems to align with the PR objective of adding benchmarks for various block sizes, including 8MB blocks.However, please verify the following:
- Confirm that this change doesn't affect existing tests that may rely on the previous constant value.
- Ensure that the
cparams
passed to this function in the new benchmark tests are correctly set to achieve the desired block sizes (e.g., 8MB for the 8MB benchmarks).- Consider adding a comment explaining the reason for this change to improve code maintainability.
To ensure this change doesn't introduce unintended side effects, let's check its usage:
✅ Verification successful
Change to use
cparams.Block.MaxBytes
is correctly propagatedThe modification to set
MaxBytes
usingcparams.Block.MaxBytes
intest/util/test_app.go
has been verified across the codebase. All relevant tests correctly configurecparams.Block.MaxBytes
, ensuring that the change aligns with the PR objectives and does not introduce unintended side effects.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for all occurrences of NewTestAppWithGenesisSet to verify if any existing tests might be affected rg --type go "NewTestAppWithGenesisSet" --glob '!test/util/test_app.go' # Search for new benchmark tests that use this function to confirm proper usage rg --type go "NewTestAppWithGenesisSet.*cparams" app/benchmark_test.goLength of output: 357
Script:
#!/bin/bash # Inspect the configuration of cparams.Block.MaxBytes in grpc_query_test.go rg "cparams\.Block\.MaxBytes" x/minfee/grpc_query_test.go # Inspect the configuration of cparams.Block.MaxBytes in tx_size_test.go rg "cparams\.Block\.MaxBytes" app/ante/tx_size_test.goLength of output: 116
Script:
#!/bin/bash # Search for all occurrences of cparams.Block.MaxBytes in the codebase rg --type go "cparams\.Block\.MaxBytes"Length of output: 408
First results:
which contains 31645 msg send transactions in a block totalling 7.4mb
which contains 31645 msg send transactions in a block totalling 7.4mb cc @cmwaters |
Concerning the second part of the issue which requires testing the update client transaction. Does it make sense to instead test a multisig send transaction and have the same number of signatures (150 for example). This would be the same cost if we just want to check how costly it is to verify the signatures. |
Is this just because it's far simpler to do that than create the IBC headers? I think a multi sig can only have at most 7 transactions |
Am I reading this correctly that it takes 6 seconds for prepare proposal and almost 5 seconds for process proposal? (at 8MB of SendMsgs) That's quite a bit. Can you also measure how long it takes to have 8MB worth of send msgs via Do you know why And then finally it would be good to compare this with 8MB worth of blob transactions that are each 4KB in size (number picked at random - I actually don't know what the average PFB size is). |
yes.
Will work on the points and get you the results. |
Ok so that seems like a significant amount of time given we want blocks to be sub 6 seconds. If we're going to limit the gas in a block can you also work out the total gas of those two respective blocks I.e. (all MsgSend and all 4KB PFBs) 🙏 |
For the PFBs, I added the following benchmarks:
|
Same for process proposal, I added
|
Same for check tx, I added
|
Same for deliver tx, I added
|
I will working on the remaining benchmarks and share the results once I have them |
For an 8mb block full of msg send:
For the PFBs, the total gas is in the previous comments. |
|
IBC update client
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
app/benchmarks/README.md
Outdated
|
||
## Key takeaways | ||
|
||
We decided to softly limit the number of messages contained in a block, via introducing the `NonPFBTransactionCap` and `PFBTransactionCap`, and checking against them in prepare proposal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[optional] rename constants if the other PR merges
@@ -0,0 +1,839 @@ | |||
<!-- markdownlint-disable --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] why disable markdownlint for this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because it complains about redundant headings all over the document so I decided not to lint it
Co-authored-by: Rootul P <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 17
🧹 Outside diff range and nitpick comments (3)
app/benchmarks/README.md (3)
9-16
: Consider explaining the purpose of the tag.The instructions for running the benchmarks are clear. However, it might be helpful to briefly explain the purpose of the
-tags=bench_abci_methods
flag for users who are unfamiliar with it.
17-19
: Consider adding a brief summary of key results.While referring to a separate document for detailed results is appropriate, it might be helpful to include a brief summary or highlight key findings in this README. This would give readers a quick overview without needing to navigate to another document.
21-27
: Improve clarity and fix grammatical issues in the "Key takeaways" section.The content provides valuable information, but consider the following improvements:
Clarify the purpose of
NonPFBTransactionCap
andPFBTransactionCap
. What do these caps represent?Address the grammatical issues:
- Line 25: Add a comma after "consensus" for better readability.
- Line 27: Consider rephrasing to "As specified in the results document, these results were generated on a 16-core, 48GB RAM machine, which gave us certain thresholds."
- Line 27: Change "when we run" to "when we ran" for consistent past tense.
Consider explaining why the lower numbers from the recommended validator setup were used for the limits. This would provide context for the decision-making process.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~25-~25: Possible missing comma found.
Context: ...hem reached consensus, it will still be accepted since this rule is not consensus breaki...(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~27-~27: You might be missing the article “a” here.
Context: ...cument, those results were generated on 16 core 48GB RAM machine, and gave us cert...(AI_EN_LECTOR_MISSING_DETERMINER_A)
[uncategorized] ~27-~27: This verb may not be in the correct tense. Consider changing the tense to fit the context better.
Context: ...us certain thresholds. However, when we run the same experiments on the recommended...(AI_EN_LECTOR_REPLACEMENT_VERB_TENSE)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (4)
- app/benchmarks/README.md (1 hunks)
- app/benchmarks/benchmark_ibc_update_client_test.go (1 hunks)
- app/benchmarks/benchmark_msg_send_test.go (1 hunks)
- app/benchmarks/benchmark_pfb_test.go (1 hunks)
🧰 Additional context used
🪛 LanguageTool
app/benchmarks/README.md
[uncategorized] ~25-~25: Possible missing comma found.
Context: ...hem reached consensus, it will still be accepted since this rule is not consensus breaki...(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~27-~27: You might be missing the article “a” here.
Context: ...cument, those results were generated on 16 core 48GB RAM machine, and gave us cert...(AI_EN_LECTOR_MISSING_DETERMINER_A)
[uncategorized] ~27-~27: This verb may not be in the correct tense. Consider changing the tense to fit the context better.
Context: ...us certain thresholds. However, when we run the same experiments on the recommended...(AI_EN_LECTOR_REPLACEMENT_VERB_TENSE)
🔇 Additional comments (3)
app/benchmarks/README.md (1)
1-8
: LGTM: Clear introduction and benchmark types.The introduction effectively outlines the purpose of the package and clearly lists the types of transactions being benchmarked.
app/benchmarks/benchmark_msg_send_test.go (1)
1-23
: LGTM: Appropriate build tag and imports for benchmark tests.The build tag
bench_abci_methods
is correctly used to isolate these benchmarks. The imports cover all necessary packages for the benchmark tests, including Celestia app components, Cosmos SDK types, and Tendermint ABCI types.app/benchmarks/benchmark_pfb_test.go (1)
156-177
: Verify implementation of helper functionsThe benchmark for
PrepareProposal
looks well-structured. However, it relies on two helper functions that are not visible in this file:calculateBlockSizeInMb
andcalculateTotalGasUsed
. To ensure the accuracy of the reported metrics, we should verify the implementation of these functions.Could you please provide the implementation of
calculateBlockSizeInMb
andcalculateTotalGasUsed
functions? This will help ensure that the metrics are being calculated correctly.✅ Verification successful
Helper functions correctly implemented
The implementations of
calculateBlockSizeInMb
andcalculateTotalGasUsed
accurately perform their intended calculations, ensuring the benchmark metrics are reliable.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for the implementation of helper functions rg -A 10 'func calculateBlockSizeInMb' rg -A 10 'func calculateTotalGasUsed'Length of output: 1377
@coderabbitai pause |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 13
🧹 Outside diff range and nitpick comments (3)
app/benchmarks/benchmark_pfb_test.go (1)
1-381
: Summary of improvement suggestions for the benchmark suite
- Utilize
b.N
for iterations in most benchmark functions to improve accuracy and stability of results.- Generate unique blobs for each transaction in
generatePayForBlobTransactions
to better simulate real-world scenarios.- Improve error handling in the binary search of
benchmarkProcessProposalPFBHalfSecond
to provide more context when maximum iterations are reached.These changes will enhance the reliability and realism of the benchmark suite, providing more accurate performance measurements for the Celestia application's handling of Pay For Blob (PFB) transactions.
app/benchmarks/benchmark_ibc_update_client_test.go (2)
132-146
: Consider adding comments to explain the test case valuesThe
testCases
array inBenchmarkIBC_PrepareProposal_Update_Client_Multi
defines specificnumberOfTransactions
andnumberOfValidators
. Adding comments to explain the rationale behind these values and how they impact the benchmarks would enhance code readability and assist future maintainers in understanding the testing scenarios.
184-198
: Add comments to clarify test case configurationsIn
BenchmarkIBC_ProcessProposal_Update_Client_Multi
, thetestCases
array specifiesnumberOfTransactions
andnumberOfValidators
. Providing explanations for the chosen values can help others understand the benchmarking context and the relationship between the number of transactions and validators.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (3)
- app/benchmarks/benchmark_ibc_update_client_test.go (1 hunks)
- app/benchmarks/benchmark_msg_send_test.go (1 hunks)
- app/benchmarks/benchmark_pfb_test.go (1 hunks)
🧰 Additional context used
🔇 Additional comments (11)
app/benchmarks/benchmark_msg_send_test.go (4)
284-308
: Minor improvements for generateMsgSendTransactions helper function.The
generateMsgSendTransactions
function is well-implemented, but consider the following improvements:
- Add error handling for
testutil.SetupTestAppWithGenesisValSetAndMaxSquareSize
.- Consider using a constant for the
MaxSquareSize
parameter (128).- Use a named return value for better documentation.
Here's a suggested improvement:
-func generateMsgSendTransactions(b *testing.B, count int) (*app.App, [][]byte) { +func generateMsgSendTransactions(b *testing.B, count int) (testApp *app.App, rawTxs [][]byte) { + const maxSquareSize = 128 account := "test" - testApp, kr := testutil.SetupTestAppWithGenesisValSetAndMaxSquareSize(app.DefaultConsensusParams(), 128, account) + var err error + testApp, kr, err := testutil.SetupTestAppWithGenesisValSetAndMaxSquareSize(app.DefaultConsensusParams(), maxSquareSize, account) + require.NoError(b, err) addr := testfactory.GetAddress(kr, account) enc := encoding.MakeConfig(app.ModuleEncodingRegisters...) acc := testutil.DirectQueryAccount(testApp, addr) signer, err := user.NewSigner(kr, enc.TxConfig, testutil.ChainID, appconsts.LatestVersion, user.NewAccount(account, acc.GetAccountNumber(), acc.GetSequence())) require.NoError(b, --- `118-139`: _:hammer_and_wrench: Refactor suggestion_ **Optimize PrepareProposal benchmark for 8MB of transactions.** The current implementation can be improved for more accurate results: 1. Run the benchmark for `b.N` iterations. 2. Consider adding a throughput metric using `b.SetBytes()`. Here's a suggested improvement: ```diff func BenchmarkPrepareProposal_MsgSend_8MB(b *testing.B) { // a full 8mb block equals to around 31645 msg send transactions. // using 31645 to let prepare proposal choose the maximum testApp, rawTxs := generateMsgSendTransactions(b, 31645) blockData := &tmproto.Data{ Txs: rawTxs, } prepareProposalRequest := types.RequestPrepareProposal{ BlockData: blockData, ChainId: testApp.GetChainID(), Height: 10, } + totalBytes := int64(0) + for _, tx := range rawTxs { + totalBytes += int64(len(tx)) + } + b.SetBytes(totalBytes) b.ResetTimer() - resp := testApp.PrepareProposal(prepareProposalRequest) - b.StopTimer() - require.GreaterOrEqual(b, len(resp.BlockData.Txs), 1) - b.ReportMetric(float64(len(resp.BlockData.Txs)), "number_of_transactions") - b.ReportMetric(calculateBlockSizeInMb(resp.BlockData.Txs), "block_size(mb)") - b.ReportMetric(float64(calculateTotalGasUsed(testApp, resp.BlockData.Txs)), "total_gas_used") + for i := 0; i < b.N; i++ { + resp := testApp.PrepareProposal(prepareProposalRequest) + require.GreaterOrEqual(b, len(resp.BlockData.Txs), 1) + b.ReportMetric(float64(len(resp.BlockData.Txs)), "number_of_transactions") + b.ReportMetric(calculateBlockSizeInMb(resp.BlockData.Txs), "block_size(mb)") + b.ReportMetric(float64(calculateTotalGasUsed(testApp, resp.BlockData.Txs)), "total_gas_used") + } }These changes will provide more accurate benchmarking results by running for multiple iterations and reporting throughput in bytes per second.
Likely invalid or redundant comment.
65-78
: 🛠️ Refactor suggestionImprove benchmark accuracy for DeliverTx with a single transaction.
The current implementation only runs the benchmark once. To get more accurate results:
- Move
b.ResetTimer()
just before the operation being measured.- Run the benchmark for
b.N
iterations.Here's a suggested improvement:
func BenchmarkDeliverTx_MsgSend_1(b *testing.B) { testApp, rawTxs := generateMsgSendTransactions(b, 1) deliverTxRequest := types.RequestDeliverTx{ Tx: rawTxs[0], } b.ResetTimer() - resp := testApp.DeliverTx(deliverTxRequest) - b.StopTimer() - require.Equal(b, uint32(0), resp.Code) - require.Equal(b, "", resp.Codespace) - b.ReportMetric(float64(resp.GasUsed), "gas_used") + for i := 0; i < b.N; i++ { + resp := testApp.DeliverTx(deliverTxRequest) + require.Equal(b, uint32(0), resp.Code) + require.Equal(b, "", resp.Codespace) + b.ReportMetric(float64(resp.GasUsed), "gas_used") + } }This change ensures that the benchmark runs for the correct number of iterations and only measures the time taken by
DeliverTx
.Likely invalid or redundant comment.
141-173
: 🛠️ Refactor suggestionImprove ProcessProposal benchmark accuracy for a single transaction.
The current implementation only runs the benchmark once. To get more accurate results:
- Run the benchmark for
b.N
iterations.- Consider adding a throughput metric using
b.SetBytes()
.Here's a suggested improvement:
func BenchmarkProcessProposal_MsgSend_1(b *testing.B) { testApp, rawTxs := generateMsgSendTransactions(b, 1) blockData := &tmproto.Data{ Txs: rawTxs, } prepareProposalRequest := types.RequestPrepareProposal{ BlockData: blockData, ChainId: testApp.GetChainID(), Height: 10, } prepareProposalResponse := testApp.PrepareProposal(prepareProposalRequest) require.GreaterOrEqual(b, len(prepareProposalResponse.BlockData.Txs), 1) processProposalRequest := types.RequestProcessProposal{ BlockData: prepareProposalResponse.BlockData, Header: tmproto.Header{ Height: 1, DataHash: prepareProposalResponse.BlockData.Hash, ChainID: testutil.ChainID, Version: version.Consensus{ App: testApp.AppVersion(), }, }, } + b.SetBytes(int64(len(rawTxs[0]))) b.ResetTimer() - resp := testApp.ProcessProposal(processProposalRequest) - b.StopTimer() - require.Equal(b, types.ResponseProcessProposal_ACCEPT, resp.Result) - - b.ReportMetric(float64(calculateTotalGasUsed(testApp, prepareProposalResponse.BlockData.Txs)), "total_gas_used") + for i := 0; i < b.N; i++ { + resp := testApp.ProcessProposal(processProposalRequest) + require.Equal(b, types.ResponseProcessProposal_ACCEPT, resp.Result) + b.ReportMetric(float64(calculateTotalGasUsed(testApp, prepareProposalResponse.BlockData.Txs)), "total_gas_used") + } }These changes will provide more accurate benchmarking results by running for multiple iterations and reporting throughput in bytes per second.
Likely invalid or redundant comment.
app/benchmarks/benchmark_pfb_test.go (7)
5-25
: Import statements look goodThe necessary packages for benchmarking, crypto operations, and Celestia-specific modules are correctly imported. There are no unused imports.
27-29
: Initialization looks goodSetting
TestAppLogger
to a no-op logger in theinit
function is a good practice for benchmarks. This prevents logging overhead from affecting the benchmark results.
60-76
: Function structure looks good, but iteration issue was addressed in previous comment
107-125
: Function structure looks good, but iteration issue was addressed in previous comment
156-177
: Function structure looks good, but iteration issue was addressed in previous comment
208-245
: Function structure looks good, but iteration issue was addressed in previous comment
247-269
: Benchmark structure is appropriate for its purposeThe
BenchmarkProcessProposal_PFB_Half_Second
function is well-structured for its specific purpose of finding the maximum number of transactions that can be processed within half a second. It doesn't useb.N
for iterations, which is appropriate in this case as it's measuring against a specific time constraint rather than running for a fixed number of iterations.
# Conflicts: # pkg/appconsts/global_consts.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still LGTM, very thorough work!
// megabyte the number of bytes in a megabyte | ||
const megabyte = 1048576 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handles the renaming feedback in #3904 (comment)
Works on this: #3898 To run the benchmarks, run the following in the root directory: ```shell go test -bench=<benchmark_name> app/benchmark_test.go ``` with `<benchmark_name>` is: - `BenchmarkCheckTx_MsgSend_1`: for a benchmark of a `MsgSend` transaction in `CheckTx` - `BenchmarkCheckTx_MsgSend_8MB`: for a benchmark of 8mb block worth of `MsgSend` transactions in `CheckTx` - `BenchmarkDeliverTx_MsgSend_1`: for a benchmark of a `MsgSend` transaction in `DeliverTx` - `BenchmarkDeliverTx_MsgSend_8MB`: for a benchmark of 8mb block worth of `MsgSend` transactions in `DeliverTx` - `BenchmarkPrepareProposal_MsgSend_1`: for a benchmark of a block containing a single`MsgSend` transaction in `PrepareProposal` - `BenchmarkPrepareProposal_MsgSend_8MB`: for a benchmark of an 8mb block containing `MsgSend` transactions in `PrepareProposal` - `BenchmarkProcessProposal_MsgSend_1`: for a benchmark of a block containing a single`MsgSend` transaction in `ProcessProposal` - `BenchmarkProcessProposal_MsgSend_8MB`: for a benchmark of an 8mb block containing`MsgSend` transactions in `ProcessProposal` - ... Note: keeping this as a draft because it doesn't necessarily need to be merged Benchmark run on: Macbook pro M3 max 48GB RAM --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Rootul P <[email protected]> (cherry picked from commit d4eb75e)
Handles the renaming feedback in #3904 (comment) (cherry picked from commit 0ad4d72)
Works on this: #3898
To run the benchmarks, run the following in the root directory:
with
<benchmark_name>
is:BenchmarkCheckTx_MsgSend_1
: for a benchmark of aMsgSend
transaction inCheckTx
BenchmarkCheckTx_MsgSend_8MB
: for a benchmark of 8mb block worth ofMsgSend
transactions inCheckTx
BenchmarkDeliverTx_MsgSend_1
: for a benchmark of aMsgSend
transaction inDeliverTx
BenchmarkDeliverTx_MsgSend_8MB
: for a benchmark of 8mb block worth ofMsgSend
transactions inDeliverTx
BenchmarkPrepareProposal_MsgSend_1
: for a benchmark of a block containing a singleMsgSend
transaction inPrepareProposal
BenchmarkPrepareProposal_MsgSend_8MB
: for a benchmark of an 8mb block containingMsgSend
transactions inPrepareProposal
BenchmarkProcessProposal_MsgSend_1
: for a benchmark of a block containing a singleMsgSend
transaction inProcessProposal
BenchmarkProcessProposal_MsgSend_8MB
: for a benchmark of an 8mb block containingMsgSend
transactions inProcessProposal
Note: keeping this as a draft because it doesn't necessarily need to be merged
Benchmark run on: Macbook pro M3 max 48GB RAM