feat(test-benchmark): updates and fixes for fixed opcode count #1985

base: forks/amsterdam
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## forks/amsterdam    #1985   +/- ##
================================================
  Coverage    86.33%    86.33%
================================================
  Files          538       538
  Lines        34557     34557
  Branches      3222      3222
================================================
  Hits         29835     29835
  Misses        4148      4148
  Partials       574       574
================================================
LouisTsai-Csie left a comment:
Some initial comments; will continue with more feedback:

- The required precompile benchmarks are already labeled with the repricing marker; we only label the ones with the necessary configurations. So we could remove the repricing marker from the precompile benchmarks you added.
- We also need to update the `benchmark_parser`, or it would override some of the settings. For example, `test_push*` is valid in the current logic, but it would be removed by the parser logic, as the parser always finds all entries for fixed-opcode-count compatible tests.
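If helpful, the override concern could be addressed with a merge rule that lets user-supplied patterns win. This is a hypothetical sketch, not the actual `benchmark_parser` API; all names are illustrative:

```python
from fnmatch import fnmatch


def merge_configs(existing: dict, discovered: dict) -> dict:
    """Hypothetical merge rule: user entries (including glob patterns such as
    "test_push*") are preserved; auto-discovered tests are added only when no
    existing pattern already covers them."""
    merged = dict(existing)
    for test_name, counts in discovered.items():
        # Skip discovered tests already covered by a user-supplied pattern.
        if any(fnmatch(test_name, pattern) for pattern in existing):
            continue
        merged[test_name] = counts
    return merged
```

Under such a rule, regenerating the config would keep a `test_push*` entry instead of expanding it into individual per-test entries.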
@LouisTsai-Csie I updated the parser and removed some precompiles: Modexp, BLS381, BN128, as discussed on Discord.
LouisTsai-Csie left a comment:
Huge thanks for the help!! I have some comments below and want to discuss several points:

- For the `README` file, could we integrate it into the existing docs under the benchmark section?

I have one more file to review, `benchmark_parser.py`; will update soon.
> The fixed opcode count mode runs benchmark tests with a predetermined number of opcode iterations rather than gas-based limits. This approach enables rapid iteration when analyzing gas costs for repricing proposals, as you can directly compare execution times across different opcode counts.
>
> **Important:** Tests must be marked with `@pytest.mark.repricing` to be compatible with fixed opcode count mode. This marker identifies tests that have been specifically designed for gas repricing analysis with proper code generators.
This is only partially correct. The fixed-opcode-count feature is not limited to tests marked with the repricing marker.
Q: Which benchmark formats support the fixed-opcode-count feature?
A: Any test that uses the benchmark test wrapper (benchmark_test) together with a code generator (code_generator) can support the fixed-opcode-count feature, since we can generate the required contract logic dynamically.
Q: Why do we use the repricing marker?
A: During gas repricing analysis, Maria is typically interested in a specific subset of configurations. For example, in test_calldatasize, we assume that calldata size is the primary factor affecting performance, so we don’t focus on the zero_data parameter. The repricing marker allows us to configure and narrow down the relevant benchmark cases. We apply this same approach to other tests to limit the scope of benchmark runs.
The current implementation does not restrict the fixed-opcode-count feature to tests labeled with the repricing marker.
`pytest_collection_modifyitems` (line 210 in 180dcec)
Summary: at the moment, we typically run the fixed-opcode-count feature together with the repricing marker, but this is a usage choice, not a technical limitation. From an implementation perspective, fixed-opcode-count benchmarks can run independently of the repricing marker.
Example: both of the following work:

```bash
uv run fill -v --clean --gas-benchmark-values 30 tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance -m benchmark
uv run fill -v --clean --gas-benchmark-values 30 tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance -m repricing
```

> ### Generating the Config File
>
> The benchmark parser tool can automatically generate and update the configuration file by scanning your test modules:
>
> ```bash
> # Generate or update .fixed_opcode_counts.json
> uv run benchmark_parser
>
> # Validate that config is in sync
> uv run benchmark_parser --check
> ```
>
> The parser preserves any custom counts you've configured while adding new tests with default values.
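The thread doesn't show the schema of `.fixed_opcode_counts.json`. Judging from the patterns discussed in this review (e.g. `"test_bitwise.*": [1]` and `"test_bitwise.*AND.*": [100]`), a plausible shape for the file might be:

```json
{
  "test_bitwise.*": [1],
  "test_bitwise.*AND.*": [100]
}
```

This layout is an assumption for illustration; run `uv run benchmark_parser` and inspect the generated file for the authoritative format.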
Suggestions:

- Move this section before the `Config File Format` section, so readers are aware of the tool before they start manually adding configuration entries.
- Provide more details on the override rules used by `benchmark_parser`, so users can understand whether their existing configurations will be preserved or overridden.
```python
elif self.uses_config_file:
    # Config file mode, no existing params: match against function name
    metafunc.parametrize(
        self.parameter_name,
        self.get_test_parameters(test_name),
        scope="function",
    )
else:
    # CLI mode: use function name matching (original behavior)
    metafunc.parametrize(
        self.parameter_name,
        self.get_test_parameters(test_name),
        scope="function",
    )
```
Suggested change (remove the duplicated branch):

```diff
 elif self.uses_config_file:
     # Config file mode, no existing params: match against function name
     metafunc.parametrize(
         self.parameter_name,
         self.get_test_parameters(test_name),
         scope="function",
     )
-else:
-    # CLI mode: use function name matching (original behavior)
-    metafunc.parametrize(
-        self.parameter_name,
-        self.get_test_parameters(test_name),
-        scope="function",
-    )
```

The `else` branch duplicates the logic of the `elif` branch.
```python
# Remove the opcode count part from the test ID for pattern matching
# Pattern: -opcount_X.XK or -opcount_XK at the end before ]
import re
```
Suggested change:

```diff
-import re
```

`re` is already imported at the top of the file.
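The stripping-then-matching flow that the quoted comment describes might look like the following sketch. This is illustrative only, not the project's implementation: the regex follows the quoted comment ("-opcount_X.XK or -opcount_XK at the end before ]"), and the longest-pattern-wins tie-breaking is an assumption.

```python
import re
from typing import Optional

# Strip the opcode-count suffix from a parametrized test ID, then match the
# remainder against config patterns (hypothetical helper names).
OPCOUNT_RE = re.compile(r"-opcount_\d+(?:\.\d+)?K(?=\])")


def strip_opcount(test_id: str) -> str:
    """Remove a trailing -opcount_X.XK / -opcount_XK segment before the ]."""
    return OPCOUNT_RE.sub("", test_id)


def lookup_counts(test_id: str, config: dict) -> Optional[list]:
    """Find the configured counts for a test ID; the most specific
    (longest) matching pattern wins — an assumed tie-breaking rule."""
    bare_id = strip_opcount(test_id)
    best, best_len = None, -1
    for pattern, counts in config.items():
        if re.fullmatch(pattern, bare_id) and len(pattern) > best_len:
            best, best_len = counts, len(pattern)
    return best


config = {"test_bitwise.*": [1], "test_bitwise.*AND.*": [100]}
```

With this sketch, `test_bitwise[fork_Prague-opcode_AND-opcount_1.5K]` resolves via the more specific `AND` pattern, while other `test_bitwise` variants fall back to the general one.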
```python
# --- 2. Determine Outer Iterations (M) ---
# The Loop Contract's call count (M) is set to ensure the final total execution is consistent.
#
# 2a. If N is 1000: Set M = fixed_opcode_count. (Total ops: fixed_opcode_count * 1000)
# 2b. If N is 500: Set M = fixed_opcode_count * 2. (Total ops: (fixed_opcode_count * 2) * 500 = fixed_opcode_count * 1000)
```
After reviewing the logic below, I think the current fallback iteration count is not 500 but 250; it would be nice to be consistent here. It might be worth checking the entire comment again.
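The iteration arithmetic under discussion can be sketched as follows, with illustrative names rather than the project's API, and treating 250 as the fallback inner size per the review note:

```python
def iteration_counts(fixed_opcode_count: float, inner_n: int = 1000) -> tuple[int, int]:
    """Return (inner, outer) so the total executed opcodes stay consistent.

    Illustrative sketch: counts are in thousands of opcodes. Fractional
    counts (< 1.0) run the exact number of opcodes with outer = 1; otherwise
    outer is scaled so that inner * outer == fixed_opcode_count * 1000.
    """
    if fixed_opcode_count < 1.0:
        # e.g. 0.25 -> inner = 250, outer = 1 (precise low-count benchmark)
        return int(fixed_opcode_count * 1000), 1
    return inner_n, int(fixed_opcode_count * 1000) // inner_n
```

With `inner_n = 1000` this reproduces case 2a (`M = fixed_opcode_count`); with 500, case 2b (`M = 2 * fixed_opcode_count`); a 250 fallback would give `M = 4 * fixed_opcode_count`.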
🗒️ Description

This PR fixes several bugs and adds features for `--fixed-opcode-count`.

Features

- Adds the `-m repricing` marker to all precompile tests, apart from a select few that are not required or are broken.
- Removes `test_mod` from repricing, as it is not compatible with foc (fixed-opcode-count).
- `OpcodeCountsConfig` pattern matching and validation.
- Floats as inputs (`0.25` = 250 opcodes, `0.5` = 500 opcodes).

Bug Fixes

- Fixes a `UsageError` when the fixed opcode count config file (`.fixed_opcode_counts.json`) is missing.
- Patterns like `test_bitwise.*AND.*` now correctly match against full test names instead of matching `test_bitwise.*` only.

Floats As Inputs

For fixed opcode count values < 1.0 (e.g., `0.25` = 250 opcodes), the inner iterations are set to the exact count with `outer = 1`, enabling precise low-count benchmarks.

Correct Pattern Matching With Config
Pattern matching now works with full test names. Given both `"test_bitwise.*": [1]` and `"test_bitwise.*AND.*": [100]`:

- Before: all `test_bitwise` variants got 1K, since patterns matched the function name only.
- After: `test_bitwise[fork_Prague-opcode_AND]` gets 100K while others get 1K, since patterns match the simulated full test ID, so each parametrization can get a different count.

🔗 Related Issues or PRs
Follow-ups and fixes from #1790.
✅ Checklist

- Ran the `tox` checks to avoid unnecessary CI fails (see also Code Standards and Enabling Pre-commit Checks): `uvx tox -e static`.
- PR title adheres to repo standards: formatted as `type(scope):`.

Cute Animal Picture
Venusaur - 0003
