Contributor

@spencer-tb spencer-tb commented Jan 7, 2026

🗒️ Description

This PR fixes several bugs and adds features for --fixed-opcode-count.

Features

  • Use longest match to resolve pattern specificity when several regex patterns match a test; more detail below.
  • Add the repricing marker (-m repricing) to all precompile tests, apart from a select few that are not required or are broken.
  • Remove test_mod from repricing, as it is not compatible with foc (fixed-opcode-count).
  • Add unit tests for OpcodeCountsConfig pattern matching and validation.
  • Support float values for sub-1K opcode counts (e.g., 0.25 = 250 opcodes, 0.5 = 500 opcodes).

Bug Fixes

  • Raise UsageError when the fixed opcode count config file (.fixed_opcode_counts.json) is missing.
  • Validate flag input to reject test paths accidentally consumed by argparse.
  • Major: Fix config file pattern matching for parametrized tests. Patterns like test_bitwise.*AND.* now correctly match against the full test name, where previously only test_bitwise.* took effect.
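
The flag validation mentioned in the list above can be sketched roughly as follows. This is only an illustration, not the code from this PR: the function name `parse_opcode_counts`, the comma-separated input format, and the use of `ValueError` (standing in for whatever error the plugin actually raises) are assumptions.

```python
import math


def parse_opcode_counts(raw: str) -> list[float]:
    """Parse a comma-separated list of opcode counts (in thousands).

    Hypothetical helper: rejects anything that is not a positive, finite
    number, such as a test path that argparse consumed as the flag value.
    """
    counts = []
    for part in raw.split(","):
        text = part.strip()
        try:
            value = float(text)
        except ValueError:
            raise ValueError(
                f"invalid opcode count {text!r}: expected a number; "
                "was a test path passed right after the flag?"
            ) from None
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"opcode count must be positive and finite: {text!r}")
        counts.append(value)
    return counts
```

With this sketch, `parse_opcode_counts("1,5,0.25")` yields three floats, while a value such as `tests/benchmark/compute` is rejected with an error instead of being silently accepted.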

Floats As Inputs

For fixed opcode count values < 1.0 (e.g., 0.25 = 250 opcodes), the inner iterations are set to the exact count with outer = 1, enabling precise low-count benchmarks.
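
A minimal sketch of how a count given in thousands might be split into inner and outer iterations. The function name `split_iterations` and the default inner loop size of 1000 for counts of 1K and above are assumptions made for illustration; only the sub-1K rule (exact inner count, outer = 1) comes from the description above.

```python
def split_iterations(count_k: float, inner_default: int = 1000) -> tuple[int, int]:
    """Return (inner, outer) iterations for a count expressed in thousands.

    Hypothetical helper, not the PR's implementation.
    """
    if count_k <= 0:
        raise ValueError("opcode count must be positive")
    total = round(count_k * 1000)  # total opcodes to execute
    if count_k < 1.0:
        # Sub-1K: run the exact count in the inner loop, once.
        return total, 1
    if total % inner_default != 0:
        # This sketch does not try to guess a fallback inner size.
        raise ValueError(f"{total} is not a multiple of {inner_default}")
    return inner_default, total // inner_default
```

Under these assumptions, 0.25 maps to 250 inner iterations with a single outer iteration, and 100 maps to 1000 inner iterations repeated 100 times.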

Correct Pattern Matching With Config

Patterns are now matched against full test names. Given both "test_bitwise.*": [1] and "test_bitwise.*AND.*": [100]:

  • Before: All test_bitwise variants got 1K, because patterns were matched against the function name only.
  • After: test_bitwise[fork_Prague-opcode_AND] gets 100K and the others get 1K, because patterns are matched against a simulated full test ID, so each parameter can get a different count.

🔗 Related Issues or PRs

Follow ups and fixes from #1790.

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md.
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

Cute Animal Picture

Venusaur - 0003

@spencer-tb spencer-tb added C-bug Category: this is a bug, deviation, or other problem C-feat Category: an improvement or new feature P-high A-test-benchmark Area: execution_testing.benchmark and tests/benchmark labels Jan 7, 2026
@spencer-tb spencer-tb force-pushed the feat/fixed-opcode-count-updates branch 2 times, most recently from 28a0c0c to 8305eab Compare January 7, 2026 15:19

codecov bot commented Jan 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.33%. Comparing base (858336e) to head (b62cf1a).
⚠️ Report is 6 commits behind head on forks/amsterdam.

Additional details and impacted files
@@               Coverage Diff                @@
##           forks/amsterdam    #1985   +/-   ##
================================================
  Coverage            86.33%   86.33%           
================================================
  Files                  538      538           
  Lines                34557    34557           
  Branches              3222     3222           
================================================
  Hits                 29835    29835           
  Misses                4148     4148           
  Partials               574      574           
Flag Coverage Δ
unittests 86.33% <ø> (ø)


@spencer-tb spencer-tb force-pushed the feat/fixed-opcode-count-updates branch from 8305eab to 08a6443 Compare January 7, 2026 21:31
Collaborator

@LouisTsai-Csie LouisTsai-Csie left a comment

Some initial comments, will continue on more feedback:

  • The required precompile benchmarks are already labeled with the repricing marker; we only label those with the necessary configurations. So we could remove the repricing marker from the precompile benchmarks you added.

We also need to update benchmark_parser, or it would override some of the settings. For example, test_push* is valid in the current logic, but the parser logic would remove it, because the parser always looks up every entry for the fixed-opcode-count compatible tests.

@spencer-tb spencer-tb force-pushed the feat/fixed-opcode-count-updates branch 2 times, most recently from f11e602 to 1104a8b Compare January 8, 2026 12:37
@spencer-tb
Contributor Author

@LouisTsai-Csie I updated the parser and removed some precompiles (Modexp, BLS381, BN128), as followed up on Discord.

@spencer-tb spencer-tb force-pushed the feat/fixed-opcode-count-updates branch from 9a59736 to e6313d6 Compare January 13, 2026 14:18
Collaborator

@LouisTsai-Csie LouisTsai-Csie left a comment

Huge thanks for the help!! I have some comments below and want to discuss several points:

  • For the README file, could we integrate it into existing docs under the benchmark section?

I have one more file to review, benchmark_parser.py; I will update soon.


The fixed opcode count mode runs benchmark tests with a predetermined number of opcode iterations rather than gas-based limits. This approach enables rapid iteration when analyzing gas costs for repricing proposals, as you can directly compare execution times across different opcode counts.

**Important:** Tests must be marked with `@pytest.mark.repricing` to be compatible with fixed opcode count mode. This marker identifies tests that have been specifically designed for gas repricing analysis with proper code generators.
Collaborator

This is only partially correct. The fixed-opcode-count feature is not limited to tests marked with the repricing marker.

Q: Which benchmark formats support the fixed-opcode-count feature?
A: Any test that uses the benchmark test wrapper (benchmark_test) together with a code generator (code_generator) can support the fixed-opcode-count feature, since we can generate the required contract logic dynamically.

Q: Why do we use the repricing marker?
A: During gas repricing analysis, Maria is typically interested in a specific subset of configurations. For example, in test_calldatasize, we assume that calldata size is the primary factor affecting performance, so we don’t focus on the zero_data parameter. The repricing marker allows us to configure and narrow down the relevant benchmark cases. We apply this same approach to other tests to limit the scope of benchmark runs.

The current implementation does not restrict the fixed-opcode-count feature to tests labeled with the repricing marker.

Summary: atm, we typically run the fixed-opcode-count feature together with the repricing marker, but this is a usage choice, not a technical limitation. From an implementation perspective, fixed-opcode-count benchmarks can run independently of the repricing marker.

Collaborator

Example: both commands below work.

uv run fill -v --clean --gas-benchmark-values 30 tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance -m benchmark

uv run fill -v --clean --gas-benchmark-values 30 tests/benchmark/compute/instruction/test_account_query.py::test_selfbalance -m repricing

Comment on lines +107 to +119
### Generating the Config File

The benchmark parser tool can automatically generate and update the configuration file by scanning your test modules:

```bash
# Generate or update .fixed_opcode_counts.json
uv run benchmark_parser

# Validate that config is in sync
uv run benchmark_parser --check
```

The parser preserves any custom counts you've configured while adding new tests with default values.
Collaborator

Suggestions:

  • Move this section before the Config File Format section, so readers are aware of the tool before they start manually adding configuration entries.
  • Provide more details on the override rules used by benchmark_parser, so users can understand whether their existing configurations will be preserved or overridden.
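
To make the question about override rules concrete, here is one possible rule set written out as code. It is purely illustrative and does not describe what benchmark_parser currently does: the function name `merge_counts`, the default of `[1]`, and the rule "never drop or overwrite an existing entry" are all assumptions.

```python
def merge_counts(
    existing: dict[str, list[float]],
    discovered: list[str],
    default: tuple[float, ...] = (1,),
) -> dict[str, list[float]]:
    """Merge discovered test names into an existing config.

    Hypothetical rules: existing entries (including hand-written regex
    patterns) are kept untouched; newly discovered tests get the default.
    """
    merged = dict(existing)
    for name in discovered:
        merged.setdefault(name, list(default))
    return dict(sorted(merged.items()))


merged = merge_counts(
    {"test_push.*": [5], "test_add": [10]},
    ["test_add", "test_mul"],
)
```

Under these rules the hand-written `test_push.*` entry and the custom count for `test_add` survive, and `test_mul` is added with the default. Documenting whichever rules the parser really applies would answer the same question for users.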

Comment on lines +290 to +303
elif self.uses_config_file:
    # Config file mode, no existing params: match against function name
    metafunc.parametrize(
        self.parameter_name,
        self.get_test_parameters(test_name),
        scope="function",
    )
else:
    # CLI mode: use function name matching (original behavior)
    metafunc.parametrize(
        self.parameter_name,
        self.get_test_parameters(test_name),
        scope="function",
    )
Collaborator

Suggested change
- elif self.uses_config_file:
-     # Config file mode, no existing params: match against function name
-     metafunc.parametrize(
-         self.parameter_name,
-         self.get_test_parameters(test_name),
-         scope="function",
-     )
- else:
-     # CLI mode: use function name matching (original behavior)
-     metafunc.parametrize(
-         self.parameter_name,
-         self.get_test_parameters(test_name),
-         scope="function",
-     )
+ elif self.uses_config_file:
+     # Config file mode, no existing params: match against function name
+     metafunc.parametrize(
+         self.parameter_name,
+         self.get_test_parameters(test_name),
+         scope="function",
+     )

The else branch duplicates the logic of the elif branch.


# Remove the opcode count part from the test ID for pattern matching
# Pattern: -opcount_X.XK or -opcount_XK at the end before ]
import re
Collaborator

Suggested change
import re

re is already imported at the top of the file.
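
For reference, the ID stripping described in the quoted comment (removing `-opcount_X.XK` or `-opcount_XK` just before the closing `]`) could look roughly like this. The helper name `strip_opcount` and the exact regex are assumptions, not the code under review.

```python
import re

# Matches "-opcount_<number>K" immediately before the closing bracket.
_OPCOUNT_SUFFIX = re.compile(r"-opcount_\d+(?:\.\d+)?K(?=\]$)")


def strip_opcount(test_id: str) -> str:
    """Remove the trailing opcode-count parameter from a test ID."""
    return _OPCOUNT_SUFFIX.sub("", test_id)


stripped = strip_opcount("test_bitwise[fork_Prague-opcode_AND-opcount_0.5K]")
```

`stripped` is `test_bitwise[fork_Prague-opcode_AND]`; IDs without an opcount suffix pass through unchanged.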

Comment on lines 194 to 198
# --- 2. Determine Outer Iterations (M) ---
# The Loop Contract's call count (M) is set to ensure the final total execution is consistent.
#
# 2a. If N is 1000: Set M = fixed_opcode_count. (Total ops: fixed_opcode_count * 1000)
# 2b. If N is 500: Set M = fixed_opcode_count * 2. (Total ops: (fixed_opcode_count * 2) * 500 = fixed_opcode_count * 1000)
Collaborator

After reviewing the logic below, I think the current fallback iteration count is 250, not 500; it would be nice to be consistent here. It might be worth checking the entire comment again.
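
The invariant the quoted comment describes is N × M = fixed_opcode_count × 1000, so the multiplier depends on which fallback N is actually used. A small sketch, with the hypothetical helper name `outer_iterations`, makes the difference between a 500 and a 250 fallback explicit:

```python
def outer_iterations(fixed_opcode_count: int, inner: int) -> int:
    """Return M such that inner * M == fixed_opcode_count * 1000."""
    total = fixed_opcode_count * 1000
    if total % inner != 0:
        raise ValueError(f"{total} is not a multiple of inner={inner}")
    return total // inner


m_1000 = outer_iterations(3, 1000)  # M = fixed_opcode_count
m_500 = outer_iterations(3, 500)    # M = fixed_opcode_count * 2
m_250 = outer_iterations(3, 250)    # M = fixed_opcode_count * 4
```

If the fallback is 250, case 2b in the comment would read M = fixed_opcode_count * 4 rather than * 2.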
