refactor(tests),feat(test-forks): add `filter_combinations` pytest marker to avoid runtime skips by danceratopz · Pull Request #2543 · ethereum/execution-specs

danceratopz · 2026-03-23T14:32:23Z

Description

What?

This PR removes runtime skips that were required in tests that used the CovariantDecorator system provided by the forks plugin due to invalid generated combinations.

Why?

The aim is to reduce unnecessary noise in fill/pytest results.

Details

The approach is to add a filter_combinations pytest marker that deselects parametrized test cases at collection time based on cross-parameter predicates. This replaces runtime pytest.skip() calls that exist because the covariant decorator system can only filter individual parameter values via selector, not combinations across covariant and regular parametrize axes.

The marker accepts an optional reason kwarg shown in collection output so operators can see why combinations were dropped.
The filter runs in pytest_collection_modifyitems after all parametrize expansion is complete, so it sees both covariant and regular parameters uniformly. Deselected items are reported via pytest_deselected for accurate terminal counts.
Print a per-function deselection summary with reasons at default verbosity. Suppress with -q.
Fail with USAGE_ERROR if a predicate eliminates every parametrization of a test function, guarding against overly broad predicates silently dropping all test cases.

Refactored Test Modules

Test module	Function	Before	After	Skips removed
`tests/frontier/create/test_create_suicide_during_init.py`	`test_create_suicide_during_transaction_create`	162 passed, 62 skipped	162 passed, 12 skipped, 50 deselected	50
`tests/prague/eip7702_set_code_tx/test_set_code_txs.py`	`test_set_code_to_self_caller`	36 passed, 12 skipped	36 passed, 12 deselected	12
`tests/prague/eip7702_set_code_tx/test_set_code_txs.py`	`test_set_code_call_set_code`	36 passed, 12 skipped	36 passed, 12 deselected	12
Total		234 passed, 86 skipped	234 passed, 12 skipped, 74 deselected	74

The 12 remaining skips in the frontier create test are unrelated (pre-existing validity marker skips, not from pytest.skip()).

Verification

I validated fixtures before and after this refactor using uv run hasher compare. There was no change in the fixture set.

Before

After

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).

Cute Animal Picture

codecov · 2026-03-23T14:43:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.35%. Comparing base (174e903) to head (a30ca5f).

Additional details and impacted files

@@               Coverage Diff                @@
##           forks/amsterdam    #2543   +/-   ##
================================================
  Coverage            86.35%   86.35%           
================================================
  Files                  599      599           
  Lines                36904    36904           
  Branches              3771     3771           
================================================
  Hits                 31868    31868           
  Misses                4485     4485           
  Partials               551      551

Flag	Coverage Δ
unittests	`86.35% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

marioevz

Very nice new feature. Just a couple of comments that would be nice to have. Thanks!

marioevz · 2026-03-23T22:15:01Z

+        if not marker.args:
+            continue
+        predicate = marker.args[0]
+        if not predicate(**params):


Add an assertion that predicate is callable, otherwise raise the appropriate error with the test information.

Good idea, thanks! Done!

marioevz · 2026-03-23T22:15:53Z

    items_to_remove = []
+    deselected: List[pytest.Item] = []
+    # function name -> [reason, total, deselected_count]
+    filter_stats: Dict[str, List[Any]] = {}


Suggested change

filter_stats: Dict[str, List[Any]] = {}

filter_stats: Dict[str, Tuple[str, int, int]] = {}

It's a bit nicer than a list.

Add a collection-time marker that deselects parametrized test cases based on cross-parameter predicates, replacing runtime `pytest.skip()` calls for invalid parameter combinations. - Run predicates in `pytest_collection_modifyitems` after all parametrize expansion is complete, so both covariant and regular parameters are visible uniformly. - Report deselected items via `pytest_deselected` for accurate terminal counts. - Print a per-function summary with reasons at default verbosity; expand to individual node IDs at `-vv`. - Fail with `USAGE_ERROR` if a predicate eliminates every parametrization of a test function.

- Replace runtime `pytest.skip()` with a `filter_combinations` marker that deselects non-CREATE opcodes paired with `transaction_create=True` at collection time.

- Replace runtime `pytest.skip()` with `filter_combinations` markers that deselect call opcodes without a `value` kwarg when paired with nonzero value, in both `test_set_code_to_self_caller` and `test_set_code_call_set_code`.

- Verify cross-parameter filtering with covariant and regular parametrize axes. - Verify filtering with only regular parametrize axes. - Verify no-op predicate keeps all combinations. - Verify stacked markers apply AND logic. - Verify predicate that empties a function exits with USAGE_ERROR.

- Assert that the first argument to `filter_combinations` is callable, exiting with `USAGE_ERROR` and the test node ID if not.

- Change filter_stats from Dict[str, List[Any]] to Dict[str, Tuple[str, int, int]] for stronger typing.

marioevz

LGTM, thanks!

…cell in `test_varying_calldata_costs` `STORAGE_CLEAR` refund is zero on revert, so `gas_used_pre_refund == gas_used_post_refund` and the `(post, pre)` interval that `DATA_FLOOR_BETWEEN_TX_GAS_BEFORE_AND_AFTER` requires is empty. Replace the runtime `pytest.skip()` with the `filter_combinations` marker introduced in ethereum#2543, which runs in `pytest_collection_modifyitems` and can express predicates across covariant-marker-injected and inline parametrize axes uniformly. Keeps `with_all_refund_types()` so new refund types added to the fork in future automatically participate, preserves the original three-decorator parametrize stack, and produces byte-identical fixture IDs and content vs the pre-refactor branch (verified via `diff` on `varying_calldata_costs.json` keys and bodies). The two infeasible items per fork are now deselected at collection time rather than skipped at runtime.

…cell in `test_varying_calldata_costs` (#2852) `STORAGE_CLEAR` refund is zero on revert, so `gas_used_pre_refund == gas_used_post_refund` and the `(post, pre)` interval that `DATA_FLOOR_BETWEEN_TX_GAS_BEFORE_AND_AFTER` requires is empty. Replace the runtime `pytest.skip()` with the `filter_combinations` marker introduced in #2543, which runs in `pytest_collection_modifyitems` and can express predicates across covariant-marker-injected and inline parametrize axes uniformly. Keeps `with_all_refund_types()` so new refund types added to the fork in future automatically participate, preserves the original three-decorator parametrize stack, and produces byte-identical fixture IDs and content vs the pre-refactor branch (verified via `diff` on `varying_calldata_costs.json` keys and bodies). The two infeasible items per fork are now deselected at collection time rather than skipped at runtime.

danceratopz added C-refactor Category: refactor A-test-forks Area: execution_testing.forks A-tests Area: Consensus tests. labels Mar 23, 2026

marioevz reviewed Mar 23, 2026

View reviewed changes

danceratopz added 6 commits March 24, 2026 12:18

refactor(tests): use filter_combinations in frontier create tests

e5f5ddc

- Replace runtime `pytest.skip()` with a `filter_combinations` marker that deselects non-CREATE opcodes paired with `transaction_create=True` at collection time.

fix(test-forks): validate filter_combinations predicate is callable

7fa2659

- Assert that the first argument to `filter_combinations` is callable, exiting with `USAGE_ERROR` and the test node ID if not.

refactor(test-forks): use Tuple for filter_stats type hint

a30ca5f

- Change filter_stats from Dict[str, List[Any]] to Dict[str, Tuple[str, int, int]] for stronger typing.

danceratopz force-pushed the delete-skip-in-covariant-decorator branch from 10fe818 to a30ca5f Compare March 24, 2026 11:45

marioevz approved these changes Mar 24, 2026

View reviewed changes

marioevz merged commit 0faeb8b into ethereum:forks/amsterdam Mar 24, 2026
21 checks passed

danceratopz mentioned this pull request Mar 25, 2026

Investigate and potentially eliminate all remaining consensus test skips #2554

Open

5 tasks

danceratopz mentioned this pull request May 13, 2026

refactor(tests): replace runtime skip in EIP-7778 calldata test #2852

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(tests),feat(test-forks): add `filter_combinations` pytest marker to avoid runtime skips#2543

refactor(tests),feat(test-forks): add `filter_combinations` pytest marker to avoid runtime skips#2543
marioevz merged 6 commits into
ethereum:forks/amsterdamfrom
danceratopz:delete-skip-in-covariant-decorator

danceratopz commented Mar 23, 2026

Uh oh!

codecov Bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

marioevz left a comment

Uh oh!

marioevz Mar 23, 2026

Uh oh!

danceratopz Mar 24, 2026

Uh oh!

marioevz Mar 23, 2026

Uh oh!

danceratopz Mar 24, 2026

Uh oh!

Uh oh!

marioevz left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	filter_stats: Dict[str, List[Any]] = {}
	filter_stats: Dict[str, Tuple[str, int, int]] = {}

Conversation

danceratopz commented Mar 23, 2026

Description

What?

Why?

Details

Refactored Test Modules

Verification

Before

After

✅ Checklist

Cute Animal Picture

Uh oh!

codecov Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

marioevz left a comment

Choose a reason for hiding this comment

Uh oh!

marioevz Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

danceratopz Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

marioevz Mar 23, 2026

Choose a reason for hiding this comment

Uh oh!

danceratopz Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

marioevz left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov Bot commented Mar 23, 2026 •

edited

Loading