Forward extra args to test-integration hatch script#2682
Merged
Conversation
Allow `hatch run tests.<env>:test-integration -k <name>` (and other
pytest flags) to filter or otherwise customize the integration run
without editing scripts/test/integration.sh.
`hatch run` does not forward extra args to a script unless the script
definition contains the `{args}` placeholder. The script itself must
also accept the forwarded args via `"$@"`. Without both pieces in
place, attempts to filter with `-k` or `PYTEST_ADDOPTS` are silently
ignored -- the env-var approach in particular fails because
integration.sh already passes its own `-k` exclusion list and pytest
only honors the last `-k`.
This change adds `{args}` to the pyproject script definition, `"$@"`
to the pytest invocation in integration.sh, and documents the new
filtering workflow in the contributor guide. CI behavior is unchanged
because CI invokes the script without extra args.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR enables forwarding additional pytest arguments through the Hatch test-integration script, making local integration-test iteration more flexible.
Changes:
- Adds Hatch
{args}forwarding fortest-integration. - Passes shell script positional arguments through to
pytest. - Documents the workflow for filtering integration tests and passing pytest flags.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
scripts/test/integration.sh |
Forwards received script arguments to pytest. |
pyproject.toml |
Enables Hatch to pass extra arguments into the integration test script. |
docs/policy/contributing.rst |
Documents passing pytest flags and test selectors to integration tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
4 tasks
pankajastro
approved these changes
May 15, 2026
pankajkoti
approved these changes
May 15, 2026
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2682 +/- ##
==========================================
+ Coverage 98.03% 98.07% +0.03%
==========================================
Files 105 105
Lines 7843 7843
==========================================
+ Hits 7689 7692 +3
+ Misses 154 151 -3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
pankajastro
added a commit
that referenced
this pull request
May 18, 2026
This reverts commit ad5e11c.
tatiana
added a commit
that referenced
this pull request
May 19, 2026
Closes #2698 Closes [BOSS-401](https://linear.app/astronomer/issue/BOSS-401/cosmos-watcher-mode-does-not-retry-downstream-models-upon-successful) (Customer issue). > **Note:** This PR is stacked on top of #2682 (forward extra args to `test-integration`). The first commit on this branch is a cherry-pick of that change — drop it on rebase once #2682 merges. ## The problem When a watcher producer's first `dbt build` attempt fails, dbt marks the failing model and every transitive downstream node with `node_status="skipped"`. The producer log parser pushed those `"skipped"` statuses straight to XCom, so each downstream consumer sensor raised `AirflowSkipException` and ended in SKIPPED. Airflow does not retry SKIPPED tasks — so even after the failing upstream recovered via its own consumer-retry fallback, the downstream model was never re-run. The DAG completed `success` (Airflow treats SKIPPED as non-failure) with the downstream tables silently un-materialized. Luis Carbonell (Prodege) reported the "false green" symptom on `astronomer-cosmos==1.14.1` and `Airflow 3.2.1+astro.2`. Reproduction setup: - `model_a` → succeeds. - `model_flaky` → fails on first attempt via a Postgres sequence pre-hook, succeeds on subsequent attempts. - `model_downstream` → depends on `model_flaky`. Without the fix, on retry: - ✅ `model_flaky` recovers via the consumer's `_fallback_to_non_watcher_run` (existing retry path for FAILED consumers). - ❌ `model_downstream` stays SKIPPED — Airflow never retries it. - DAG: `success`. ## The fix `store_dbt_resource_status_from_log` in `cosmos/operators/_watcher/base.py` now tracks unique_ids that dbt emits `SkippingDetails` (non-ephemeral upstream) or `LogSkipBecauseError` (ephemeral upstream) events for. Later `"skipped"` terminal events for those unique_ids are rewritten to `"failed"` so the downstream consumer fails on attempt 1, Airflow retries it, and the existing `_fallback_to_non_watcher_run` path runs the model locally once its upstream has recovered. Why these two events are the right discriminator (traced through dbt source on `dbt-core==1.11`): - `SkippingDetails` and `LogSkipBecauseError` are fired only from `BaseRunner.on_skip()` (`dbt/task/base.py:420-471`). - `on_skip()` is reached only when `self.skip = True`. - `self.skip = True` is set only inside `do_skip(cause=...)` (`dbt/task/base.py:474-476`) — verified by `grep -rn '\.skip = True'` returning exactly one site. - `do_skip(cause=cause)` is called only from `runnable.py:358-360` when `runner.node.unique_id in self._skipped_children` — i.e. exclusively on upstream-node failure. Empty, ephemeral, and selector-excluded skips don't go through `do_skip()` and don't fire these events, so they keep their existing `"skipped"` handling. dbt also emits a later `NodeFinished` with `node_status="skipped"` for the same node from the runner's `finally` block (`dbt/task/runnable.py:266`), which would otherwise overwrite the rewritten XCom back to `"skipped"`. The per-execution accumulator set ensures both events are rewritten consistently regardless of arrival order. The fix lives in the shared log parser, so it applies uniformly to both `InvocationMode.DBT_RUNNER` (protobuf events serialised via `MessageToJson`) and `InvocationMode.SUBPROCESS` (`--log-format json` stdout lines). ## Behaviour after this PR | Attempt | producer | model_a | model_flaky | model_downstream | |---|---|---|---|---| | 1 | failed | success | failed | **failed** (was SKIPPED) | | 2 (retry) | skipped (by design) | (done) | success via fallback | **success via fallback** | ## How we prevent this from regressing - **New integration test** `test_dbt_task_group_watcher_retry_recovers_skipped_downstream` in `tests/operators/test_watcher.py` runs the new DAG via `dag.test()` and asserts: - `model_downstream_run.state == "success"` (was `skipped` before the fix). - `model_downstream_run.try_number > 1` — guards against a future refactor that happens to leave the task in `success` for the wrong reason. - **New dbt project** `dev/dags/dbt/watcher_upstream_failure_recovery/` — kept separate from `watcher_downstream_not_skipped` so the two test scenarios don't conflate. Uses a distinct sequence name `_cosmos_recovery_fail_once_seq` so parallel CI runs can't share state. - **Standalone repro DAG** `dev/failed_dags/example_watcher_recovers_skipped_downstream.py` for visual verification in Airflow standalone. - **Refactor**: the fail-once-sequence cleanup is extracted into a `reset_fail_once_sequence` pytest fixture; both the new test and the existing `test_dbt_task_group_watcher_gateway_prevents_downstream_skip` now use it. ## Test plan - [x] New integration test passes on Airflow 3.2 (`tests.py3.10-3.2-1.11:test-integration`, 109.65s call duration). - [x] Existing `test_dbt_task_group_watcher_gateway_prevents_downstream_skip` still passes after the fixture refactor. - [x] Manual repro via `example_watcher_recovers_skipped_downstream` in Airflow standalone: confirmed `model_downstream_run` lands in `success` (was `skipped`). - [ ] CI passes on the full matrix. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
{args}topyproject.toml'stest-integrationscript and"$@"toscripts/test/integration.shso pytest flags afterhatch run tests.<env>:test-integrationare forwarded to pytest.hatch run tests.py3.10-3.1-1.11:test-integration -k test_name. Other flags (-x,-vv,--lf,--pdb, …) work too.Why
PYTEST_ADDOPTS="-k …"is silently ignored fortest-integrationbecausescripts/test/integration.shpasses its own-k 'not (…)'exclusion list for k8s and python-models DAGs, and pytest only honors the last-kit sees. Without{args}/"$@", the only way to run a single integration test was to inline-edit the script andgit restoreit afterwards.CI is unaffected — CI runs
test-integrationwithout extra args, and the{args}placeholder expands to an empty string.Test plan
hatch run tests.py3.10-3.1-1.11:test-integration -k <name>filters to the named test (verified locally: 1624 deselected, 1 selected).hatch run tests.<env>:test-integrationwith no extra args behaves exactly as before.hatch run docs:buildsucceeds for the docs change (the unrelateddocs/profiles/index.rsttoctree warning is local clutter from an untracked dir).🤖 Generated with Claude Code