Skip to content

Forward extra args to test-integration hatch script#2682

Merged
tatiana merged 1 commit into
mainfrom
forward-args-to-test-integration
May 15, 2026
Merged

Forward extra args to test-integration hatch script#2682
tatiana merged 1 commit into
mainfrom
forward-args-to-test-integration

Conversation

@tatiana
Copy link
Copy Markdown
Collaborator

@tatiana tatiana commented May 14, 2026

Summary

  • Add {args} to pyproject.toml's test-integration script and "$@" to scripts/test/integration.sh so pytest flags after hatch run tests.<env>:test-integration are forwarded to pytest.
  • Enables filtering to a single test or test file: hatch run tests.py3.10-3.1-1.11:test-integration -k test_name. Other flags (-x, -vv, --lf, --pdb, …) work too.
  • Document the workflow in the contributor guide.

Why

PYTEST_ADDOPTS="-k …" is silently ignored for test-integration because scripts/test/integration.sh passes its own -k 'not (…)' exclusion list for k8s and python-models DAGs, and pytest only honors the last -k it sees. Without {args}/"$@", the only way to run a single integration test was to inline-edit the script and git restore it afterwards.

CI is unaffected — CI runs test-integration without extra args, and the {args} placeholder expands to an empty string.

Test plan

  • hatch run tests.py3.10-3.1-1.11:test-integration -k <name> filters to the named test (verified locally: 1624 deselected, 1 selected).
  • hatch run tests.<env>:test-integration with no extra args behaves exactly as before.
  • hatch run docs:build succeeds for the docs change (the unrelated docs/profiles/index.rst toctree warning is local clutter from an untracked dir).

🤖 Generated with Claude Code

Allow `hatch run tests.<env>:test-integration -k <name>` (and other
pytest flags) to filter or otherwise customize the integration run
without editing scripts/test/integration.sh.

`hatch run` does not forward extra args to a script unless the script
definition contains the `{args}` placeholder. The script itself must
also accept the forwarded args via `"$@"`. Without both pieces in
place, attempts to filter with `-k` or `PYTEST_ADDOPTS` are silently
ignored -- the env-var approach in particular fails because
integration.sh already passes its own `-k` exclusion list and pytest
only honors the last `-k`.

This change adds `{args}` to the pyproject script definition, `"$@"`
to the pytest invocation in integration.sh, and documents the new
filtering workflow in the contributor guide. CI behavior is unchanged
because CI invokes the script without extra args.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tatiana tatiana requested a review from jbandoro as a code owner May 14, 2026 17:39
Copilot AI review requested due to automatic review settings May 14, 2026 17:39
@tatiana tatiana requested review from a team, corsettigyg and dwreeves as code owners May 14, 2026 17:39
@tatiana tatiana requested review from pankajastro and pankajkoti May 14, 2026 17:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enables forwarding additional pytest arguments through the Hatch test-integration script, making local integration-test iteration more flexible.

Changes:

  • Adds Hatch {args} forwarding for test-integration.
  • Passes shell script positional arguments through to pytest.
  • Documents the workflow for filtering integration tests and passing pytest flags.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
scripts/test/integration.sh Forwards received script arguments to pytest.
pyproject.toml Enables Hatch to pass extra arguments into the integration test script.
docs/policy/contributing.rst Documents passing pytest flags and test selectors to integration tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.07%. Comparing base (4867faf) to head (1bad50e).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2682      +/-   ##
==========================================
+ Coverage   98.03%   98.07%   +0.03%     
==========================================
  Files         105      105              
  Lines        7843     7843              
==========================================
+ Hits         7689     7692       +3     
+ Misses        154      151       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tatiana tatiana merged commit ad5e11c into main May 15, 2026
469 of 477 checks passed
@tatiana tatiana deleted the forward-args-to-test-integration branch May 15, 2026 17:08
pankajastro added a commit that referenced this pull request May 18, 2026
tatiana added a commit that referenced this pull request May 19, 2026
Closes #2698
Closes
[BOSS-401](https://linear.app/astronomer/issue/BOSS-401/cosmos-watcher-mode-does-not-retry-downstream-models-upon-successful)
(Customer issue).

> **Note:** This PR is stacked on top of #2682 (forward extra args to
`test-integration`). The first commit on this branch is a cherry-pick of
that change — drop it on rebase once #2682 merges.

## The problem

When a watcher producer's first `dbt build` attempt fails, dbt marks the
failing model and every transitive downstream node with
`node_status="skipped"`. The producer log parser pushed those
`"skipped"` statuses straight to XCom, so each downstream consumer
sensor raised `AirflowSkipException` and ended in SKIPPED. Airflow does
not retry SKIPPED tasks — so even after the failing upstream recovered
via its own consumer-retry fallback, the downstream model was never
re-run.

The DAG completed `success` (Airflow treats SKIPPED as non-failure) with
the downstream tables silently un-materialized. Luis Carbonell (Prodege)
reported the "false green" symptom on `astronomer-cosmos==1.14.1` and
`Airflow 3.2.1+astro.2`.

Reproduction setup:
- `model_a` → succeeds.
- `model_flaky` → fails on first attempt via a Postgres sequence
pre-hook, succeeds on subsequent attempts.
- `model_downstream` → depends on `model_flaky`.

Without the fix, on retry:
- ✅ `model_flaky` recovers via the consumer's
`_fallback_to_non_watcher_run` (existing retry path for FAILED
consumers).
- ❌ `model_downstream` stays SKIPPED — Airflow never retries it.
- DAG: `success`.

## The fix

`store_dbt_resource_status_from_log` in
`cosmos/operators/_watcher/base.py` now tracks unique_ids that dbt emits
`SkippingDetails` (non-ephemeral upstream) or `LogSkipBecauseError`
(ephemeral upstream) events for. Later `"skipped"` terminal events for
those unique_ids are rewritten to `"failed"` so the downstream consumer
fails on attempt 1, Airflow retries it, and the existing
`_fallback_to_non_watcher_run` path runs the model locally once its
upstream has recovered.

Why these two events are the right discriminator (traced through dbt
source on `dbt-core==1.11`):

- `SkippingDetails` and `LogSkipBecauseError` are fired only from
`BaseRunner.on_skip()` (`dbt/task/base.py:420-471`).
- `on_skip()` is reached only when `self.skip = True`.
- `self.skip = True` is set only inside `do_skip(cause=...)`
(`dbt/task/base.py:474-476`) — verified by `grep -rn '\.skip = True'`
returning exactly one site.
- `do_skip(cause=cause)` is called only from `runnable.py:358-360` when
`runner.node.unique_id in self._skipped_children` — i.e. exclusively on
upstream-node failure.

Empty, ephemeral, and selector-excluded skips don't go through
`do_skip()` and don't fire these events, so they keep their existing
`"skipped"` handling.

dbt also emits a later `NodeFinished` with `node_status="skipped"` for
the same node from the runner's `finally` block
(`dbt/task/runnable.py:266`), which would otherwise overwrite the
rewritten XCom back to `"skipped"`. The per-execution accumulator set
ensures both events are rewritten consistently regardless of arrival
order.

The fix lives in the shared log parser, so it applies uniformly to both
`InvocationMode.DBT_RUNNER` (protobuf events serialised via
`MessageToJson`) and `InvocationMode.SUBPROCESS` (`--log-format json`
stdout lines).

## Behaviour after this PR

| Attempt | producer | model_a | model_flaky | model_downstream |
|---|---|---|---|---|
| 1 | failed | success | failed | **failed** (was SKIPPED) |
| 2 (retry) | skipped (by design) | (done) | success via fallback |
**success via fallback** |

## How we prevent this from regressing

- **New integration test**
`test_dbt_task_group_watcher_retry_recovers_skipped_downstream` in
`tests/operators/test_watcher.py` runs the new DAG via `dag.test()` and
asserts:
- `model_downstream_run.state == "success"` (was `skipped` before the
fix).
- `model_downstream_run.try_number > 1` — guards against a future
refactor that happens to leave the task in `success` for the wrong
reason.
- **New dbt project** `dev/dags/dbt/watcher_upstream_failure_recovery/`
— kept separate from `watcher_downstream_not_skipped` so the two test
scenarios don't conflate. Uses a distinct sequence name
`_cosmos_recovery_fail_once_seq` so parallel CI runs can't share state.
- **Standalone repro DAG**
`dev/failed_dags/example_watcher_recovers_skipped_downstream.py` for
visual verification in Airflow standalone.
- **Refactor**: the fail-once-sequence cleanup is extracted into a
`reset_fail_once_sequence` pytest fixture; both the new test and the
existing `test_dbt_task_group_watcher_gateway_prevents_downstream_skip`
now use it.

## Test plan

- [x] New integration test passes on Airflow 3.2
(`tests.py3.10-3.2-1.11:test-integration`, 109.65s call duration).
- [x] Existing
`test_dbt_task_group_watcher_gateway_prevents_downstream_skip` still
passes after the fixture refactor.
- [x] Manual repro via `example_watcher_recovers_skipped_downstream` in
Airflow standalone: confirmed `model_downstream_run` lands in `success`
(was `skipped`).
- [ ] CI passes on the full matrix.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tatiana tatiana added this to the Cosmos 1.15.0 milestone May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants