Reduce integration test CI time from ~30 min to ~16 min by pankajkoti · Pull Request #2562 · astronomer/astronomer-cosmos

pankajkoti · 2026-04-16T06:24:49Z

Summary

Use pytest-split to distribute integration tests into 3 groups that run as separate GitHub Actions matrix jobs. Each group gets its own Postgres container, so there are no shared-state conflicts.

Changes:

Add split-group: [1, 2, 3] dimension to Run-Integration-Tests matrix
Pass PYTEST_SPLITS/PYTEST_SPLIT_GROUP env vars through to pytest
Update coverage artifact names to include split group
Add .test_durations file with real timings from CI (184 tests, balanced ~390s per group)
integration.sh conditionally adds --splits/--group flags (no-op when env vars are unset, preserving local dev behavior)
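
The conditional behaviour in the last item can be sketched as follows. The real logic lives in the shell script `scripts/test/integration.sh`; the Python below only mirrors the described behaviour for illustration, and the function name `pytest_split_args` is hypothetical. `--splits` and `--group` are the pytest-split options named above.

```python
import os


def pytest_split_args(env=None):
    """Return extra pytest arguments for pytest-split, or [] when not splitting.

    Hypothetical helper: mirrors the behaviour described for integration.sh,
    it is not the script itself.
    """
    env = os.environ if env is None else env
    splits = env.get("PYTEST_SPLITS")
    group = env.get("PYTEST_SPLIT_GROUP")
    # Only add the flags when both variables are set and non-empty, so a
    # local run without them behaves exactly as before.
    if splits and group:
        return ["--splits", splits, "--group", group]
    return []
```

With no variables set the helper returns an empty list, which is the "no-op" case that keeps local runs unchanged.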

Results (bottleneck job wall-clock):

| Before splitting | 2-way split | 3-way split (this PR) |
| --- | --- | --- |
| ~30 min (Airflow 3.1) | ~22 min | ~16 min |

How it works:

pytest-split reads .test_durations and uses the least_duration algorithm to bin-pack tests into balanced groups
Each matrix job gets its own GitHub Actions runner and Postgres service container — no shared state
New tests not in .test_durations get assigned to the lightest group automatically
The file can be refreshed with real timings via pytest --store-durations, either periodically or when the splits become unbalanced and some groups take noticeably longer than others
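
The balancing idea described above can be sketched as a greedy assignment: take tests from longest to shortest and always give the next one to the group with the smallest running total. This is a minimal illustration under that assumption, not pytest-split's actual code; the function name `least_duration_split` and the sample durations are made up.

```python
import heapq


def least_duration_split(durations, splits):
    """Greedily assign tests to `splits` groups, balancing total duration.

    `durations` maps a test id to its recorded duration in seconds.
    Illustrative sketch only; not taken from pytest-split.
    """
    # Min-heap of (accumulated_seconds, group_index).
    heap = [(0.0, group) for group in range(splits)]
    heapq.heapify(heap)
    groups = [[] for _ in range(splits)]
    # Longest tests first; ties broken by name so the result is deterministic.
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    for test_id, seconds in ordered:
        total, group = heapq.heappop(heap)  # currently lightest group
        groups[group].append(test_id)
        heapq.heappush(heap, (total + seconds, group))
    return groups


sample = {"a": 5, "b": 4, "c": 3, "d": 3, "e": 2, "f": 1}
groups = least_duration_split(sample, 3)
totals = [sum(sample[t] for t in g) for g in groups]
```

For the sample input every group ends up with a total of 6 seconds. Real suites rarely balance this cleanly, which is why the durations file needs refreshing when timings drift.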

closes: #2302
related: #2547

Use pytest-split to distribute integration tests into 2 groups that run as separate GitHub Actions matrix jobs. Each group gets its own Postgres container, so there are no shared-state conflicts.

Changes:

- Add split-group [1, 2] dimension to Run-Integration-Tests matrix
- Pass PYTEST_SPLITS/PYTEST_SPLIT_GROUP env vars through to pytest
- Update coverage artifact names to include split group
- Add .test_durations file with uniform weights for bootstrapping
- integration.sh conditionally adds --splits/--group flags (no-op when env vars are unset, preserving local dev behavior)

This roughly halves wall-clock time per Airflow version by running ~half the tests in each parallel job. The .test_durations file can be refreshed with real timings via --store-durations after the first CI run.

Related: #2302

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add speedup-integration-tests-run to the push trigger so the workflow changes (pytest-split matrix) are picked up from this branch instead of main. Must be removed before merging. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR reduces integration test wall-clock time in CI by splitting the integration pytest run into two parallel GitHub Actions matrix jobs using pytest-split, while keeping local developer runs unchanged when split env vars are not set.

Changes:

Add a split-group: [1, 2] dimension to the Run-Integration-Tests workflow matrix and pass PYTEST_SPLITS / PYTEST_SPLIT_GROUP into the test step.
Update integration coverage artifact names to include the split group so both halves can be uploaded and later combined.
Add a bootstrap .test_durations file and update scripts/test/integration.sh to conditionally add pytest-split CLI args when running in CI.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `scripts/test/integration.sh` | Conditionally appends `pytest-split` arguments based on CI env vars to distribute tests across parallel jobs. |
| `.test_durations` | Adds an initial durations map (uniform weights) for `pytest-split` to use in least-duration splitting. |
| `.github/workflows/test.yml` | Expands the integration test matrix to run two parallel split groups and updates coverage artifact naming accordingly. |


codecov · 2026-04-16T06:51:59Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.04%. Comparing base (f4fb470) to head (4db8cd2).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2562   +/-   ##
=======================================
  Coverage   98.04%   98.04%           
=======================================
  Files         103      103           
  Lines        7586     7586           
=======================================
  Hits         7438     7438           
  Misses        148      148


Update .test_durations with actual timings averaged from CI run 24495658555, replacing the uniform 1.0 bootstrap weights. The file now has 184 tests with real durations, achieving a balanced split across three groups (~390s each). Increase split-group from [1, 2] to [1, 2, 3] to further reduce wall-clock time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
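
The commit does not say how the averaging was done. One plausible approach, assuming each matrix job of the CI run produced its own durations mapping (test id to seconds), is to average per test across the jobs that ran it. The helper name `average_durations` and the sample data are hypothetical.

```python
from collections import defaultdict


def average_durations(runs):
    """Average per-test durations across several recorded runs.

    `runs` is a list of dicts mapping test id -> seconds. A test that is
    missing from some runs is averaged over the runs that contain it.
    Hypothetical helper; the PR does not show how its averaging was done.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for run in runs:
        for test_id, seconds in run.items():
            totals[test_id] += seconds
            counts[test_id] += 1
    # Sorted keys keep the output stable, which keeps diffs of the file small.
    return {test_id: totals[test_id] / counts[test_id] for test_id in sorted(totals)}


merged = average_durations([{"test_a": 2.0, "test_b": 4.0}, {"test_a": 4.0}])
```

Averaging smooths out one-off slow runs on a single runner, at the cost of reacting more slowly when a test genuinely becomes slower.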

pankajkoti · 2026-04-16T07:25:04Z

The push-triggered run on this branch took approximately 16 min: https://github.com/astronomer/astronomer-cosmos/actions/runs/24496792084?pr=2562

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.


pankajastro

Looks good. I was just wondering if it makes sense to document how to refresh the .test_durations file in the contributing docs.

tatiana

Amazing results, @pankajkoti! Really happy with how you were able to get from the original 1h to 16min.

Two minor pieces of feedback:

Would it be worth changing the PR title to represent the ultimate goal of this work (reduce integration tests from X to Y, or by Z%) - instead of the how?

WDYT of creating a follow-up ticket to automate the generation of .test_durations?

pankajkoti · 2026-04-16T12:46:14Z

Thanks for the reviews @pankajastro and @tatiana. I’ve created a follow-up ticket to automate updates to the .test_durations file (I think this shouldn’t become urgent unless the distribution becomes noticeably imbalanced as we add more integration tests).

The `pre_condition` task group in `cosmos_manifest_selectors_example` used `select=["+customers"]`, which left the DAG dependent on state leaked from other tests. This made the integration test `test_example_dag[cosmos_manifest_selectors_example]` flaky: it passed only when `pytest-split` happened to order another jaffle_shop DAG before it in the same split (which pre-populated the required tables in Postgres).

**Root cause**

Two gaps in the `+customers` selection:

1. Orphan seeds. In `altered_jaffle_shop`, `stg_orders` and `stg_payments` read their data via `source('postgres_db', 'raw_orders')` and `source('postgres_db', 'raw_payments')` respectively. The corresponding seeds (`raw_orders`, `raw_payments`) are orphans in the manifest: nothing references them, so `+` traversal skips them and they never get loaded. `raw_customers` is pulled in because each staging model has a `force_seed_dep` CTE that does `select * from {{ ref('raw_customers') }}`.
2. Missing `orders` model. The `relationships_orders_customer_id__customer_id__ref_customers_` test is attached to both `customers` and `orders` and queries `public.orders`. `+customers` pulls the test in (it's a child of `customers`) but doesn't build the `orders` model, so `customers.test` fails with `relation "public.orders" does not exist`. This also matters because the downstream `local_example` / `aws_s3_example` / `gcp_gs_example` / `azure_abfs_example` task groups all run the `critical_path` selector, which is the union of `customers` and `orders`, so `pre_condition` needs to leave both models present.

**Fix**

Change the `pre_condition` selector to `select=["+customers", "+orders", "raw_orders", "raw_payments"]`:

- `+customers` / `+orders` build both final models and their upstream `stg_*` models (and pull in `raw_customers` via `ref`)
- `raw_orders`, `raw_payments` explicitly seed the two orphan seeds so the `source()` reads in `stg_orders` / `stg_payments` resolve

related: #2562
related: #2592

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
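
The orphan-seed gap can be illustrated with a toy model of `+` (ancestor) selection. The dependency map below is a simplified assumption reconstructed from the description above, not the real `altered_jaffle_shop` manifest, and `resolve` is a hypothetical stand-in for dbt's selector logic. It ignores tests and the `critical_path` selector.

```python
# Assumed, simplified parent map: node -> nodes it references via ref().
# source() reads are deliberately absent, since they do not create an edge
# to a seed. That is what makes raw_orders and raw_payments orphans.
PARENTS = {
    "customers": ["stg_customers", "stg_orders", "stg_payments"],
    "orders": ["stg_orders", "stg_payments"],
    "stg_customers": ["raw_customers"],
    "stg_orders": ["raw_customers"],    # via the force_seed_dep CTE
    "stg_payments": ["raw_customers"],  # via the force_seed_dep CTE
    "raw_customers": [],
    "raw_orders": [],
    "raw_payments": [],
}


def with_ancestors(node):
    """Return `node` plus everything reachable by following parent edges."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def resolve(select):
    """Toy resolver: '+name' selects name and its ancestors, 'name' only itself."""
    selected = set()
    for selector in select:
        if selector.startswith("+"):
            selected |= with_ancestors(selector[1:])
        else:
            selected.add(selector)
    return selected


before = resolve(["+customers"])
after = resolve(["+customers", "+orders", "raw_orders", "raw_payments"])
```

In this toy graph `before` misses `orders`, `raw_orders`, and `raw_payments`, while `after` covers every node, matching the two gaps and the fix described above.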

pankajkoti temporarily deployed to internal April 16, 2026 06:24 — with GitHub Actions Inactive

pankajkoti requested a review from Copilot April 16, 2026 06:26

Copilot started reviewing on behalf of pankajkoti April 16, 2026 06:26

pankajkoti temporarily deployed to internal April 16, 2026 06:28 — with GitHub Actions Inactive

pankajkoti temporarily deployed to internal April 16, 2026 06:29 — with GitHub Actions Inactive

Copilot AI reviewed Apr 16, 2026


pankajkoti temporarily deployed to internal April 16, 2026 07:02 — with GitHub Actions Inactive

pankajkoti commented Apr 16, 2026


Comment thread .github/workflows/test.yml Outdated

Apply suggestion from @pankajkoti

4db8cd2

pankajkoti temporarily deployed to internal April 16, 2026 07:26 — with GitHub Actions Inactive

pankajkoti marked this pull request as ready for review April 16, 2026 07:28

pankajkoti requested review from a team, corsettigyg, dwreeves and jbandoro as code owners April 16, 2026 07:28

pankajkoti requested review from Copilot, pankajastro and tatiana April 16, 2026 07:28

Copilot started reviewing on behalf of pankajkoti April 16, 2026 07:29

Copilot AI reviewed Apr 16, 2026


Comment thread scripts/test/integration.sh

pankajastro approved these changes Apr 16, 2026


tatiana approved these changes Apr 16, 2026


pankajkoti changed the title ~~Split integration tests across parallel CI jobs via pytest-split~~ Reduce integration test CI time from ~30 min to ~16 min Apr 16, 2026

pankajkoti mentioned this pull request Apr 16, 2026

[CI] Automate .test_durations refresh for pytest-split #2566

Open

pankajkoti merged commit aa4c770 into main Apr 16, 2026
89 checks passed

pankajkoti deleted the speedup-integration-tests-run branch April 16, 2026 12:46

pankajkoti mentioned this pull request Apr 22, 2026

Fix flaky cosmos_manifest_selectors_example DAG in CI #2593

Merged

tatiana added this to the Cosmos 1.15.0 milestone May 28, 2026

pankajkoti merged 4 commits into main from speedup-integration-tests-run

4 participants
