Speed up integration tests #732

Merged: tatiana merged 2 commits into astronomer:main from jbandoro:speed-up-integration-tests on Dec 4, 2023

Collaborator

@jbandoro jbandoro commented Dec 1, 2023

Description

I was able to speed up the integration tests by caching the dag bag result.

Previously, each parametrized DAG run test reparsed all of the DAGs, and parsing all of the Cosmos example DAGs takes a non-trivial amount of time.

On my local machine the total time went from 1616s to 540s, which will save a good amount of GH minutes 😃.
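
The caching idea can be sketched as follows. This is only an illustration, not the code from this PR: `get_dag_bag`, `get_dag`, the DAG ids, and the `PARSE_CALLS` counter are hypothetical names, and a plain dictionary stands in for Airflow's DAG bag so the snippet runs without Airflow installed.

```python
from functools import lru_cache

PARSE_CALLS = 0  # counts how often the expensive parse actually runs


@lru_cache(maxsize=None)
def get_dag_bag():
    """Hypothetical stand-in for parsing every example DAG once per process."""
    global PARSE_CALLS
    PARSE_CALLS += 1
    # Placeholder for the slow step (in a real suite, building the DAG bag).
    return {"example_dag_a": object(), "example_dag_b": object()}


def get_dag(dag_id):
    """Look up one DAG; only the first call pays the parsing cost."""
    return get_dag_bag()[dag_id]


# Each parametrized test would call get_dag(...) with its own DAG id.
for dag_id in ("example_dag_a", "example_dag_b", "example_dag_a"):
    get_dag(dag_id)
```

Because the cached function takes no arguments, every test in the same process shares one parsed result, so the parse cost is paid once rather than once per parametrized case.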

Related Issue(s)

None

Breaking Change?

None

Checklist

  • I have made corresponding changes to the documentation (if required)
  • I have added tests that prove my fix is effective or that my feature works

@jbandoro jbandoro requested a review from a team as a code owner December 1, 2023 02:32
@jbandoro jbandoro requested a review from a team December 1, 2023 02:32

netlify Bot commented Dec 1, 2023

👷 Deploy Preview for amazing-pothos-a3bca0 processing.

Name Link
🔨 Latest commit d0851d7
🔍 Latest deploy log https://app.netlify.com/sites/amazing-pothos-a3bca0/deploys/6569f1261fad3b0008c72258

@dosubot dosubot Bot added the size:XS (This PR changes 0-9 lines, ignoring generated files.) and area:performance (Related to performance, like memory usage, CPU usage, speed, etc) labels Dec 1, 2023

codecov Bot commented Dec 1, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (aea4ee7) 92.99% compared to head (962d2a2) 92.99%.

❗ Current head 962d2a2 differs from pull request most recent head d0851d7. Consider uploading reports for the commit d0851d7 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #732   +/-   ##
=======================================
  Coverage   92.99%   92.99%           
=======================================
  Files          55       55           
  Lines        2313     2313           
=======================================
  Hits         2151     2151           
  Misses        162      162           


Comment thread tests/test_example_dags.py Outdated
Collaborator

@tatiana tatiana left a comment

This looks amazing, @jbandoro , thanks a lot for the improvement!
Can't wait to merge this after you reply to @joppevos' feedback :)

@tatiana tatiana merged commit a2c58f5 into astronomer:main Dec 4, 2023
tatiana added a commit that referenced this pull request Dec 4, 2023
Fix exceptions raised on some versions of Python (e.g. 3.8) after
prematurely merging #732, which led to the main branch becoming red
```
../../../.local/share/hatch/env/virtual/astronomer-cosmos/Za_bFbg4/tests.py3.8-2.5/lib/python3.8/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
    exec(co, module.__dict__)
tests/test_example_dags_no_connections.py:4: in <module>
    from functools import cache
E   ImportError: cannot import name 'cache' from 'functools'
```
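
`functools.cache` was added in Python 3.9, which is why the import fails in the Python 3.8 environment shown in the traceback. One common way to stay compatible is sketched below; this is an assumption about a possible fix, not necessarily what the follow-up commit did, and `load_example_dags` and `CALLS` are hypothetical names.

```python
try:
    from functools import cache  # available on Python 3.9+
except ImportError:
    # Older interpreters: lru_cache with no size limit behaves the same way.
    from functools import lru_cache

    cache = lru_cache(maxsize=None)

CALLS = {"n": 0}  # tracks how many times the body really executes


@cache
def load_example_dags():
    """Hypothetical expensive loader; the result is reused after the first call."""
    CALLS["n"] += 1
    return ["example_dag_a", "example_dag_b"]


first = load_example_dags()
second = load_example_dags()  # served from the cache, body not re-run
```

Using `lru_cache(maxsize=None)` directly would also work on every supported interpreter and avoids the conditional import.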
@tatiana tatiana added this to the 1.3.0 milestone Dec 4, 2023
tatiana added a commit that referenced this pull request Dec 5, 2023
[Justin Bandoro](https://www.linkedin.com/in/justin-bandoro-592b14a7/)
(@jbandoro) is a Data Engineer at Kevala Inc. He's based in San
Francisco (USA) and has been an early adopter of Cosmos, using it
regularly at his company.

Not only has he been using Cosmos since the early stages, but he has
consistently improved Cosmos since January 2023:
![Screenshot 2023-12-04 at 16 28
29](https://github.com/astronomer/astronomer-cosmos/assets/272048/43197938-d1ab-431f-b101-b6026e5cd3ab)

Some of his contributions include new features, code quality,
documentation and overall improvements. Some examples:
* Speed up integration tests by 67% in #732
* Prevent override of dbt profile fields in #702
* Add support for env vars in `RenderConfig` in #690
* Use symbolic links to run local tasks, avoiding copying potentially
huge dbt project folders in #660
* Improve documentation in #638
* Automated and improved the code complexity checks in #629
* Added `DbtDocsGCSOperator` in #616 
* Added support for Python 3.7 in #88 and #214

Additionally, he has been interacting with users in the #airflow-dbt
Slack channel in a very collaborative and supportive way.

We want to promote him to Cosmos committer and maintainer for all of
these reasons, recognising his constant efforts and achievements in our
community. Thank you very much, @jbandoro!
tatiana added a commit that referenced this pull request Dec 7, 2023
Features

* Add ProfileMapping for Vertica by @perttus in #540 and #688
* Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in #649
* Add support to select using (some) graph operators when using LoadMode.CUSTOM and LoadMode.DBT_MANIFEST by @tatiana in #728
* Add cosmos/propagate_logs Airflow config support for disabling log pr… by @agreenburg in #648
* Add operator_args full_refresh as a templated field by @joppevos in #623
* Expose environment variables and dbt variables in ProjectConfig by @jbandoro in #735

Enhancements

* Make Pydantic an optional dependency by @pixie79 in #736
* Create a symbolic link to dbt_packages when dbt_deps is False when using LoadMode.DBT_LS by @DanMawdsleyBA in #730
* Support no profile_config for ExecutionMode.KUBERNETES and ExecutionMode.DOCKER by @MrBones757 and @tatiana in #681 and #731
* Add aws_session_token for Athena mapping by @benjamin-awd in #663

Others

* Replace flake8 for Ruff by @joppevos in #743
* Reduce code complexity to 8 by @joppevos in #738
* Update conflict matrix between Airflow and dbt versions by @tatiana in #731
* Speed up integration tests by @jbandoro in #732
@tatiana tatiana mentioned this pull request Dec 7, 2023
jbandoro pushed a commit that referenced this pull request Dec 7, 2023
**Features**

* Add `ProfileMapping` for Snowflake encrypted private key path by
@ivanstillfront in #608
* Add support for Snowflake encrypted private key environment variable
by @DanMawdsleyBA in #649
* Add `DbtDocsGCSOperator` for uploading dbt docs to GCS by @jbandoro in
#616
* Add support to select using (some) graph operators when using
`LoadMode.CUSTOM` and `LoadMode.DBT_MANIFEST` by @tatiana in #728
* Add cosmos/propagate_logs Airflow config support for disabling log
propagation by @agreenburg in #648
* Add operator_args ``full_refresh`` as a templated field by @joppevos
in #623
* Expose environment variables and dbt variables in ``ProjectConfig`` by
@jbandoro in #735

**Enhancements**

* Make Pydantic an optional dependency by @pixie79 in #736
* Create a symbolic link to `dbt_packages` when `dbt_deps` is False when
using `LoadMode.DBT_LS` by @DanMawdsleyBA in #730
* Support no `profile_config` for `ExecutionMode.KUBERNETES` and
`ExecutionMode.DOCKER` by @MrBones757 and @tatiana in #681 and #731
* Add `aws_session_token` for Athena mapping by @benjamin-awd in #663

**Others**

* Replace flake8 for Ruff by @joppevos in #743
* Reduce code complexity to 8 by @joppevos in #738
* Update conflict matrix between Airflow and dbt versions by @tatiana in
#731
* Speed up integration tests by @jbandoro in #732
@tatiana tatiana mentioned this pull request Jan 4, 2024
tatiana added a commit that referenced this pull request Jan 4, 2024
**Features**

* Add new parsing method ``LoadMode.DBT_LS_FILE`` by @woogakoki in #733
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls-file)).
* Add support to select using (some) graph operators when using
``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in #728
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html#using-select-and-exclude))
* Add support for dbt ``selector`` arg for DAG parsing by @jbandoro in
#755,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#render-config)).
* Add ``ProfileMapping`` for Vertica by @perttus in #540, #688 and #741,
as
([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/VerticaUserPassword.html)).
* Add ``ProfileMapping`` for Snowflake encrypted private key path by
@ivanstillfront in #608, as ([documentation](
https://astronomer.github.io/astronomer-cosmos/profiles/SnowflakeEncryptedPrivateKeyFilePem.html)).
* Add support for Snowflake encrypted private key environment variable
by @DanMawdsleyBA in #649
* Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro
in #616,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/generating-docs.html#upload-to-gcs)).
* Add cosmos/propagate_logs Airflow config support for disabling log
propagation by @agreenburg in #648,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/logging.html)).
* Add operator_args ``full_refresh`` as a templated field by @joppevos
in #623
* Expose environment variables and dbt variables in ``ProjectConfig`` by
@jbandoro in #735
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/project-config.html#project-config-example)).
* Support disabling event tracking when using Cosmos profile mapping by
@jbandoro in #768,
([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/index.html#disabling-dbt-event-tracking)).

**Enhancements**

* Make Pydantic an optional dependency by @pixie79 in #736
* Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False
when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in #730
* Add ``aws_session_token`` for Athena mapping by @benjamin-awd in #663
* Retrieve temporary credentials from ``conn_id`` for Athena by @octiva
in #758
* Extend ``DbtDocsLocalOperator`` with static flag by @joppevos  in #759

**Bug fixes**

* Remove Pydantic upper version restriction so Cosmos can be used with
Airflow 2.8 by @jlaneve in #772

**Others**

* Replace flake8 for Ruff by @joppevos in #743
* Reduce code complexity to 8 by @joppevos in #738
* Speed up integration tests by @jbandoro in #732
* Fix README quickstart link by @RNHTTR in #776
* Add package location to work with hatchling 1.19.0 by @jbandoro in
#761
* Fix type check error in ``DbtKubernetesBaseOperator.build_env_args``
by @jbandoro in #766
* Improve ``DBT_MANIFEST`` documentation by @dwreeves in #757
* Update conflict matrix between Airflow and dbt versions by @tatiana in
#731 and #779
* pre-commit updates in #775, #770, #762
ykuc pushed a commit to ykuc/astronomer-cosmos that referenced this pull request Jan 11, 2024
**Features**

* Add new parsing method ``LoadMode.DBT_LS_FILE`` by @woogakoki in astronomer#733
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls-file)).
* Add support to select using (some) graph operators when using
``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in astronomer#728
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html#using-select-and-exclude))
* Add support for dbt ``selector`` arg for DAG parsing by @jbandoro in
astronomer#755,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#render-config)).
* Add ``ProfileMapping`` for Vertica by @perttus in astronomer#540, astronomer#688 and astronomer#741,
as
([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/VerticaUserPassword.html)).
* Add ``ProfileMapping`` for Snowflake encrypted private key path by
@ivanstillfront in astronomer#608, as ([documentation](
https://astronomer.github.io/astronomer-cosmos/profiles/SnowflakeEncryptedPrivateKeyFilePem.html)).
* Add support for Snowflake encrypted private key environment variable
by @DanMawdsleyBA in astronomer#649
* Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro
in astronomer#616,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/generating-docs.html#upload-to-gcs)).
* Add cosmos/propagate_logs Airflow config support for disabling log
propagation by @agreenburg in astronomer#648,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/logging.html)).
* Add operator_args ``full_refresh`` as a templated field by @joppevos
in astronomer#623
* Expose environment variables and dbt variables in ``ProjectConfig`` by
@jbandoro in astronomer#735
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/project-config.html#project-config-example)).
* Support disabling event tracking when using Cosmos profile mapping by
@jbandoro in astronomer#768,
([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/index.html#disabling-dbt-event-tracking)).

**Enhancements**

* Make Pydantic an optional dependency by @pixie79 in astronomer#736
* Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False
when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in astronomer#730
* Add ``aws_session_token`` for Athena mapping by @benjamin-awd in astronomer#663
* Retrieve temporary credentials from ``conn_id`` for Athena by @octiva
in astronomer#758
* Extend ``DbtDocsLocalOperator`` with static flag by @joppevos  in astronomer#759

**Bug fixes**

* Remove Pydantic upper version restriction so Cosmos can be used with
Airflow 2.8 by @jlaneve in astronomer#772

**Others**

* Replace flake8 for Ruff by @joppevos in astronomer#743
* Reduce code complexity to 8 by @joppevos in astronomer#738
* Speed up integration tests by @jbandoro in astronomer#732
* Fix README quickstart link by @RNHTTR in astronomer#776
* Add package location to work with hatchling 1.19.0 by @jbandoro in
astronomer#761
* Fix type check error in ``DbtKubernetesBaseOperator.build_env_args``
by @jbandoro in astronomer#766
* Improve ``DBT_MANIFEST`` documentation by @dwreeves in astronomer#757
* Update conflict matrix between Airflow and dbt versions by @tatiana in
astronomer#731 and astronomer#779
* pre-commit updates in astronomer#775, astronomer#770, astronomer#762
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
Speed up the integration tests by caching the dag bag result.

Previously, each parametrized DAG run test reparsed all of the DAGs,
and parsing all of the Cosmos example DAGs takes a non-trivial amount
of time.

On my local machine, the total time went from 1616s to 540s, which will
save a good amount of GH minutes 😃.
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
Fix exceptions raised on some versions of Python (e.g. 3.8) after
prematurely merging astronomer#732, which led to the main branch becoming red
```
../../../.local/share/hatch/env/virtual/astronomer-cosmos/Za_bFbg4/tests.py3.8-2.5/lib/python3.8/site-packages/_pytest/assertion/rewrite.py:186: in exec_module
    exec(co, module.__dict__)
tests/test_example_dags_no_connections.py:4: in <module>
    from functools import cache
E   ImportError: cannot import name 'cache' from 'functools'
```
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
[Justin Bandoro](https://www.linkedin.com/in/justin-bandoro-592b14a7/)
(@jbandoro) is a Data Engineer at Kevala Inc. He's based in San
Francisco (USA) and has been an early adopter of Cosmos, using it
regularly at his company.

Not only has he been using Cosmos since the early stages, but he has
consistently improved Cosmos since January 2023:
![Screenshot 2023-12-04 at 16 28
29](https://github.com/astronomer/astronomer-cosmos/assets/272048/43197938-d1ab-431f-b101-b6026e5cd3ab)

Some of his contributions include new features, code quality,
documentation and overall improvements. Some examples:
* Speed up integration tests by 67% in astronomer#732
* Prevent override of dbt profile fields in astronomer#702
* Add support for env vars in `RenderConfig` in astronomer#690
* Use symbolic links to run local tasks, avoiding copying potentially
huge dbt project folders in astronomer#660
* Improve documentation in astronomer#638
* Automated and improved the code complexity checks in astronomer#629
* Added `DbtDocsGCSOperator` in astronomer#616 
* Added support for Python 3.7 in astronomer#88 and astronomer#214

Additionally, he has been interacting with users in the #airflow-dbt
Slack channel in a very collaborative and supportive way.

We want to promote him to Cosmos committer and maintainer for all of
these reasons, recognising his constant efforts and achievements in our
community. Thank you very much, @jbandoro!
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
**Features**

* Add `ProfileMapping` for Snowflake encrypted private key path by
@ivanstillfront in astronomer#608
* Add support for Snowflake encrypted private key environment variable
by @DanMawdsleyBA in astronomer#649
* Add `DbtDocsGCSOperator` for uploading dbt docs to GCS by @jbandoro in
astronomer#616
* Add support to select using (some) graph operators when using
`LoadMode.CUSTOM` and `LoadMode.DBT_MANIFEST` by @tatiana in astronomer#728
* Add cosmos/propagate_logs Airflow config support for disabling log
propagation by @agreenburg in astronomer#648
* Add operator_args ``full_refresh`` as a templated field by @joppevos
in astronomer#623
* Expose environment variables and dbt variables in ``ProjectConfig`` by
@jbandoro in astronomer#735

**Enhancements**

* Make Pydantic an optional dependency by @pixie79 in astronomer#736
* Create a symbolic link to `dbt_packages` when `dbt_deps` is False when
using `LoadMode.DBT_LS` by @DanMawdsleyBA in astronomer#730
* Support no `profile_config` for `ExecutionMode.KUBERNETES` and
`ExecutionMode.DOCKER` by @MrBones757 and @tatiana in astronomer#681 and astronomer#731
* Add `aws_session_token` for Athena mapping by @benjamin-awd in astronomer#663

**Others**

* Replace flake8 for Ruff by @joppevos in astronomer#743
* Reduce code complexity to 8 by @joppevos in astronomer#738
* Update conflict matrix between Airflow and dbt versions by @tatiana in
astronomer#731
* Speed up integration tests by @jbandoro in astronomer#732
arojasb3 pushed a commit to arojasb3/astronomer-cosmos that referenced this pull request Jul 14, 2024
**Features**

* Add new parsing method ``LoadMode.DBT_LS_FILE`` by @woogakoki in astronomer#733
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls-file)).
* Add support to select using (some) graph operators when using
``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in astronomer#728
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html#using-select-and-exclude))
* Add support for dbt ``selector`` arg for DAG parsing by @jbandoro in
astronomer#755,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#render-config)).
* Add ``ProfileMapping`` for Vertica by @perttus in astronomer#540, astronomer#688 and astronomer#741,
as
([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/VerticaUserPassword.html)).
* Add ``ProfileMapping`` for Snowflake encrypted private key path by
@ivanstillfront in astronomer#608, as ([documentation](
https://astronomer.github.io/astronomer-cosmos/profiles/SnowflakeEncryptedPrivateKeyFilePem.html)).
* Add support for Snowflake encrypted private key environment variable
by @DanMawdsleyBA in astronomer#649
* Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro
in astronomer#616,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/generating-docs.html#upload-to-gcs)).
* Add cosmos/propagate_logs Airflow config support for disabling log
propagation by @agreenburg in astronomer#648,
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/logging.html)).
* Add operator_args ``full_refresh`` as a templated field by @joppevos
in astronomer#623
* Expose environment variables and dbt variables in ``ProjectConfig`` by
@jbandoro in astronomer#735
([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/project-config.html#project-config-example)).
* Support disabling event tracking when using Cosmos profile mapping by
@jbandoro in astronomer#768,
([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/index.html#disabling-dbt-event-tracking)).

**Enhancements**

* Make Pydantic an optional dependency by @pixie79 in astronomer#736
* Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False
when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in astronomer#730
* Add ``aws_session_token`` for Athena mapping by @benjamin-awd in astronomer#663
* Retrieve temporary credentials from ``conn_id`` for Athena by @octiva
in astronomer#758
* Extend ``DbtDocsLocalOperator`` with static flag by @joppevos  in astronomer#759

**Bug fixes**

* Remove Pydantic upper version restriction so Cosmos can be used with
Airflow 2.8 by @jlaneve in astronomer#772

**Others**

* Replace flake8 for Ruff by @joppevos in astronomer#743
* Reduce code complexity to 8 by @joppevos in astronomer#738
* Speed up integration tests by @jbandoro in astronomer#732
* Fix README quickstart link by @RNHTTR in astronomer#776
* Add package location to work with hatchling 1.19.0 by @jbandoro in
astronomer#761
* Fix type check error in ``DbtKubernetesBaseOperator.build_env_args``
by @jbandoro in astronomer#766
* Improve ``DBT_MANIFEST`` documentation by @dwreeves in astronomer#757
* Update conflict matrix between Airflow and dbt versions by @tatiana in
astronomer#731 and astronomer#779
* pre-commit updates in astronomer#775, astronomer#770, astronomer#762

Labels

area:performance (Related to performance, like memory usage, CPU usage, speed, etc), size:XS (This PR changes 0-9 lines, ignoring generated files.)


3 participants