Add new parsing method LoadMode.DBT_LS_FILE#733
Conversation
👷 Deploy Preview for amazing-pothos-a3bca0 processing.
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #733 +/- ##
==========================================
+ Coverage 93.18% 93.23% +0.04%
==========================================
Files 55 55
Lines 2451 2469 +18
==========================================
+ Hits 2284 2302 +18
Misses 167 167 ☔ View full report in Codecov by Sentry. |
tatiana
left a comment
There was a problem hiding this comment.
Hi @woogakoki, thanks a lot for the contribution. This looks great!
Your change will help with the performance for users who like the precision of the DBT_LS load method but don't want to have the pay the cost of running it during DAG parsing/task execution.
Three main feedback points:
- Please, could you address the pre-commit errors
- The coverage checks are also failing, if you could increase the test coverage by addressing the PR comments created by Codecov, it would be amazing!
- Please, could you change one of the existing example DAGs in
dev/dagsto use this parsing method? It could perhaps be this one: https://github.com/astronomer/astronomer-cosmos/blob/main/dev/dags/user_defined_profile.py
I left some minor comments inline.
LoadMode.DBT_LS_FILE
|
Hey @woogakoki, we're super close to getting this merged! |
Hi @tatiana, thanks for the heads up. Will try to fix it over the course of today then! |
Unfortunatly not as I only tested it locally |
|
@tatiana The Type-Check test is failing due to some issue in cosmos/operators/kubernetes.py. Is that known and can be ignored? |
|
@woogakoki Thanks for addressing the feedback. Yes, the issue with the type check was solved by @jbandoro in #766. Once the checks pass, I'll merge your PR! |
|
@woogakoki when you get a chance, please, check the failing tests: Also, once you rebase, we should see the type check passing 🤞 |
…into feat/add_new_loadmode
|
Amazing work, @woogakoki , thank you very much for getting this work through |
Thank you for the support @tatiana! Super happy it went through this year already. |
**Features** * Add new parsing method ``LoadMode.DBT_LS_FILE`` by @woogakoki in #733 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls-file)). * Add support to select using (some) graph operators when using ``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in #728 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html#using-select-and-exclude)) * Add support for dbt ``selector`` arg for DAG parsing by @jbandoro in #755, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#render-config)). * Add ``ProfileMapping`` for Vertica by @perttus in #540, #688 and #741, as ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/VerticaUserPassword.html)). * Add ``ProfileMapping`` for Snowflake encrypted private key path by @ivanstillfront in #608, as ([documentation]( https://astronomer.github.io/astronomer-cosmos/profiles/SnowflakeEncryptedPrivateKeyFilePem.html)). * Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in #649 * Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro in #616, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/generating-docs.html#upload-to-gcs)). * Add cosmos/propagate_logs Airflow config support for disabling log propagation by @agreenburg in #648, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/logging.html)). * Add operator_args ``full_refresh`` as a templated field by @joppevos in #623 * Expose environment variables and dbt variables in ``ProjectConfig`` by @jbandoro in #735 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/project-config.html#project-config-example)). * Support disabling event tracking when using Cosmos profile mapping by @jbandoro in #768, ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/index.html#disabling-dbt-event-tracking)). **Enhancements** * Make Pydantic an optional dependency by @pixie79 in #736 * Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in #730 * Add ``aws_session_token`` for Athena mapping by @benjamin-awd in #663 * Retrieve temporary credentials from ``conn_id`` for Athena by @octiva in #758 * Extend ``DbtDocsLocalOperator`` with static flag by @joppevos in #759 **Bug fixes** * Remove Pydantic upper version restriction so Cosmos can be used with Airflow 2.8 by @jlaneve in #772 **Others** * Replace flake8 for Ruff by @joppevos in #743 * Reduce code complexity to 8 by @joppevos in #738 * Speed up integration tests by @jbandoro in #732 * Fix README quickstart link in by @RNHTTR in #776 * Add package location to work with hatchling 1.19.0 by @jbandoro in #761 * Fix type check error in ``DbtKubernetesBaseOperator.build_env_args`` by @jbandoro in #766 * Improve ``DBT_MANIFEST`` documentation by @dwreeves in #757 * Update conflict matrix between Airflow and dbt versions by @tatiana in #731 and #779 * pre-commit updates in #775, #770, #762
**Features** * Add new parsing method ``LoadMode.DBT_LS_FILE`` by @woogakoki in astronomer#733 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls-file)). * Add support to select using (some) graph operators when using ``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in astronomer#728 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html#using-select-and-exclude)) * Add support for dbt ``selector`` arg for DAG parsing by @jbandoro in astronomer#755, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#render-config)). * Add ``ProfileMapping`` for Vertica by @perttus in astronomer#540, astronomer#688 and astronomer#741, as ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/VerticaUserPassword.html)). * Add ``ProfileMapping`` for Snowflake encrypted private key path by @ivanstillfront in astronomer#608, as ([documentation]( https://astronomer.github.io/astronomer-cosmos/profiles/SnowflakeEncryptedPrivateKeyFilePem.html)). * Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in astronomer#649 * Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro in astronomer#616, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/generating-docs.html#upload-to-gcs)). * Add cosmos/propagate_logs Airflow config support for disabling log propagation by @agreenburg in astronomer#648, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/logging.html)). * Add operator_args ``full_refresh`` as a templated field by @joppevos in astronomer#623 * Expose environment variables and dbt variables in ``ProjectConfig`` by @jbandoro in astronomer#735 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/project-config.html#project-config-example)). * Support disabling event tracking when using Cosmos profile mapping by @jbandoro in astronomer#768, ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/index.html#disabling-dbt-event-tracking)). **Enhancements** * Make Pydantic an optional dependency by @pixie79 in astronomer#736 * Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in astronomer#730 * Add ``aws_session_token`` for Athena mapping by @benjamin-awd in astronomer#663 * Retrieve temporary credentials from ``conn_id`` for Athena by @octiva in astronomer#758 * Extend ``DbtDocsLocalOperator`` with static flag by @joppevos in astronomer#759 **Bug fixes** * Remove Pydantic upper version restriction so Cosmos can be used with Airflow 2.8 by @jlaneve in astronomer#772 **Others** * Replace flake8 for Ruff by @joppevos in astronomer#743 * Reduce code complexity to 8 by @joppevos in astronomer#738 * Speed up integration tests by @jbandoro in astronomer#732 * Fix README quickstart link in by @RNHTTR in astronomer#776 * Add package location to work with hatchling 1.19.0 by @jbandoro in astronomer#761 * Fix type check error in ``DbtKubernetesBaseOperator.build_env_args`` by @jbandoro in astronomer#766 * Improve ``DBT_MANIFEST`` documentation by @dwreeves in astronomer#757 * Update conflict matrix between Airflow and dbt versions by @tatiana in astronomer#731 and astronomer#779 * pre-commit updates in astronomer#775, astronomer#770, astronomer#762
Adds a new load method, which is in between manifest and dbt_ls. This aims to have a dbt ls file with the output prepared and only use the parsing logic of the dbt_ls load method. This should increase performance compared to using dbt_ls. However, this method needs one output file for each taskgroup build.
**Features** * Add new parsing method ``LoadMode.DBT_LS_FILE`` by @woogakoki in astronomer#733 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/parsing-methods.html#dbt-ls-file)). * Add support to select using (some) graph operators when using ``LoadMode.CUSTOM`` and ``LoadMode.DBT_MANIFEST`` by @tatiana in astronomer#728 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/selecting-excluding.html#using-select-and-exclude)) * Add support for dbt ``selector`` arg for DAG parsing by @jbandoro in astronomer#755, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/render-config.html#render-config)). * Add ``ProfileMapping`` for Vertica by @perttus in astronomer#540, astronomer#688 and astronomer#741, as ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/VerticaUserPassword.html)). * Add ``ProfileMapping`` for Snowflake encrypted private key path by @ivanstillfront in astronomer#608, as ([documentation]( https://astronomer.github.io/astronomer-cosmos/profiles/SnowflakeEncryptedPrivateKeyFilePem.html)). * Add support for Snowflake encrypted private key environment variable by @DanMawdsleyBA in astronomer#649 * Add ``DbtDocsGCSOperator`` for uploading dbt docs to GCS by @jbandoro in astronomer#616, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/generating-docs.html#upload-to-gcs)). * Add cosmos/propagate_logs Airflow config support for disabling log propagation by @agreenburg in astronomer#648, ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/logging.html)). * Add operator_args ``full_refresh`` as a templated field by @joppevos in astronomer#623 * Expose environment variables and dbt variables in ``ProjectConfig`` by @jbandoro in astronomer#735 ([documentation](https://astronomer.github.io/astronomer-cosmos/configuration/project-config.html#project-config-example)). * Support disabling event tracking when using Cosmos profile mapping by @jbandoro in astronomer#768, ([documentation](https://astronomer.github.io/astronomer-cosmos/profiles/index.html#disabling-dbt-event-tracking)). **Enhancements** * Make Pydantic an optional dependency by @pixie79 in astronomer#736 * Create a symbolic link to ``dbt_packages`` when ``dbt_deps`` is False when using ``LoadMode.DBT_LS`` by @DanMawdsleyBA in astronomer#730 * Add ``aws_session_token`` for Athena mapping by @benjamin-awd in astronomer#663 * Retrieve temporary credentials from ``conn_id`` for Athena by @octiva in astronomer#758 * Extend ``DbtDocsLocalOperator`` with static flag by @joppevos in astronomer#759 **Bug fixes** * Remove Pydantic upper version restriction so Cosmos can be used with Airflow 2.8 by @jlaneve in astronomer#772 **Others** * Replace flake8 for Ruff by @joppevos in astronomer#743 * Reduce code complexity to 8 by @joppevos in astronomer#738 * Speed up integration tests by @jbandoro in astronomer#732 * Fix README quickstart link in by @RNHTTR in astronomer#776 * Add package location to work with hatchling 1.19.0 by @jbandoro in astronomer#761 * Fix type check error in ``DbtKubernetesBaseOperator.build_env_args`` by @jbandoro in astronomer#766 * Improve ``DBT_MANIFEST`` documentation by @dwreeves in astronomer#757 * Update conflict matrix between Airflow and dbt versions by @tatiana in astronomer#731 and astronomer#779 * pre-commit updates in astronomer#775, astronomer#770, astronomer#762
Description
Adds a new load method which is in between manifest and dbt_ls. Goal of this is to have a dbt ls file with the output prepared and only use the parsing logic of dbt_ls load method. This should increase performance compared to using dbt_ls.
However this method needs one output file for each taskgroup build.
Breaking Change?
No breaking changes, it just adds a new load method on top but leaves the rest as is.
Checklist