Skip to content

Refactor kubernetes warning callback handling#1681

Merged
pankajkoti merged 26 commits into
astronomer:mainfrom
canbekley:improve-dbt-test-warnings-kubernetes
Apr 30, 2025
Merged

Refactor kubernetes warning callback handling#1681
pankajkoti merged 26 commits into
astronomer:mainfrom
canbekley:improve-dbt-test-warnings-kubernetes

Conversation

@canbekley
Copy link
Copy Markdown
Contributor

@canbekley canbekley commented Apr 19, 2025

Description

dbt warning callbacks via kubernetes operator currently require manipulation of pod lifetime parameters that are used within the parent KubernetesPodOperator execute method and a manual pod cleanup step within the DbtWarningKubernetesOperator is created to replace parent logic. This is highly prone to errors (see issue #1628) and also difficult to maintain, as parent implementations might evolve. Instead of utilizing on_success_callback and on_failure_callback - which are usually called after pod deletion - the proposed implementation makes use of the on_pod_completion method in KubernetesPodOperatorCallback, which is called before pod teardown, and therefore can be used to scrape log results in time. This approach doesn't need to mess with pod lifetime configurations at all.

Related Issue(s)

closes #1628
closes #1579
closes #1702

Breaking Change?

none

Checklist

- [ ] I have made corresponding changes to the documentation (if required)

  • I have added tests that prove my fix is effective or that my feature works

@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 19, 2025
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 19, 2025

Deploy Preview for sunny-pastelito-5ecb04 ready!

Name Link
🔨 Latest commit 7c8019f
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/6811cb8da450330008ef6e0c
😎 Deploy Preview https://deploy-preview-1681--sunny-pastelito-5ecb04.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@dosubot dosubot Bot added the execution:kubernetes Related to Kubernetes execution environment label Apr 19, 2025
@canbekley
Copy link
Copy Markdown
Contributor Author

canbekley commented Apr 19, 2025

apparently generic callbacks were added with kubernetes airflow provider version 7.14.0. Some of the tests use the provider version 7.11.0, which doesn't include this functionality yet. any recommendation how to proceed with this? can we require apache-airflow-providers-cncf-kubernetes >= 7.14.0 ?
Edit: I also see that the provider version 7.14.0 requires apache-airflow >=2.6.0.

Copy link
Copy Markdown
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @canbekley thanks a lot for your contribution!

It seems some of the CI checks are failing, would you be able to take a look into those?

We're planning to release Cosmos 1.10 by the 30th April and it would be great to have this as part of that release.

@canbekley
Copy link
Copy Markdown
Contributor Author

canbekley commented Apr 22, 2025

Hi @tatiana, they are failing because the callback implementation is relying on a kubernetes provider feature that was introduced with apache-airflow-providers-cncf-kubernetes 7.14.0. That provider version requires apache-airflow >= 2.6.0. In our integration tests, we test against apache-airflow 2.4 to 2.10.

I see these options for this mr:

  1. require apache-airflow-providers-cncf-kubernetes >= 7.14.0 (and indirectly apache-airflow >= 2.6.0) for kubernetes package dependencies and kubernetes integration tests. (started with this already, but haven't touched the integration test part yet)
  2. keeping the current implementation for warning callbacks for backward compability, switching to new implementation dynamically. (might be messy to maintain)
  3. keeping only the new implementation and discontinue support for kubernetes warning callbacks for apache-airflow-providers-cncf-kubernetes < 7.14.0. (could be quite disruptive/unexpected to users)

Maybe there are other ways too. I'd appreciate your opinion on this.

@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Apr 28, 2025
@canbekley
Copy link
Copy Markdown
Contributor Author

@pankahkoti coverage should be good now

@pankajkoti
Copy link
Copy Markdown
Contributor

hi @canbekley, thanks. Unfortunately, we have got in some conflicts in the test, would you please be able to check those and resolve? While working on adding Airflow 3 tests to be run in CI, we observed few of these tests (potentially source code) are not compatible with Airflow 3 & the corresponding provider version. Hence, we're skipped running those tests are planning to tackle them one by one to add compatibility with Airflow 3. We would like to see once we merge main into your branch, if your PR helps in un-skipping those some of the tests :)

@canbekley
Copy link
Copy Markdown
Contributor Author

canbekley commented Apr 29, 2025

regarding the failed tests: there was a weird change in the merge_context() function, which initializes a new Context when the passed context was empty. but that newly created context never left the function scope. test should be fixed now.

@pankajkoti
Copy link
Copy Markdown
Contributor

apologies @canbekley looks like we got in some conflicts after merging in main. We previously merged a fix in PR #1648 that was needed in and this is leading to conflicts in our PR here. If possible, could you please take a look at those?

@canbekley
Copy link
Copy Markdown
Contributor Author

@pankajkoti no problem! merged main back to the feature branch

Copy link
Copy Markdown
Contributor

@pankajkoti pankajkoti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @canbekley for cleaning up this implementation and making it easier to maintain this piece. @pankajastro and I were also able to do a do a quick sanity that the callback functionality is working well for warnings. And I believe you'd have tested it extensively on your end too since you're an active user of this execution mode.

I believe this PR also addresses #1702 which I have linked in the PR description as closes. We will go ahead with the merge and come back if there's any feedback after the release.

@pankajkoti pankajkoti merged commit bfd3662 into astronomer:main Apr 30, 2025
92 checks passed
@pankajkoti pankajkoti mentioned this pull request Apr 30, 2025
tatiana added a commit that referenced this pull request May 1, 2025
Features

* Airflow 3 support
* Support running ``dbt deps`` incrementally to pre-defined
``dbt_packages`` by @tatiana in #1668 and #1670
* Add ``DuckDB`` profile mapping by @prithvijitguha and @pankajastro in
#1553
* Implement DBT exposure selector by ghjklw #1717

Bug Fixes

* Fix ``test_indirect_selection`` flag to be propagated in case of
``TestBehavior.BUILD`` by @corsettigyg in #1663
* Fix ``select`` clause in the case of detached tests by @anyapriya in
#1680
* Operator argument fixes by @johnhoran in #1648


Airflow 3 Support

* Support rendering DbtDag in Airflow 3 by @tatiana and @ashb in #1657
* Refactor Rendered Task Instance Fields (RTIF) handling for Airflow 2.x
and 3.x by @pankajkoti in #1661
* Run cosmos operator in Airflow 3 by @pankajastro in #1642
* Fix ``python_virtualenv.prepare_env`` top-level import for Airflow 3
by @pankajkoti in #1678
* Fix Variable not found issue in Airflow 3 by @tatiana in #1684
* Disable CosmosPlugin on Airflow 3 setup by @pankajkoti in #1692, #1698
* Use ``schedule`` param in example DAGs instead of the 2.10 deprecated
and 3.0 removed ``schedule_interval`` by @pankajkoti in #1701
* Ensure ``virtualenv_dir`` path exists by @pankajkoti in #1724
* Support emitting Assets with Airflow 3 by @tatiana in #1713
* Add docs on Airflow 3 compatibility by @pankajkoti and @tatiana in
#1731
* Introduce, test and document asset/dataset breaking change by @tatiana
in #1672
* Improve dataset/asset driven scheduling documentation by @tatiana in
#1729

Enhancements

* Allow multiple callbacks by @corsettigyg #1693
* Refactor kubernetes warning callback handling by @canbekley in #1681

Documentation

* Add documentation related to ``copy_dbt_packages`` by @tatiana in
#1671
* Make wording and command consistent in the contributing doc by
@pankajkoti in #1697
* Add MonteCarlo callback example for importing dbt artifacts by
@corsettigyg #1695
* Change async feature to be non-experimental by @tatiana in #1732

Others

* Add sample ``dbt_packages`` to validate incremental ``dbt deps`` by
@tatiana in #1669
* Add kubernetes execution mode example in Airflow 3 by @pankajastro in
#1667
* Check only major version until Airflow 3 stable release by
@pankajastro in #1665
* Install Airflow from main branch by @pankajastro in #1660
* Add dev tool for Airflow 3 by @pankajastro and @tatiana in #1627
* Improve Airflow 3 tooling by @pankajastro in #1656
* Skip associating ``openlineage_events_completes`` to ``ti`` in Airflow
3 by @pankajkoti in #1662
* Add .gitignore file for the scripts/airflow3 directory by @pankajkoti
in #1658
* Remove ``original_jaffle_shop`` dbt project by @pankajkoti in #1676
* Fix or ignore type check error by @pankajastro in #1687
* Run virtualenv example with Airflow 3 tooling by @pankajastro in #1686
* Enable running setup/teardown tasks with Async execution DAG with
Airflow 3 tooling by @pankajastro in #1696
* Enable integration tests for the DuckDB adapter by @pankajastro in
#1699
* Add Airflow 3 tests matrix entries in CI by @pankajkoti in #1646
* Use a different way to get tasks count for asserting test_perf_dag by
@pankajkoti in #1714
* Reinstall Airflow 3 dependency on ``pydantic>=2.11`` for dbt adapter
versions 1.6 & 1.9 by @pankajkoti in #1715
* Fix outdated ``echo`` in Airflow 3 tooling script #1700
* Add files not needed for git tracking to .gitignore by @pankajkoti in
#1723
* Use latest minor versions for dbt adapters to get in compatibility
fixes by @pankajkoti in #1719
* Fix Airflow 3 tests raising generate_run_id() takes 0 positional
arguments by @tatiana in #1725
* Fix dataset tests failing in Airflow 3 by @tatiana in #1716
* Enable example DAGs to run in CI that were disabled in PR #1646 by
@pankajkoti in #1726
* Pre-commit updates: #1666, #1653, #1641, #1682, #1720


Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh
<98807258+pankajastro@users.noreply.github.com>

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

execution:kubernetes Related to Kubernetes execution environment size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

4 participants