Refactor kubernetes warning callback handling#1681
Conversation
✅ Deploy Preview for sunny-pastelito-5ecb04 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
|
apparently generic callbacks were added with kubernetes airflow provider version 7.14.0. Some of the tests use the provider version 7.11.0, which doesn't include this functionality yet. any recommendation how to proceed with this? can we require |
tatiana
left a comment
There was a problem hiding this comment.
Hi @canbekley thanks a lot for your contribution!
It seems some of the CI checks are failing, would you be able to take a look into those?
We're planning to release Cosmos 1.10 by the 30th April and it would be great to have this as part of that release.
|
Hi @tatiana, they are failing because the callback implementation is relying on a kubernetes provider feature that was introduced with apache-airflow-providers-cncf-kubernetes 7.14.0. That provider version requires apache-airflow >= 2.6.0. In our integration tests, we test against apache-airflow 2.4 to 2.10. I see these options for this mr:
Maybe there are other ways too. I'd appreciate your opinion on this. |
|
@pankahkoti coverage should be good now |
|
hi @canbekley, thanks. Unfortunately, we have got in some conflicts in the test, would you please be able to check those and resolve? While working on adding Airflow 3 tests to be run in CI, we observed few of these tests (potentially source code) are not compatible with Airflow 3 & the corresponding provider version. Hence, we're skipped running those tests are planning to tackle them one by one to add compatibility with Airflow 3. We would like to see once we merge main into your branch, if your PR helps in un-skipping those some of the tests :) |
|
regarding the failed tests: there was a weird change in the merge_context() function, which initializes a new Context when the passed context was empty. but that newly created context never left the function scope. test should be fixed now. |
|
apologies @canbekley looks like we got in some conflicts after merging in main. We previously merged a fix in PR #1648 that was needed in and this is leading to conflicts in our PR here. If possible, could you please take a look at those? |
|
@pankajkoti no problem! merged main back to the feature branch |
pankajkoti
left a comment
There was a problem hiding this comment.
Thanks a lot @canbekley for cleaning up this implementation and making it easier to maintain this piece. @pankajastro and I were also able to do a do a quick sanity that the callback functionality is working well for warnings. And I believe you'd have tested it extensively on your end too since you're an active user of this execution mode.
I believe this PR also addresses #1702 which I have linked in the PR description as closes. We will go ahead with the merge and come back if there's any feedback after the release.
Features * Airflow 3 support * Support running ``dbt deps`` incrementally to pre-defined ``dbt_packages`` by @tatiana in #1668 and #1670 * Add ``DuckDB`` profile mapping by @prithvijitguha and @pankajastro in #1553 * Implement DBT exposure selector by ghjklw #1717 Bug Fixes * Fix ``test_indirect_selection`` flag to be propagated in case of ``TestBehavior.BUILD`` by @corsettigyg in #1663 * Fix ``select`` clause in the case of detached tests by @anyapriya in #1680 * Operator argument fixes by @johnhoran in #1648 Airflow 3 Support * Support rendering DbtDag in Airflow 3 by @tatiana and @ashb in #1657 * Refactor Rendered Task Instance Fields (RTIF) handling for Airflow 2.x and 3.x by @pankajkoti in #1661 * Run cosmos operator in Airflow 3 by @pankajastro in #1642 * Fix ``python_virtualenv.prepare_env`` top-level import for Airflow 3 by @pankajkoti in #1678 * Fix Variable not found issue in Airflow 3 by @tatiana in #1684 * Disable CosmosPlugin on Airflow 3 setup by @pankajkoti in #1692, #1698 * Use ``schedule`` param in example DAGs instead of the 2.10 deprecated and 3.0 removed ``schedule_interval`` by @pankajkoti in #1701 * Ensure ``virtualenv_dir`` path exists by @pankajkoti in #1724 * Support emitting Assets with Airflow 3 by @tatiana in #1713 * Add docs on Airflow 3 compatibility by @pankajkoti and @tatiana in #1731 * Introduce, test and document asset/dataset breaking change by @tatiana in #1672 * Improve dataset/asset driven scheduling documentation by @tatiana in #1729 Enhancements * Allow multiple callbacks by @corsettigyg #1693 * Refactor kubernetes warning callback handling by @canbekley in #1681 Documentation * Add documentation related to ``copy_dbt_packages`` by @tatiana in #1671 * Make wording and command consistent in the contributing doc by @pankajkoti in #1697 * Add MonteCarlo callback example for importing dbt artifacts by @corsettigyg #1695 * Change async feature to be non-experimental by @tatiana in #1732 Others * Add sample ``dbt_packages`` to validate incremental ``dbt deps`` by @tatiana in #1669 * Add kubernetes execution mode example in Airflow 3 by @pankajastro in #1667 * Check only major version until Airflow 3 stable release by @pankajastro in #1665 * Install Airflow from main branch by @pankajastro in #1660 * Add dev tool for Airflow 3 by @pankajastro and @tatiana in #1627 * Improve Airflow 3 tooling by @pankajastro in #1656 * Skip associating ``openlineage_events_completes`` to ``ti`` in Airflow 3 by @pankajkoti in #1662 * Add .gitignore file for the scripts/airflow3 directory by @pankajkoti in #1658 * Remove ``original_jaffle_shop`` dbt project by @pankajkoti in #1676 * Fix or ignore type check error by @pankajastro in #1687 * Run virtualenv example with Airflow 3 tooling by @pankajastro in #1686 * Enable running setup/teardown tasks with Async execution DAG with Airflow 3 tooling by @pankajastro in #1696 * Enable integration tests for the DuckDB adapter by @pankajastro in #1699 * Add Airflow 3 tests matrix entries in CI by @pankajkoti in #1646 * Use a different way to get tasks count for asserting test_perf_dag by @pankajkoti in #1714 * Reinstall Airflow 3 dependency on ``pydantic>=2.11`` for dbt adapter versions 1.6 & 1.9 by @pankajkoti in #1715 * Fix outdated ``echo`` in Airflow 3 tooling script #1700 * Add files not needed for git tracking to .gitignore by @pankajkoti in #1723 * Use latest minor versions for dbt adapters to get in compatibility fixes by @pankajkoti in #1719 * Fix Airflow 3 tests raising generate_run_id() takes 0 positional arguments by @tatiana in #1725 * Fix dataset tests failing in Airflow 3 by @tatiana in #1716 * Enable example DAGs to run in CI that were disabled in PR #1646 by @pankajkoti in #1726 * Pre-commit updates: #1666, #1653, #1641, #1682, #1720 Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com> Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com> --------- Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Description
dbt warning callbacks via kubernetes operator currently require manipulation of pod lifetime parameters that are used within the parent
KubernetesPodOperatorexecute method and a manual pod cleanup step within theDbtWarningKubernetesOperatoris created to replace parent logic. This is highly prone to errors (see issue #1628) and also difficult to maintain, as parent implementations might evolve. Instead of utilizingon_success_callbackandon_failure_callback- which are usually called after pod deletion - the proposed implementation makes use of the on_pod_completion method inKubernetesPodOperatorCallback, which is called before pod teardown, and therefore can be used to scrape log results in time. This approach doesn't need to mess with pod lifetime configurations at all.Related Issue(s)
closes #1628
closes #1579
closes #1702
Breaking Change?
none
Checklist
- [ ] I have made corresponding changes to the documentation (if required)