Kubernetes Pod Operator convert container_resources#1821
Conversation
✅ Deploy Preview for sunny-pastelito-5ecb04 canceled.
|
There was a problem hiding this comment.
Pull Request Overview
This pull request addresses the issue with container resource specifications by converting a dictionary input into a V1ResourceRequirements object so that the resource limits are correctly applied in Astronomer hosted deployments.
- Added a conversion step for container_resources when provided as a dict.
- Ensured compatibility with the expected type in KubernetesPodOperator.
tatiana
left a comment
There was a problem hiding this comment.
Hi @johnhoran thank you very much for raising this and creating the PR so things work in Astro. Which type of executor are you using? Where is the k8s cluster you're instantiating these pods?
I'm following up with the Astro Airflow Infra team to gain a deeper understanding of why this override occurs and if there are any alternative solutions.
That said, I'm happy with the change you're proposing. Please, could you create unit tests for this use case, to validate the desired behaviour?
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1821 +/- ##
=======================================
Coverage 98.01% 98.01%
=======================================
Files 85 85
Lines 5287 5290 +3
=======================================
+ Hits 5182 5185 +3
Misses 105 105 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
container_resources
Hi @tatiana, We are running in astronomer 13 in a dedicated hosted AWS cluster using the kubernetes executor. |
|
@johnhoran I discussed this matter with fellow Astronomers, and @kaxil raised this question:
This may justify the behaviour you're seeing |
As far as I know it isn't possible to create a deployment without the defaults set. I guess the real question is if this behaviour is intentional or not, like I said, it is arguably correct. |
## Description Slightly odd and I'm not sure if this is the best solution, but this is the code base I have access to, so lets go with it. I'm trying to specify resource limits in cosmos using https://astronomer.github.io/astronomer-cosmos/configuration/operator-args.html#overriding-operator-arguments-per-dbt-node-or-group-of-nodes, specifically I'm adding ```yaml meta: cosmos: operator_kwargs: container_resources: limits: memory: "2000Mi" ``` to a DBT project and when running locally using `astro dev start` this works correctly. However when running in astronomer hosted, it doesn't apply. Researching further if I have an operator with ``` KubernetesPodOperator( ... container_resources={ "limits": {"cpu": "100m", "memory": "100Mi", "ephemeral-storage": "512Mi"}, "requests": { "cpu": "100m", "memory": "100Mi", "ephemeral-storage": "512Mi", }, }, ) ``` then the resources will be overridden with the defaults set on the deployment in astronomer hosted, though running locally it works correctly. If I specify them as ``` KubernetesPodOperator( ... container_resources=k8s.V1ResourceRequirements( limits={"cpu": "100m", "memory": "100Mi", "ephemeral-storage": "512Mi"}, requests={"cpu": "100m", "memory": "100Mi", "ephemeral-storage": "512Mi"}, ), ) ``` then it all works correctly, but this precludes my ability to set them in cosmos as the argument will always be passed as a dictionary there. It seems whatever code astronomer is using on the backend to apply the default container resources doesn't work with anything except `V1ResourceRequirements`. Arguably this is correct behaviour since the signature in the kubernetes pod operator specifies the type to be `V1ResourceRequirements | None`, even though the code itself does work with dictionaries. Anyway the proposed solution is to convert this argument to the correct type before it is passed to the kubernetes pod operator. Thoughts? --------- Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> (cherry picked from commit 389e1bf)
**Bug Fixes** * Fix task instance ``try_number`` attribute for Airflow 3 compatibility by @pankajkoti in #1781 * Fix rendered template override logic when ``should_store_compiled_sql=False`` to restore pre-refactor behaviour by @pankajkoti in #1777 * Fix ``ProfileConfig`` in GCP Cloud Run job execution mode by @ramonvermeulen in #1783 * Fix dbt Docs page height by @1cadumagalhaes in #1793 * Add support to base64 encoded pem in Snowflake profiles by @brunocmartins in #1801 * Allow to disable owner inheritance from dbt into airflow DAG owners by @CorsettiS in #1787 * Fix Kubernetes Pod Operator conversion of ``container_resources`` to ``resources`` by @johnhoran in #1821 * Fix ``dbt deps`` with project level variables by @AlexandrKhabarov in #1822 * Fix source freshness warnings in kubernetes execution mode by @Pawel-Drabczyk in #1859 * Fix: Harden DbtNode against null config/meta by @pankajkoti in #1877 * Fix cache behaviour when DAG name contains "." by @tatiana in #1908 **Documentation** * Fix ``contributing.rst`` docs by @tatiana in #1785 * Fix docs rendering in Airflow 3 Compatibility by @pankajastro in #1790 * Fix typo in ``selecting-excluding.rst`` by @msshroff in #1814 * Update testing behavior file with ``ExecutionMode.KUBERNETES`` by @LuigiCerone in #1813 * Add step to fork repo in contributing guide by @pankajastro in #1808 * Fix ``depends_on`` attribute by @benedikt-buchert in #1837 * Fix character name by @ThePsyjo in #1860 * Update suggested MWAA startup script by @jaklan in #1884 * Make implementation & docs consistent regarding ``use_dataset_airflow3_uri_standard`` by @Anti0ff in #1878 **Others** * Set retries to 0 in example DAGs by @pankajkoti in #1782 * Fix ``test_async_example_dag_without_setup_task`` tests by @pankajastro in #1788 * Fix test hash value for Darwin when using Py 3.12.10 by @tatiana in #1786 * Upgrade Python and Airflow used to run MyPy checks by @tatiana in #1796 * Assert example DAGs' ``DagRunState`` and fix issues by @pankajkoti and @tatiana in #1778 * Update the conflict matrix to include AF 2.10, 2.11 & 3.0 and dbt 1.9 & 1.10 by @tatiana in #1820 * Fix broken CI due to Pydantic conflicts by @tatiana in #1809 * Drop Python 3.8 Support by @pankajastro in #1852 * Add Airflow 2.11 to the test matrix by @tatiana in #1807 * Require Authorize for all jobs on pull requests from external contributors in CI by @pankajkoti in #1861 * Leverage Trusted Publisher Management when publishing PyPI package by @tatiana in #1862 * CI: Add back accidentally deleted python-version matrix for running unit tests by @pankajkoti in #1872 * Remove commented code and fix mypy failures by @pankajkoti in #1876 * Add Zizmor analysis GitHub action by @pankajkoti in #1870 * Catch FlushError on Datasets for Airflow 2.11 dags test by @pankajkoti in #1880 * Add deprecation warning for ``LoadMode.CUSTOM`` parser by @duongphannamhung in #1885 * CI: Add GitHub CodeQL analysis workflow (codeql.yml) by @pankajkoti in #1871 * Resolve 'credential persistence through GitHub Actions artifacts' warnings from Zizmor by @pankajkoti in #1890 * Resolve 'overly broad permissions' warnings from Zizmor by @pankajkoti in #1889 * Resolve Zizmor error alerts for unpinned image references; mark alert for pull_request_target ignored by @pankajkoti in #1888 * Fix broken CI ``tests.py3.11-2.8-1.9:test-integration-setup`` by @tatiana in #1902 * Add dbt-core 1.10 to test matrix by @tatiana in #1767 * Pin package dbt-databricks by @pankajastro in #1909 * Enable matrix test entry for dbt-1.9, python-3.9 and airflow-3.0 tests in CI by @pankajastro in #1900 * Pre-commit updates: #1779, #1795, #1800, #1857, #1863, #1869, #1892, #1901 * Dependabot updates: #1904 Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
**Bug Fixes** * Fix task instance ``try_number`` attribute for Airflow 3 compatibility by @pankajkoti in #1781 * Fix rendered template override logic when ``should_store_compiled_sql=False`` to restore pre-refactor behaviour by @pankajkoti in #1777 * Fix ``ProfileConfig`` in GCP Cloud Run job execution mode by @ramonvermeulen in #1783 * Fix dbt Docs page height by @1cadumagalhaes in #1793 * Add support to base64 encoded pem in Snowflake profiles by @brunocmartins in #1801 * Allow to disable owner inheritance from dbt into airflow DAG owners by @CorsettiS in #1787 * Fix Kubernetes Pod Operator conversion of ``container_resources`` to ``resources`` by @johnhoran in #1821 * Fix ``dbt deps`` with project level variables by @AlexandrKhabarov in #1822 * Fix source freshness warnings in kubernetes execution mode by @Pawel-Drabczyk in #1859 * Fix: Harden DbtNode against null config/meta by @pankajkoti in #1877 * Fix cache behaviour when DAG name contains "." by @tatiana in #1908 **Documentation** * Fix ``contributing.rst`` docs by @tatiana in #1785 * Fix docs rendering in Airflow 3 Compatibility by @pankajastro in #1790 * Fix typo in ``selecting-excluding.rst`` by @msshroff in #1814 * Update testing behavior file with ``ExecutionMode.KUBERNETES`` by @LuigiCerone in #1813 * Add step to fork repo in contributing guide by @pankajastro in #1808 * Fix ``depends_on`` attribute by @benedikt-buchert in #1837 * Fix character name by @ThePsyjo in #1860 * Update suggested MWAA startup script by @jaklan in #1884 * Make implementation & docs consistent regarding ``use_dataset_airflow3_uri_standard`` by @Anti0ff in #1878 **Others** * Set retries to 0 in example DAGs by @pankajkoti in #1782 * Fix ``test_async_example_dag_without_setup_task`` tests by @pankajastro in #1788 * Fix test hash value for Darwin when using Py 3.12.10 by @tatiana in #1786 * Upgrade Python and Airflow used to run MyPy checks by @tatiana in #1796 * Assert example DAGs' ``DagRunState`` and fix issues by @pankajkoti and @tatiana in #1778 * Update the conflict matrix to include AF 2.10, 2.11 & 3.0 and dbt 1.9 & 1.10 by @tatiana in #1820 * Fix broken CI due to Pydantic conflicts by @tatiana in #1809 * Drop Python 3.8 Support by @pankajastro in #1852 * Add Airflow 2.11 to the test matrix by @tatiana in #1807 * Require Authorize for all jobs on pull requests from external contributors in CI by @pankajkoti in #1861 * Leverage Trusted Publisher Management when publishing PyPI package by @tatiana in #1862 * CI: Add back accidentally deleted python-version matrix for running unit tests by @pankajkoti in #1872 * Remove commented code and fix mypy failures by @pankajkoti in #1876 * Add Zizmor analysis GitHub action by @pankajkoti in #1870 * Catch FlushError on Datasets for Airflow 2.11 dags test by @pankajkoti in #1880 * Add deprecation warning for ``LoadMode.CUSTOM`` parser by @duongphannamhung in #1885 * CI: Add GitHub CodeQL analysis workflow (codeql.yml) by @pankajkoti in #1871 * Resolve 'credential persistence through GitHub Actions artifacts' warnings from Zizmor by @pankajkoti in #1890 * Resolve 'overly broad permissions' warnings from Zizmor by @pankajkoti in #1889 * Resolve Zizmor error alerts for unpinned image references; mark alert for pull_request_target ignored by @pankajkoti in #1888 * Fix broken CI ``tests.py3.11-2.8-1.9:test-integration-setup`` by @tatiana in #1902 * Add dbt-core 1.10 to test matrix by @tatiana in #1767 * Pin package dbt-databricks by @pankajastro in #1909 * Enable matrix test entry for dbt-1.9, python-3.9 and airflow-3.0 tests in CI by @pankajastro in #1900 * Pre-commit updates: #1779, #1795, #1800, #1857, #1863, #1869, #1892, #1901 * Dependabot updates: #1904 Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
Description
Slightly odd and I'm not sure if this is the best solution, but this is the code base I have access to, so lets go with it.
I'm trying to specify resource limits in cosmos using https://astronomer.github.io/astronomer-cosmos/configuration/operator-args.html#overriding-operator-arguments-per-dbt-node-or-group-of-nodes, specifically I'm adding
to a DBT project and when running locally using
astro dev startthis works correctly. However when running in astronomer hosted, it doesn't apply. Researching further if I have an operator withthen the resources will be overridden with the defaults set on the deployment in astronomer hosted, though running locally it works correctly. If I specify them as
then it all works correctly, but this precludes my ability to set them in cosmos as the argument will always be passed as a dictionary there.
It seems whatever code astronomer is using on the backend to apply the default container resources doesn't work with anything except
V1ResourceRequirements. Arguably this is correct behaviour since the signature in the kubernetes pod operator specifies the type to beV1ResourceRequirements | None, even though the code itself does work with dictionaries.Anyway the proposed solution is to convert this argument to the correct type before it is passed to the kubernetes pod operator. Thoughts?