
Fix import handling by lazy loading hooks introduced in PR #1109 #1132

Merged
pankajkoti merged 4 commits into astronomer:main from dwreeves:lazy-load-hooks
Aug 2, 2024

Conversation

@dwreeves
Collaborator

@dwreeves dwreeves commented Aug 1, 2024

Description

Making an update to #1109, which introduced module-level imports of optional dependencies. This is inappropriate as it will break if the user does not have them installed, and indeed the user really does not need them installed if they are not relying on them directly.

This PR lazy-loads the imports so that it does not impact users who do not need them.
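A minimal sketch of the lazy-loading idea, not the code from this PR: the optional package is imported only when the feature that needs it is called, so importing the enclosing module never fails for users who lack it. `load_optional` and its `install_hint` parameter are hypothetical names used for illustration.

```python
import importlib
from types import ModuleType


def load_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import an optional dependency at call time instead of at module import time.

    Hypothetical helper: raises an ImportError that tells the user what to install.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is needed for this feature; try: {install_hint}"
        ) from exc


def serialize(payload: list) -> str:
    # The import happens here, on first use, not when this file is imported.
    # `json` stands in for an optional provider package.
    json_module = load_optional("json", "pip install <the relevant extra>")
    return json_module.dumps(payload)
```

Users who never call `serialize` never trigger the import, which is the property the description above is after.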

Additionally, the scheme added for Azure only supported the Azure Data Lake Storage V2 protocol and not the wasb:// protocol (legacy, but I believe also more common), which I have added. The conn_id was also being pulled from the wrong hook for the abfs:// scheme. This is not just nitpicking, as the default conn_id differs per hook: for the abfs:// scheme it is adls_default, whereas for the wasb:// scheme it is wasb_default.
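One way to picture the scheme-to-hook pairing described above, as a hedged sketch rather than the actual contents of `cosmos/constants.py`: each scheme maps to a small function that imports its hook only when called. The function and dictionary names are hypothetical, and the hook import paths are assumptions based on the Airflow Microsoft Azure provider.

```python
from typing import Callable, Dict, Optional
from urllib.parse import urlparse


def _wasb_default_conn_id() -> str:
    # Assumed import path; only evaluated when a wasb:// URI is actually used.
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

    return WasbHook.default_conn_name  # "wasb_default" per the description above


def _adls_default_conn_id() -> str:
    # Assumed import path; only evaluated when an abfs:// URI is actually used.
    from airflow.providers.microsoft.azure.hooks.data_lake import (
        AzureDataLakeStorageV2Hook,
    )

    return AzureDataLakeStorageV2Hook.default_conn_name  # "adls_default"


# Hypothetical table: URI scheme -> getter for that scheme's default conn_id.
DEFAULT_CONN_ID_GETTERS: Dict[str, Callable[[], str]] = {
    "wasb": _wasb_default_conn_id,
    "abfs": _adls_default_conn_id,
}


def default_conn_id_for(uri: str) -> Optional[str]:
    """Return the default conn_id for the URI's scheme, or None if unknown."""
    getter = DEFAULT_CONN_ID_GETTERS.get(urlparse(uri).scheme)
    return getter() if getter is not None else None
```

Because the table stores functions rather than hook classes, building it does not require the Azure provider to be installed.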

This PR should be merged before releasing 1.6 to prevent breaking anyone's environments. 😄 Sorry to sound the bold alarms, but this one is actually pretty important.

@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Aug 1, 2024
@dwreeves dwreeves requested a review from pankajkoti August 1, 2024 04:14
@netlify

netlify Bot commented Aug 1, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit 1918af2
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/66aba85c086eb5000753c291

@dosubot dosubot Bot added the area:dependencies Related to dependencies, like Python packages, library versions, etc label Aug 1, 2024
Contributor

@pankajkoti pankajkoti left a comment


Thanks @dwreeves for trying out from main and finding an issue. Many thanks for going ahead and creating a fix ❤️

LGTM, just have a question regarding the wasb URI format we could use and then I believe we are good to merge it.

Comment thread cosmos/constants.py
@dwreeves
Collaborator Author

dwreeves commented Aug 1, 2024

@pankajkoti Thank you for the very speedy review. As per my comment above I looked into things a little bit more and incorporated your bit of feedback, plus learned a few more things (looks like abfs, abfss and adl can all be supported?)

Contributor

@pankajkoti pankajkoti left a comment


LGTM.

Thanks for identifying this issue and fixing it so promptly & proactively! Appreciate it a lot! I'm glad we've avoided affecting users before the release 😌

Comment thread cosmos/constants.py Outdated
@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label Aug 1, 2024
dwreeves and others added 2 commits August 1, 2024 11:22
@codecov

codecov Bot commented Aug 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.51%. Comparing base (7889fbe) to head (1918af2).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1132   +/-   ##
=======================================
  Coverage   96.50%   96.51%           
=======================================
  Files          64       64           
  Lines        3321     3325    +4     
=======================================
+ Hits         3205     3209    +4     
  Misses        116      116           


@pankajkoti
Contributor

@dwreeves could we merge the PR? Or is there something more you'd like to add here?

@dwreeves
Collaborator Author

dwreeves commented Aug 2, 2024

@pankajkoti It's fine to merge!

@pankajkoti pankajkoti changed the title lazy load hooks Fix import handling by lazy loading hooks introduced in PR #1109 Aug 2, 2024
@pankajkoti pankajkoti merged commit 60148ce into astronomer:main Aug 2, 2024
dwreeves added a commit to dwreeves/astronomer-cosmos that referenced this pull request Aug 4, 2024
…#1109 (astronomer#1132)

Making an update to astronomer#1109, which introduced module-level imports of
optional dependencies. This is inappropriate as it will break if the
user does not have them installed, and indeed the user really does not
need them installed if they are not relying on them directly.

This PR lazy-loads the imports so that it does not impact users who do
not need them.

In the upath library, `az:`, `adl:`, `abfs:` and `abfss:` are also all valid schemes, 
albeit Airflow only references the latter 3 in the code: https://github.com/apache/airflow/blob/e3824eaaba7eada9a807f7a2f9f89d977a210e15/airflow/providers/microsoft/azure/fs/adls.py#L29, so `adl:`, `abfs:` and `abfss:` also have been added
to the list of schemes supported.
tatiana pushed a commit that referenced this pull request Aug 14, 2024
@pankajkoti pankajkoti mentioned this pull request Aug 16, 2024
pankajkoti added a commit that referenced this pull request Aug 20, 2024
New Features

* Add support for loading manifest from cloud stores using Airflow
Object Storage by @pankajkoti in #1109
* Cache ``package-lock.yml`` file by @pankajastro in #1086
* Support persisting the ``LoadMode.VIRTUALENV`` directory by @tatiana
in #1079
* Add support to store and fetch ``dbt ls`` cache in remote stores by
@pankajkoti in #1147
* Add default source nodes rendering by @arojasb3 in #1107
* Add Teradata ``ProfileMapping`` by @sc250072 in #1077

Enhancements

* Add ``DatabricksOauthProfileMapping`` profile by @CorsettiS in #1091
* Use ``dbt ls`` as the default parser when ``profile_config`` is
provided by @pankajastro in #1101
* Add task owner to dbt operators by @wornjs in #1082
* Extend Cosmos custom selector to support + when using paths and tags
by @mvictoria in #1150
* Simplify logging by @dwreeves in #1108

Bug fixes

* Fix Teradata ``ProfileMapping`` target invalid issue by @sc250072 in
#1088
* Fix empty tag in case of custom parser by @pankajastro in #1100
* Fix ``dbt deps`` of ``LoadMode.DBT_LS`` should use
``ProjectConfig.dbt_vars`` by @tatiana in #1114
* Fix import handling by lazy loading hooks introduced in PR #1109 by
@dwreeves in #1132
* Fix Airflow 2.10 regression and add Airflow 2.10 in test matrix by
@pankajastro in #1162

Docs

* Fix typo in azure-container-instance docs by @pankajastro in #1106
* Use Airflow trademark as it has been registered by @pankajastro in
#1105

Others

* Run some example DAGs in Kubernetes execution mode in CI by
@pankajastro in #1127
* Install requirements.txt by default during dev env spin up by
@CorsettiS in #1099
* Remove ``DbtGraph.current_version`` dead code by @tatiana in #1111
* Disable test for Airflow-2.5 and Python-3.11 combination in CI by
@pankajastro in #1124
* Pre-commit hook updates in #1074, #1113, #1125, #1144, #1154, #1167

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Co-authored-by: Pankaj Singh <98807258+pankajastro@users.noreply.github.com>
@tatiana tatiana added this to the Cosmos 1.6.0 milestone Sep 25, 2024
pankajastro pushed a commit that referenced this pull request Jun 2, 2026
Cosmos exposes execution-mode operators (Kubernetes, AWS EKS/ECS, GCP
Cloud Run Job, Azure Container Instance, Docker) and a wide set of
profile mappings that depend on optional Airflow provider packages and
cloud SDKs. Importing these at the module level turns each provider into
a hard runtime dependency for every Cosmos user, even those who never
use that feature - see PR #1109 / #1132 for the prior history of this
concern.

As of now, `cosmos/__init__.py` already defers the heavy operator
modules via `_LAZY_IMPORTS` + `__getattr__`, but nothing previously
prevented a future PR from adding a fresh eager provider import in a
non-leaf module and quietly broadening the dependency surface.

This PR enables ruff's `TID253` (`banned-module-level-imports`) with the
provider/SDK packages we ship with optional-extra installs, and adds
explicit `per-file-ignores` for the leaf opt-in modules that
legitimately need those imports at the module level.

**Banned at module level**

- Airflow provider packages: `amazon`, `cncf.kubernetes`, `databricks`,
`docker`, `google`, `microsoft.azure`, `snowflake`.
- Cloud/orchestration SDKs: `azure`, `boto3`, `botocore`,
`databricks.sql`, `docker`, `google.cloud`, `google.auth`, `kubernetes`,
`snowflake.connector`.
- Cloud filesystems: `adlfs`, `gcsfs`, `s3fs`.
- Other heavy optional deps: `mlflow`, `sentry_sdk`.

**Where they're allowed (per-file-ignores)**

The leaf opt-in modules accessed lazily through `cosmos/__init__.py`:

- `cosmos/operators/kubernetes.py`
- `cosmos/operators/aws_eks.py`
- `cosmos/operators/aws_ecs.py`
- `cosmos/operators/azure_container_instance.py`
- `cosmos/operators/gcp_cloud_run_job.py`
- `cosmos/operators/docker.py`
- `cosmos/operators/watcher_kubernetes.py`
- `cosmos/airflow/_override.py`

Plus `tests/**`, `dev/**`, and `scripts/**` (not shipped to users).

**What's NOT affected**

- `if TYPE_CHECKING:` blocks — not module-level execution.
- `try: ... except ImportError:` compat shims (e.g. the Airflow 2/3
`BaseHook` split in `cosmos/profiles/base.py`) - different code shape,
not matched by TID253.
- Function/method-level imports — already deferred.
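The distinction drawn above can be approximated with a short AST walk. This is only an illustrative sketch of the idea, not ruff's implementation of `TID253`: it inspects only statements that are direct children of the module body, so imports nested inside functions, `if TYPE_CHECKING:` blocks, or `try:` blocks are never visited. The `BANNED` tuple and function names are hypothetical.

```python
import ast
from typing import List, Tuple

# Hypothetical subset of the banned packages listed above.
BANNED = ("kubernetes", "boto3", "google.cloud")


def _is_banned(module: str) -> bool:
    # Match the package itself or any of its submodules.
    return any(module == b or module.startswith(b + ".") for b in BANNED)


def module_level_banned_imports(source: str) -> List[Tuple[int, str]]:
    """Return (line, module) pairs for banned imports at the top level of `source`."""
    found: List[Tuple[int, str]] = []
    for node in ast.parse(source).body:  # direct children of the module only
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue  # If/Try/FunctionDef bodies are deliberately not descended into
        found.extend((node.lineno, name) for name in names if _is_banned(name))
    return found


SAMPLE = (
    "import os\n"
    "import kubernetes\n"
    "from typing import TYPE_CHECKING\n"
    "if TYPE_CHECKING:\n"
    "    import boto3\n"
    "def run():\n"
    "    from google.cloud import storage\n"
)
```

Calling `module_level_banned_imports(SAMPLE)` reports only the top-level `import kubernetes` on line 2; the two nested imports are skipped.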

## Verification

- `pre-commit run ruff-check --all-files` passes on `main` with the new
config.
- A deliberate violation injected into `cosmos/operators/local.py`
(`import kubernetes` at module top) produced a clear error:

  ```
  TID253 `kubernetes` is banned at the module level
   --> cosmos/operators/local.py:3:8
  ```

Follow-up from the audit prompted by PR #1109 / PR #1132.

🤖 Generated with Claude Code (https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Labels

area:dependencies Related to dependencies, like Python packages, library versions, etc lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files.


3 participants