Asset creation from an ObjectStoragePath doesn't work if a Connection is in use #51877

@baylisscg

Apache Airflow version

3.0.2

If "Other Airflow 2 version" selected, which one?

No response

What happened?

When creating an Asset with an ObjectStoragePath as the uri parameter, if the path uses a connection, the _sanitize_uri method used to validate the URI warns with:

"An Asset URI should not contain auth info (e.g. username or password). It has been automatically dropped."

This seems to be caused by using a conn_id: the connection ID ends up in the userinfo element of the path's URI, and in _sanitize_uri any Asset uri with a userinfo element gets flagged.
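To illustrate the behaviour described above, here is a minimal sketch using only the standard library. It is not Airflow's actual _sanitize_uri code; `drop_userinfo` is a hypothetical helper that mimics the "warn and drop the userinfo" outcome reported in this issue.

```python
import warnings
from urllib.parse import urlsplit, urlunsplit


def drop_userinfo(uri: str) -> str:
    """Hypothetical helper (not Airflow code): strip userinfo from a URI and warn."""
    parts = urlsplit(uri)
    # netloc has the form "userinfo@host[:port]"; text before the last "@" is userinfo.
    userinfo, sep, hostport = parts.netloc.rpartition("@")
    if not sep:
        return uri  # no userinfo present, nothing to change
    warnings.warn("userinfo dropped from URI", UserWarning, stacklevel=2)
    return urlunsplit(parts._replace(netloc=hostport))


# The URI produced for a path with conn_id="test" carries "test" as userinfo.
print(urlsplit("s3://test@example/").username)
print(drop_userinfo("s3://test@example/"))
```

The point of the sketch is that the URI is still returned after the warning, only without the connection ID, which the current warning text does not make obvious.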

What you think should happen instead?

At minimum, the message should make clear that only the userinfo was removed from the Asset uri and that the Asset was still created. As it stands, the warning is ambiguous about what has been "dropped", and the Asset itself is the most likely reading.

Mangling the user's input, in some cases silently, is just a bad idea. If it's accepted with a warning the implication is it was stored as-is. If userinfo just won't be handled it should be rejected as an error.

Assets should really just accept any valid URI, and definitely those generated by other parts of Airflow. As it stands, I can't store a URI with an Asset and recover it, because s3://conn_1@bucket/data and s3://conn_2@bucket/data aren't necessarily the same.
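A small sketch of why this matters, again with the standard library only. `without_userinfo` is a hypothetical stand-in for the stripping step, not Airflow's implementation; the two URIs are the ones from the paragraph above.

```python
from urllib.parse import urlsplit, urlunsplit


def without_userinfo(uri: str) -> str:
    """Hypothetical stand-in for the stripping step (not Airflow code)."""
    parts = urlsplit(uri)
    # Keep only the host[:port] part of the netloc.
    hostport = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=hostport))


uris = ["s3://conn_1@bucket/data", "s3://conn_2@bucket/data"]

# Before stripping, each URI still identifies its connection.
conn_ids = [urlsplit(u).username for u in uris]

# After stripping, both URIs collapse to the same string.
sanitized = {without_userinfo(u) for u in uris}

print(conn_ids)
print(sanitized)
```

Once the userinfo is gone, the two entries are indistinguishable, so the original URI cannot be recovered from the Asset.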

How to reproduce

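from airflow.sdk import Asset, ObjectStoragePath

def test_objectstoragepath_asset():
    path = ObjectStoragePath("s3://example/", conn_id="test")
    asset = Asset(uri=path)
    # Fails: the conn_id userinfo has been stripped from the Asset uri.
    assert asset.uri == path.as_uri()

's3://example/' != 's3://test@example/'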

Expected :'s3://test@example/'
Actual   :'s3://example/'

Operating System

N/A

Versions of Apache Airflow Providers

No response

Deployment

Docker-Compose

Deployment details

On an OpenStack internal cloud using third-party S3 store.

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
