Description
Apache Airflow version
3.0.2
If "Other Airflow 2 version" selected, which one?
No response
What happened?
When creating an Asset with an ObjectStoragePath as the uri parameter, if the path uses a connection, the _sanitize_uri method used to validate the URI flags it with:
"An Asset URI should not contain auth info (e.g. username or password). It has been automatically dropped."
This seems to be due to using a connection ID: the conn_id ends up in the userinfo position of the URI, and in _sanitize_uri any Asset URI with a userinfo element gets flagged.
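To illustrate the behaviour described above, here is a minimal standard-library sketch of what stripping userinfo from a URI looks like. It is not Airflow's actual _sanitize_uri code; the function name strip_userinfo and the warning text are hypothetical and only mirror the observed result.

```python
import warnings
from urllib.parse import urlsplit, urlunsplit


def strip_userinfo(uri: str) -> str:
    """Hypothetical stand-in for the observed behaviour, not Airflow's code."""
    parts = urlsplit(uri)
    if "@" not in parts.netloc:
        # No userinfo component, nothing to change.
        return uri
    # Everything before the last "@" in the authority is userinfo (RFC 3986).
    _, _, host = parts.netloc.rpartition("@")
    warnings.warn(
        "userinfo was removed from the URI; the rest was kept",
        UserWarning,
        stacklevel=2,
    )
    return urlunsplit(parts._replace(netloc=host))


original = "s3://test@example/"
sanitized = strip_userinfo(original)
```

With the URI from this report, the conn_id "test" sits where a username would, so it is treated as auth info and removed.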
What you think should happen instead?
At minimum, the message should make clear that only the userinfo was removed from the Asset URI and that the Asset was still created. As it stands, the warning is ambiguous about what has been "dropped", and the Asset itself is the most likely reading.
Mangling the user's input, in some cases silently, is just a bad idea. If the input is accepted with a warning, the implication is that it was stored as-is. If userinfo simply won't be handled, it should be rejected with an error.
Assets should really just accept any valid URI, and definitely those generated by other parts of Airflow. As it stands, I can't store a URI with an Asset and recover it, because s3://conn_1@bucket/data and s3://conn_2@bucket/data aren't necessarily the same.
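The collision can be shown with a small standard-library sketch. The helper drop_userinfo is hypothetical and only imitates the stripping observed in this report; it is not Airflow's implementation.

```python
from urllib.parse import urlsplit, urlunsplit


def drop_userinfo(uri: str) -> str:
    """Hypothetical helper that imitates the stripping seen in this report."""
    parts = urlsplit(uri)
    host = parts.netloc.rpartition("@")[2]
    return urlunsplit(parts._replace(netloc=host))


uris = ["s3://conn_1@bucket/data", "s3://conn_2@bucket/data"]

# The connection ID is recoverable from the original URIs...
conn_ids = [urlsplit(u).username for u in uris]

# ...but both URIs collapse to the same value once userinfo is dropped.
collapsed = {drop_userinfo(u) for u in uris}
```

Two paths that may point at different object stores end up with one identical Asset URI, and the connection ID cannot be recovered from it.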
How to reproduce
from airflow.sdk import Asset, ObjectStoragePath

def test_objectstoragepath_asset():
    path = ObjectStoragePath("s3://example/", conn_id="test")
    asset = Asset(uri=path)
    # Fails: the conn_id ("test") is missing from the stored Asset URI.
    assert asset.uri == path.as_uri()

's3://example/' != 's3://test@example/'
Expected :'s3://test@example/'
Actual :'s3://example/'
Operating System
N/A
Versions of Apache Airflow Providers
No response
Deployment
Docker-Compose
Deployment details
On an OpenStack internal cloud using third-party S3 store.
Anything else?
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct