Deprecate delimiter param and source object's wildcards in GCS, introduce match_glob param.
#31261
closes: #29115
Important notes
- Deprecating the `delimiter` parameter and wildcards in source objects is essential. These features are not native to the GCS API; they were implemented as a workaround that heavily misuses the `delimiter` parameter. The workaround likely exists because the new parameter, `match_glob`, was not available when these features were first implemented. Using `match_glob` instead of `delimiter` resolves the original issue.
- The `match_glob` param is not yet supported by the official GCS Python API. To work around this, I copied `list_blobs()` from their source code into the hook and patched it. I'm unsure whether this is acceptable in terms of licensing and maintainability on our side. If it is fine, let me know whether I need to add any licensing comments; otherwise, we'll have to wait for the official release (according to GCP's comment on my issue, it should be around Q3).
- In some operators (`GCSToGCSOperator`, `GCSToSFTPOperator`, `GCSToGoogleDriveOperator`), there is internal logic that handles wildcards in the source object(s) and calls the `list()` method with the `delimiter` param. To avoid any chance of breaking existing behavior, I intentionally did not use the `match_glob` param there. I added comments describing what should change when the deprecation is finalized.
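To illustrate the difference described in the notes above, here is a minimal, self-contained sketch. It does not call GCS or the hook. It uses a hypothetical in-memory list of object names and two hypothetical helpers: `list_with_wildcard` approximates the legacy behavior (split the source object on the first `*`, then filter by prefix and suffix), and `list_with_glob` approximates glob matching under the assumption that `*` does not cross `/` while `**` does. The real `match_glob` semantics are defined by GCS and may differ in details.

```python
import re

# Hypothetical stand-in for the object names in a bucket (not real data).
OBJECT_NAMES = [
    "data/2023/a.csv",
    "data/2023/b.csv",
    "data/2023/notes.txt",
    "data/2023/raw/c.csv",
]


def list_with_wildcard(names, source_object):
    """Approximate the legacy workaround: split on the first '*' and
    keep names that start with the prefix and end with the suffix."""
    prefix, suffix = source_object.split("*", 1)
    return [n for n in names if n.startswith(prefix) and n.endswith(suffix)]


def glob_to_regex(pattern):
    """Simplified glob translation (an assumption, not the GCS implementation):
    '**' matches across '/', '*' and '?' do not."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def list_with_glob(names, match_glob):
    """Keep names that fully match the simplified glob."""
    regex = glob_to_regex(match_glob)
    return [n for n in names if regex.fullmatch(n)]


if __name__ == "__main__":
    print(list_with_wildcard(OBJECT_NAMES, "data/2023/*.csv"))
    print(list_with_glob(OBJECT_NAMES, "data/2023/*.csv"))
    print(list_with_glob(OBJECT_NAMES, "data/2023/**.csv"))
```

In this sketch the prefix/suffix approach also picks up the nested `raw/c.csv` object, while the single-`*` glob stays within one path level. A glob lets the caller state that distinction explicitly.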