Metadata Provider to Assist Column Lineage Analysis #477

Closed
reata opened this issue Oct 25, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@reata
Owner

reata commented Oct 25, 2023

Quoting sqllineage docs:

Column-level lineage will not be 100% accurate because that would require metadata information. However, there’s no unified metadata service for all kinds of SQL systems. For the moment, in column-level lineage, column-to-table resolution is conducted in a best-effort way, meaning we only provide possible table candidates for situation like select * or select col from tab1 join tab2.

Proposed Solution:
Build a metadata provider interface that returns all the columns for a given table name. The implementation can vary, from a naive provider where the user stores the metadata in a dictionary, to more complex ones that query a metadata service (for example, querying the Hive metastore via the Thrift API, executing show tables SQL, or querying information_schema).

This way, users can register their metadata for sqllineage to use during lineage analysis.
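
To make the proposal concrete, here is a minimal sketch of what such an interface and the naive dictionary-backed provider might look like. All names here (`ColumnMetadataProvider`, `DictColumnMetadataProvider`, `get_table_columns`, `resolve_column`) are hypothetical and chosen for illustration only; they are not sqllineage's actual API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ColumnMetadataProvider(ABC):
    """Hypothetical interface: given a table name, return its column names."""

    @abstractmethod
    def get_table_columns(self, table: str) -> List[str]:
        ...


class DictColumnMetadataProvider(ColumnMetadataProvider):
    """Naive provider: the user supplies the metadata as a dictionary."""

    def __init__(self, metadata: Dict[str, List[str]]):
        # Store table names lower-cased so lookups are case-insensitive.
        self._metadata = {name.lower(): list(cols) for name, cols in metadata.items()}

    def get_table_columns(self, table: str) -> List[str]:
        # Unknown tables yield an empty list rather than raising.
        return self._metadata.get(table.lower(), [])


def resolve_column(
    column: str, candidates: List[str], provider: ColumnMetadataProvider
) -> List[str]:
    """Narrow the candidate tables to those known to contain `column`.

    If no candidate is known to contain the column, fall back to the
    best-effort behaviour and return every candidate.
    """
    matches = [t for t in candidates if column in provider.get_table_columns(t)]
    return matches or list(candidates)


provider = DictColumnMetadataProvider({"tab1": ["id", "col"], "tab2": ["id", "name"]})

# For `select col from tab1 join tab2`, metadata narrows the source to tab1.
print(resolve_column("col", ["tab1", "tab2"], provider))
```

With this in place, a column that exists in only one of the joined tables resolves to that table, while a column present in both (or in neither) still yields every candidate, which matches the current best-effort behaviour.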

We'll start with the naive solution, which walks us through the most common part. Ultimately, we'll try to provide common implementations like HiveMetaStoreMetaDataProvider and SQLAlchemyMetaDataProvider, so users just need to feed in things like a database URL to enjoy accurate column lineage with metadata assistance.
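
As a rough illustration of the "feed in a database and get columns back" idea, the sketch below reads column names from a live database catalog. It uses the standard-library `sqlite3` module as a stand-in for a real metadata service; the class name `SqliteColumnMetadataProvider` and its method are hypothetical, and this is not how HiveMetaStoreMetaDataProvider or SQLAlchemyMetaDataProvider would necessarily be implemented.

```python
import sqlite3
from typing import List


class SqliteColumnMetadataProvider:
    """Hypothetical provider that reads column names from a SQLite catalog."""

    def __init__(self, connection: sqlite3.Connection):
        self._conn = connection

    def get_table_columns(self, table: str) -> List[str]:
        # Quote the identifier so unusual table names are handled safely.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        rows = self._conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        return [row[1] for row in rows]


# Usage: an in-memory database stands in for the user's real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (id INTEGER, col TEXT)")
conn.execute("CREATE TABLE tab2 (id INTEGER, name TEXT)")

sqlite_provider = SqliteColumnMetadataProvider(conn)
print(sqlite_provider.get_table_columns("tab1"))
```

The point of the design is that the caller only hands over a connection (or, in a fuller implementation, a URL), and the provider takes care of querying the catalog.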

This will be the major feature for the v1.5.x release.

@reata
Owner Author

reata commented Dec 28, 2023

Feature List:
