Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Lateral Column Alias Reference Analyzing #507

Closed
maoxingda opened this issue Dec 19, 2023 · 2 comments · Fixed by #521
Closed

Support Lateral Column Alias Reference Analyzing #507

maoxingda opened this issue Dec 19, 2023 · 2 comments · Fixed by #521
Labels
enhancement New feature or request

Comments

@maoxingda
Copy link
Contributor

maoxingda commented Dec 19, 2023

SQL

insert into public.tgt_tbl1
(
    id
)
select
    sq.id
from
    (
        select
            name      as user_name,
            user_name as id -- backward reference
        from
            public.src_tbl1
    ) sq
;

To Reproduce
Note here we refer to SQL provided in prior step as stored in a file named test.sql

from sqllineage.runner import LineageRunner
with open("test.sql") as f:
    sql = f.read()
lr = LineageRunner(sql, dialect='redshift')
lr.print_column_lineage()
public.tgt_tbl1.id <- sq.id <- public.src_tbl1.user_name

Expected behavior

public.tgt_tbl1.id <- sq.id <- public.src_tbl1.name

Python version (available via python --version)

  • 3.11.5

SQLLineage version (available via sqllineage --version):

  • 1.4.9
@maoxingda maoxingda added the bug Something isn't working label Dec 19, 2023
@maoxingda maoxingda changed the title When there is a forward reference in the select list, the column lineage is inaccurate. When there is a backward reference in the select list, the column lineage is inaccurate. Dec 21, 2023
@reata reata changed the title When there is a backward reference in the select list, the column lineage is inaccurate. Support Lateral Column Alias Reference Analyzing Dec 23, 2023
@reata
Copy link
Owner

reata commented Dec 23, 2023

This feature is officially called lateral column alias reference, see https://aws.amazon.com/about-aws/whats-new/2018/08/amazon-redshift-announces-support-for-lateral-column-alias-reference/

It's not universally supported by every SQL dialects. We will consider adding support but it won't be in high priority.

@reata reata added enhancement New feature or request and removed bug Something isn't working labels Dec 23, 2023
@reata
Copy link
Owner

reata commented Dec 26, 2023

Tried a few open source SQL database/data warehouse, only sparksql support this feature at the end of 2023.

dialect version support lateral column alias reference
mysql 8.2.0 no
postgres 16.1 no
hive 3.1.3 no
sparksql 3.5.0 yes
trino 435 no

SparkSQL support was added since 3.4.0 via SPARK-27561, released April 13, 2023.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants