from sqllineage.runner import LineageRunner

v_sql = """
INSERT OVERWRITE TABLE target
SELECT NVL(PROV_CODE,'999') aaa, COUNT(DISTINCT MSISDN) bbb
FROM (
    SELECT NVL(PROV_CODE,'000') PROV_CODE, A.MSISDN
    FROM (
        SELECT MSISDN
        FROM (
            SELECT MSISDN, BUSI_ID FROM source1 A
            UNION ALL
            SELECT concat(A.MSISDN,'20230826') MSISDN, A.MUSIC_BUSI_CODE FROM source2 A
            union all
            SELECT concat(A.MSISDN,'20230826') MSISDN, BUSI_code FROM source4 A
        ) C
        GROUP BY MSISDN
    ) A
    LEFT JOIN source3 B ON SUBSTRING(A.MSISDN,1,7)=B.MSISDN_NBR_PAR
) T
GROUP BY PROV_CODE GROUPING SETS ((),PROV_CODE);
"""

parse = LineageRunner(sql=v_sql, dialect='sparksql')
parse.print_column_lineage()
Looks like we have an issue when an upper-case alias is used together with UNION. A minimal example with the same issue:
INSERT OVERWRITE TABLE TARGET
SELECT MSISDN, BUSI_ID
FROM SOURCE1
UNION ALL
SELECT CONCAT(A.MSISDN,'20230826') MSISDN, A.MUSIC_BUSI_CODE
FROM SOURCE2 A
Changing the alias A to lower case a generates the correct output.
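Until the upper-case alias case is handled, one possible workaround is to lower-case the SQL before passing it to the parser. The sketch below is only an illustration: `lower_outside_literals` is a hypothetical helper, not part of SQLLineage. It assumes string literals use single quotes with `''` as the escape, and it does not handle comments, backslash escapes, or quoted identifiers. Note that the table and column names in the reported lineage would be lower-cased as well.

```python
import re

# Matches a single-quoted SQL string literal, including '' escapes.
# Assumption: no backslash escapes, comments, or quoted identifiers.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def lower_outside_literals(sql: str) -> str:
    """Lower-case SQL text while leaving single-quoted literals untouched."""
    parts = []
    last = 0
    for match in _STRING_LITERAL.finditer(sql):
        # Lower-case the SQL between the previous literal and this one.
        parts.append(sql[last:match.start()].lower())
        # Keep the literal exactly as written.
        parts.append(match.group(0))
        last = match.end()
    parts.append(sql[last:].lower())
    return "".join(parts)


sql = "SELECT CONCAT(A.MSISDN,'ABC') MSISDN, A.MUSIC_BUSI_CODE FROM SOURCE2 A"
print(lower_outside_literals(sql))
# select concat(a.msisdn,'ABC') msisdn, a.music_busi_code from source2 a
```

The result can then be passed as the `sql` argument in place of the original string.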
The column lineage for aaa is wrong; it should come from source3.
Right now we say we don't know whether aaa comes from subquery a or table source3. But we could be smarter: a is a subquery that contains only one column, named msisdn, which makes table source3 the only possibility. This "smart logic" is not in our code yet.
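The idea behind that "smart logic" could be sketched roughly as follows. This is not SQLLineage code: `resolve_unqualified_column` and the `known_columns` mapping are hypothetical names for illustration. The assumption is that a subquery's output columns are known from its SELECT list, while a physical table without metadata has an unknown schema (`None`) and therefore always remains a candidate.

```python
from typing import Dict, List, Optional, Set


def resolve_unqualified_column(
    column: str,
    known_columns: Dict[str, Optional[Set[str]]],
) -> List[str]:
    """Return the sources in the FROM clause that could own `column`.

    `known_columns` maps each source to its set of output columns,
    or to None when the schema is unknown.
    """
    name = column.lower()
    candidates = []
    for source, columns in known_columns.items():
        # An unknown schema can never be ruled out; a known schema
        # is a candidate only if it actually exposes the column.
        if columns is None or name in {c.lower() for c in columns}:
            candidates.append(source)
    return candidates


# Subquery a exposes only msisdn; source3 has no known schema.
sources = {"a": {"msisdn"}, "source3": None}
print(resolve_unqualified_column("PROV_CODE", sources))  # ['source3']
print(resolve_unqualified_column("MSISDN", sources))     # ['a', 'source3']
```

When exactly one candidate is left, as for PROV_CODE here, the column can be attributed to it. When several are left, the lineage stays ambiguous.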
Describe the bug
For example:
But if the SQL is converted to lower case with lower():
Expected behavior
bbb is wrong.
aaa is wrong; it should come from source3.
Python version (available via python --version):
SQLLineage version (available via sqllineage --version):