Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing Source Table for MERGE statement when UNION involved in source subquery #406

Closed
entilo opened this issue Jul 6, 2023 · 8 comments · Fixed by #417
Closed

Missing Source Table for MERGE statement when UNION involved in source subquery #406

entilo opened this issue Jul 6, 2023 · 8 comments · Fixed by #417
Labels
bug Something isn't working

Comments

@entilo
Copy link

entilo commented Jul 6, 2023

version: sqllineage 1.4.2

when i use LineageRunner to parse sql below, it turned out error like:"return name.strip("`").strip('"').strip("'") AttributeError: 'NoneType' object has no attribute 'strip'":

create or replace view scn.vwn as
SELECT 1 as file_type
FROM scn.tbn

LineageRunner parse error

@entilo
Copy link
Author

entilo commented Jul 6, 2023

error sql:
create or replace view scn.vwn as
SELECT 1 as file_type
FROM scn.tbn

correct sql:
create or replace view scn.vwn as
SELECT 1 file_type
FROM scn.tbn

there is something wrong with the "as", i dont konw why

@entilo entilo changed the title LineageRunner variety AttributeError LineageRunner parse AttributeError Jul 6, 2023
@reata
Copy link
Owner

reata commented Jul 6, 2023

This happens only with default non-validating dialect, where sqlparse defines FILE_TYPE as a builtin data type, thus not allowed as alias name.

Can you try switch to ansi dialect? In the long run, we will deprecate non-validating support.

@reata reata added the question Further information is requested label Jul 6, 2023
@entilo
Copy link
Author

entilo commented Jul 6, 2023 via email

@entilo
Copy link
Author

entilo commented Jul 7, 2023

This happens only with default non-validating dialect, where sqlparse defines FILE_TYPE as a builtin data type, thus not allowed as alias name.

Can you try switch to ansi dialect? In the long run, we will deprecate non-validating support.

Yes , LineageRunner(sql, dialect="ansi") solved this problem. thank you.

But there are another question : "LineageRunner(sql)" is ok to work with sql like below ,but "LineageRunner(sql, dialect="ansi")" not,

merge into tb_report_risk_day t using (
select * from (
select * from taba a
union all
select * from tabb b
)tb
inner join ext_tab et on tb.aaa = et.aaa)
s on (t.cpdm = s.vc_cpdm) when matched then update set t.upd = s.upd;

its seems another error when using dialect="ansi" to do with "union all" :
if segment.type == "bracketed" and is_union(segment):
AttributeError: 'NoneType' object has no attribute 'type'

@reata
Copy link
Owner

reata commented Jul 12, 2023

Can you try upgrading sqllineage to v1.4.5? I believe this is already fixed.

@entilo
Copy link
Author

entilo commented Jul 14, 2023

Can you try upgrading sqllineage to v1.4.5? I believe this is already fixed.
sqllineage 1.4.5 用下面sql进行解析有问题:
merge into tb_report_risk_day t using (
select * from (
select * from taba a
union all
select * from tabb b
)tb
inner join ext_tab et on tb.aaa = et.aaa)
s on (t.cpdm = s.vc_cpdm) when matched then update set t.upd = s.upd;

LineageRunner(sql=veryfy_sql)
源表为:[Table: .ext_tab, Table: .taba, Table: .tabb]
目标表为: .tb_report_risk_day

LineageRunner(sql=veryfy_sql, dialect="ansi")
源表为:[]
目标表为: .tb_report_risk_day

LineageRunner方言模式没有报错,但是解析错误 ==

@reata
Copy link
Owner

reata commented Jul 20, 2023

I can this is a bug that we need to fix. This is related to UNION ALL in source subquery.

@reata reata changed the title LineageRunner parse AttributeError Missing Source Table for MERGE statement when UNION involved in source subquery Jul 20, 2023
@reata reata added bug Something isn't working and removed question Further information is requested labels Jul 20, 2023
@reata reata changed the title Missing Source Table for MERGE statement when UNION involved in source subquery Missing Source Table for UNION in subquery JOIN with other table Jul 24, 2023
@reata
Copy link
Owner

reata commented Jul 24, 2023

I did some research and this test case can be simplified to

SELECT *
FROM (SELECT *
      FROM foo
      UNION ALL
      SELECT *
      FROM bar) s1
         CROSS JOIN baz

where ANSI dialect still has problem identifying all source table correctly, it only recognizes baz.

update: it's still only triggered with merge

@reata reata changed the title Missing Source Table for UNION in subquery JOIN with other table Missing Source Table for MERGE statement when UNION involved in source subquery Jul 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants