qualified wildcard recognized as wrong column name #423
Comments
Thanks for reporting this. I can confirm this is a bug we should fix.
I found that the problem is with the second statement:
FROM #TEMP_CUSTOMERS CUST AS CUST
@crossxwill Can you confirm whether this is valid syntax, as opposed to
FROM #TEMP_CUSTOMERS AS CUST
In this case we should actually throw an exception, because there are parsing errors. If this later proves to be valid syntax, we can fix the parser.
You found a typo in my example. I fixed it. The bug still remains.
Now that #429 is merged, we raise InvalidSyntaxException for the previously buggy SQL:
SELECT CUST.*
, SALES.CUM_SALES
INTO #TEMP_FINAL_RESULTS
FROM #TEMP_CUSTOMERS CUST AS CUST
LEFT JOIN T_SALES AS SALES
ON CUST.CUST_ID = SALES.CUST_ID
WHERE SALES.CUST_ID IS NULL;
Back to this issue: the problem is still limited to the second statement. It seems we have a problem handling aliases in SELECT INTO statements. Using the code in the master branch, this is the output:
Whereas the expected output should be:
The problem is not with SELECT INTO; rather, every qualified wildcard suffers from the same issue:
INSERT INTO tab1
SELECT tab2.*
FROM tab2 a
INNER JOIN tab3 b
ON a.id = b.id
The correct output should be:
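The alias handling discussed in this comment can be illustrated with a small sketch. This is not sqllineage's actual implementation: `resolve_qualifier`, `wildcard_lineage`, and the `ALIASES` map are hypothetical names, used only to show that the qualifier in a wildcard such as `tab2.*` may be either a table name or an alias and has to be mapped back to the real table in both cases.

```python
from typing import Dict


def resolve_qualifier(qualifier: str, alias_map: Dict[str, str]) -> str:
    """Map a column qualifier (alias or table name) to the real table name.

    Hypothetical helper for illustration; not part of sqllineage.
    """
    key = qualifier.lower()
    # An alias wins if one is registered under this name.
    if key in alias_map:
        return alias_map[key]
    # Otherwise the qualifier may already be a table named in FROM/JOIN.
    if key in alias_map.values():
        return key
    raise KeyError(f"unknown qualifier: {qualifier}")


# Alias map for: FROM tab2 a INNER JOIN tab3 b
ALIASES = {"a": "tab2", "b": "tab3"}


def wildcard_lineage(target: str, qualifier: str) -> str:
    """Render lineage for SELECT <qualifier>.* as `target <- source`."""
    source = resolve_qualifier(qualifier, ALIASES)
    return f"<default>.{target}.* <- <default>.{source}.*"


if __name__ == "__main__":
    # Both spellings of the qualifier should point at the same table.
    print(wildcard_lineage("tab1", "tab2"))
    print(wildcard_lineage("tab1", "a"))
```

The point of the sketch is that lineage should name the underlying table regardless of how the wildcard is qualified; how the real parser builds its alias map is outside what this thread shows.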
The fixed result would be:
$ sqllineage -f test.sql -l column --dialect=tsql
<default>.#temp_customers.cust_city <- <default>.t_customers.cust_city
<default>.#temp_customers.cust_country <- <default>.t_customers.cust_country
<default>.#temp_customers.cust_id <- <default>.t_customers.cust_id
<default>.#temp_customers.cust_name <- <default>.t_customers.cust_name
<default>.#temp_final_results.* <- <default>.#temp_customers.*
<default>.#temp_final_results.cum_sales <- <default>.t_sales.cum_sales
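For checking output like this in a script, one option is to split each `target <- source` line into a pair. The helper below is a minimal sketch that works on the printed text only; `parse_lineage` and `SAMPLE` are hypothetical names, and nothing here calls into sqllineage itself.

```python
from typing import List, Tuple


def parse_lineage(text: str) -> List[Tuple[str, str]]:
    """Split `target <- source` lines into (target, source) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if "<-" not in line:
            continue  # skip the command line, blanks, and anything else
        target, source = (part.strip() for part in line.split("<-", 1))
        pairs.append((target, source))
    return pairs


# Two lines copied from the output shown in this comment.
SAMPLE = """\
$ sqllineage -f test.sql -l column --dialect=tsql
<default>.#temp_final_results.* <- <default>.#temp_customers.*
<default>.#temp_final_results.cum_sales <- <default>.t_sales.cum_sales
"""

if __name__ == "__main__":
    for target, source in parse_lineage(SAMPLE):
        print(f"{source} feeds {target}")
```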
The following SQL query uses "sales" as an alias for "T_SALES". How can I ask LineageRunner() to return the table/view name rather than the alias?
Expected output:
<default>.t_customers.cust_city
<default>.#temp_customers.cust_city
<default>.t_customers.cust_country
<default>.#temp_customers.cust_country
<default>.t_customers.cust_id
<default>.#temp_customers.cust_id
<default>.t_customers.cust_name
<default>.#temp_customers.cust_name
<default>.t_sales.cum_sales
<default>.#temp_final_results.cum_sales
<default>.#temp_customers.cust.*
<default>.#temp_final_results.cust.*
Actual output:
<default>.t_customers.cust_city
<default>.#temp_customers.cust_city
<default>.t_customers.cust_country
<default>.#temp_customers.cust_country
<default>.t_customers.cust_id
<default>.#temp_customers.cust_id
<default>.t_customers.cust_name
<default>.#temp_customers.cust_name
<default>.sales.cum_sales
<default>.#temp_final_results.cum_sales
<default>.#temp_customers.cust.*
<default>.#temp_final_results.cust.*
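To make the difference between the two listings explicit, the sketch below compares them position by position. The lists are copied from the expected and actual output in this issue, with the `t_sales.cum_sales` entry read as a single line; `diff_lines` is a hypothetical helper, not a sqllineage function.

```python
from typing import List, Tuple

EXPECTED = [
    "<default>.t_customers.cust_city",
    "<default>.#temp_customers.cust_city",
    "<default>.t_customers.cust_country",
    "<default>.#temp_customers.cust_country",
    "<default>.t_customers.cust_id",
    "<default>.#temp_customers.cust_id",
    "<default>.t_customers.cust_name",
    "<default>.#temp_customers.cust_name",
    "<default>.t_sales.cum_sales",
    "<default>.#temp_final_results.cum_sales",
    "<default>.#temp_customers.cust.*",
    "<default>.#temp_final_results.cust.*",
]

# The actual output is identical except for the cum_sales source column.
ACTUAL = list(EXPECTED)
ACTUAL[8] = "<default>.sales.cum_sales"


def diff_lines(expected: List[str], actual: List[str]) -> List[Tuple[str, str]]:
    """Return (expected, actual) pairs for positions where the lines differ."""
    return [(e, a) for e, a in zip(expected, actual) if e != a]


if __name__ == "__main__":
    for want, got in diff_lines(EXPECTED, ACTUAL):
        print(f"expected {want}, got {got}")
```

The only mismatch is the source of `cum_sales`, which is reported under the alias `sales` instead of the table `t_sales`.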