Skip to content

Fix join type for correlated subquery#19002

Merged
martint merged 1 commit intotrinodb:masterfrom
Bhargavi-Sagi:bhsagi-correlated-subquery-bug
Oct 18, 2023
Merged

Fix join type for correlated subquery#19002
martint merged 1 commit intotrinodb:masterfrom
Bhargavi-Sagi:bhsagi-correlated-subquery-bug

Conversation

@Bhargavi-Sagi
Copy link
Contributor

@Bhargavi-Sagi Bhargavi-Sagi commented Sep 11, 2023

Description

INNER join is used in subqueryPlanner's planScalarSubquery. But it will result in incorrect results when the subquery returns zero rows.
Example query:

with test_a(id, type, col1) as( values ('123', 'a', 7), ('123', 'a', 8), ('456', 'a', 9)), 
base(claim_id, assess_id) as (values ('AAA', '123'), ('BBB', NULL), ('CCC', '456'), ('DDD', '789'))
select claim_id,
   (select sum(col1) from test_a
   where test_a.type = 'a' and base.assess_id = test_a.id
  group by id 
   ) as cols
from base

Actual results:

 claim_id | cols
----------+------
 AAA      |   15
 CCC      |    9
(2 rows)

Expected results:

 claim_id | cols
----------+------
 AAA      |   15
 BBB      | NULL
 CCC      |    9
 DDD      | NULL
(4 rows)

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( X) Release notes are required, with the following suggested text:

# Section
* Fix incorrect results for query involving an aggregation in a correlated subquery. ({issue}`18236`)

@cla-bot
Copy link

cla-bot bot commented Sep 11, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@zhaner08 zhaner08 requested a review from martint September 12, 2023 15:25
@martint
Copy link
Member

martint commented Sep 12, 2023

Thanks!

Can you add a test? Here's a cleaned up query you can use for the test:

WITH
	t(k, v) AS (VALUES ('A', 1), ('B', NULL), ('C', 2), ('D', 3)),
	u(k, v) AS (VALUES (1, 10), (1, 20), (2, 30))
SELECT
	k,
   	(
   		SELECT max(v)
   		FROM u
   		WHERE t.v = u.k
  		GROUP BY k
   ) AS cols
FROM t

You can add the test to the suite in core/trino-main/src/test/java/io/trino/sql/query

@cla-bot
Copy link

cla-bot bot commented Sep 12, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@Bhargavi-Sagi Bhargavi-Sagi force-pushed the bhsagi-correlated-subquery-bug branch from f133c08 to b81dbc5 Compare September 12, 2023 23:03
@cla-bot
Copy link

cla-bot bot commented Sep 12, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@Bhargavi-Sagi Bhargavi-Sagi force-pushed the bhsagi-correlated-subquery-bug branch from b81dbc5 to a93a6f3 Compare September 13, 2023 19:45
@cla-bot
Copy link

cla-bot bot commented Sep 13, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

correlatedJoinNode.getInput(),
rewrittenSubquery,
correlatedJoinNode.getCorrelation(),
producesSingleRow ? correlatedJoinNode.getType() : LEFT,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What scenario fails due to this?

Copy link
Contributor Author

@Bhargavi-Sagi Bhargavi-Sagi Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a accompanying change(no logic change here). This is needed because in subqueryPlanner correlatedJoinNode type is set to LEFT and now when this line is evaluated correlatedJoinNode.getType() will have LEFT(instead of INNER). Made this change just to preserve the logic of using INNER join when subquery has 1 cardinality and LEFT when it has 0 cardinality.

@martint
Copy link
Member

martint commented Oct 14, 2023

@cla-bot check

@cla-bot
Copy link

cla-bot bot commented Oct 14, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@cla-bot
Copy link

cla-bot bot commented Oct 14, 2023

The cla-bot has been summoned, and re-checked this pull request!

@Bhargavi-Sagi
Copy link
Contributor Author

@cla-bot check

@cla-bot
Copy link

cla-bot bot commented Oct 17, 2023

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@cla-bot
Copy link

cla-bot bot commented Oct 17, 2023

The cla-bot has been summoned, and re-checked this pull request!

@martint
Copy link
Member

martint commented Oct 18, 2023

@cla-bot check

@cla-bot cla-bot bot added the cla-signed label Oct 18, 2023
@cla-bot
Copy link

cla-bot bot commented Oct 18, 2023

The cla-bot has been summoned, and re-checked this pull request!

@martint martint merged commit cdb4f8f into trinodb:master Oct 18, 2023
@github-actions github-actions bot added this to the 430 milestone Oct 18, 2023
subPlan,
root,
scalarSubquery.getQuery(),
CorrelatedJoinNode.Type.INNER,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was actually correct code as EnforceSingleRowNode guarantees exactly single row, so from the "perspective" of CorrelatedJoin INNER is OK.

I've cleaned this up in #19932

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants