Skip to content

Spark: IN clause on system function is not pushed down#9192

Closed
tmnd1991 wants to merge 1 commit intoapache:mainfrom
tmnd1991:feature/9191
Closed

Spark: IN clause on system function is not pushed down#9192
tmnd1991 wants to merge 1 commit intoapache:mainfrom
tmnd1991:feature/9191

Conversation

@tmnd1991
Copy link

@tmnd1991 tmnd1991 commented Dec 1, 2023

PR to verify bug reported in issue #9191.

With some guidance I'm open to work on the fix too.

@github-actions github-actions bot added the spark label Dec 1, 2023
@tmnd1991 tmnd1991 marked this pull request as draft December 6, 2023 12:02
@tmnd1991
Copy link
Author

tmnd1991 commented Dec 7, 2023

Can I have a feedback on this one @ConeyLiu, too? Thanks a lot

@ConeyLiu
Copy link
Contributor

ConeyLiu commented Dec 7, 2023

@tmnd1991 thanks for your contribution. I think this is a valid fix.

@tmnd1991
Copy link
Author

tmnd1991 commented Dec 7, 2023

cc @nastra @dramaticlly @advancedxy for review
thanks

Copy link
Contributor

@advancedxy advancedxy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two minor comments. Overall, it's in good shape.

@tmnd1991
Copy link
Author

tmnd1991 commented Dec 8, 2023

Thanks for the review, I addressed your concerns. If I get green light I proceed to copy-paste over 3.4

Copy link
Contributor

@wypoon wypoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix seems to make sense.
Just one comment about a test and some nits (also, please note that the Iceberg project doesn't use wildcard imports, so you should put back all the explicit imports plus additional ones needed).

if canReplace(systemFunction) && values.forall(_.foldable) =>
in.copy(value = replaceStaticInvoke(systemFunction))

case in@InSet(systemFunction: StaticInvoke, _) if canReplace(systemFunction) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: in @ InSet to be consistent with the style.

@tmnd1991
Copy link
Author

Thanks @wypoon for the review, I addressed your concerns

Copy link
Contributor

@wypoon wypoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise.

filter.copy(condition = newCondition)
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary (and undesirable) whitespace change.

}

private void testBucketLongFunctionIsNotReplacedWhenArgumentsAreNotLiterals() {
List<Integer> range = IntStream.range(0, 3).boxed().collect(Collectors.toList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

range is not used. (I pointed this out before.)

filter.copy(condition = newCondition)
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary whitespace change.

}

private void testBucketLongFunctionIsNotReplacedWhenArgumentsAreNotLiterals() {
List<Integer> range = IntStream.range(0, 3).boxed().collect(Collectors.toList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used.

@wypoon
Copy link
Contributor

wypoon commented Dec 20, 2023

@rdblue @szehon-ho @RussellSpitzer can you please review this? Also, can you please enable the CI for this PR?

@tmnd1991 tmnd1991 marked this pull request as ready for review January 23, 2024 09:10
@tmnd1991
Copy link
Author

any chance to have this reviewed?

@nastra
Copy link
Contributor

nastra commented Jan 31, 2024

@aokolnychyi could you take a look at this please?

@nastra
Copy link
Contributor

nastra commented Mar 7, 2024

just FYI, looks like there's a related PR that @aokolnychyi opened for this: #9873

@tmnd1991
Copy link
Author

tmnd1991 commented Mar 7, 2024

just FYI, looks like there's a related PR that @aokolnychyi opened for this: #9873

thanks, watching that 🙏

reformat

Add impl and fix test

Work also in the IN_SET case

Add missing `canReplace` check in `In` case
Refactor tests

Apply suggestions

Revert

Add tests when replace should not happen

Fix code style issues

Fix code style issues

Copy over 3.4

Fixed unused variables and whitespace changes
@tmnd1991
Copy link
Author

@aokolnychyi I see you fixed part of this in #9873 but Spark 3.4 looks stil bugged on main. Do you have any time to give me a feedback on this?

@github-actions
Copy link

github-actions bot commented Oct 6, 2024

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Oct 6, 2024
@github-actions
Copy link

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Oct 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants