@ahkcs
Contributor

@ahkcs ahkcs commented Nov 19, 2025


Description

This PR implements the mvdedup eval function for PPL, enabling users to remove duplicate values from multivalue arrays.


Behavior

Given the input:

source=index | eval result = mvdedup(array(1, 2, 2, 3, 1, 4))

The function returns:

[1, 2, 3, 4]

Key Details

  • Empty arrays return empty arrays

  • Order-preserving: The first appearance of each value is kept; subsequent duplicates are removed.
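The two details above can be sketched in a few lines of Python. This is only an illustrative model of the described semantics, not the PR's implementation; the Python function name `mvdedup` is reused here for readability and is an assumption of this sketch.

```python
def mvdedup(values):
    """Model of the described behavior: drop duplicates, keep first occurrences."""
    # dict preserves insertion order (Python 3.7+), so fromkeys() keeps the
    # first appearance of each value and silently ignores later repeats.
    return list(dict.fromkeys(values))


print(mvdedup([1, 2, 2, 3, 1, 4]))  # [1, 2, 3, 4]
print(mvdedup([]))                  # []
```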


Example

Input                      Output
array(1, 2, 2, 3, 1, 4)    [1, 2, 3, 4]
array()                    []

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Kai Huang <[email protected]>
Collaborator

@dai-chen dai-chen left a comment

Could you check whether this has the same semantics as ARRAY_DISTINCT?

2: jdbc:calcite:model=src/test/resources/mode> SELECT ARRAY_DISTINCT(ARRAY[1,2,1,3,3,4]);
+--------------+
|    EXPR$0    |
+--------------+
| [1, 2, 3, 4] |
+--------------+
1 row selected (0.018 seconds)

2: jdbc:calcite:model=src/test/resources/mode> SELECT ARRAY_DISTINCT(ARRAY[4,1,2,1,3,3,4]);
+--------------+
|    EXPR$0    |
+--------------+
| [4, 1, 2, 3] |
+--------------+
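The second query is the informative one: a first-occurrence dedup and a sorted dedup agree on the first input but differ on the second. The sketch below models both strategies in Python to show that the quoted outputs match first-occurrence order. Both helper names are hypothetical and say nothing about how Calcite implements ARRAY_DISTINCT internally.

```python
def dedup_first_seen(values):
    """Keep each value the first time it appears, in input order."""
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


def dedup_sorted(values):
    """Alternative strategy for comparison: distinct values in sorted order."""
    return sorted(set(values))


first = [1, 2, 1, 3, 3, 4]
second = [4, 1, 2, 1, 3, 3, 4]

# Both strategies agree on the first query's input...
print(dedup_first_seen(first), dedup_sorted(first))    # [1, 2, 3, 4] [1, 2, 3, 4]
# ...but only first-occurrence order reproduces [4, 1, 2, 3] for the second.
print(dedup_first_seen(second), dedup_sorted(second))  # [4, 1, 2, 3] [1, 2, 3, 4]
```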

@dai-chen dai-chen added the enhancement (New feature or request) label Nov 20, 2025
@dai-chen dai-chen added the PPL (Piped processing language) label Nov 20, 2025
@ahkcs
Contributor Author

ahkcs commented Nov 20, 2025

Could you check whether this has the same semantics as ARRAY_DISTINCT?

Hi @dai-chen, thanks for the suggestion! After checking, I think ARRAY_DISTINCT is a suitable basis for the mvdedup eval function. I have removed the UDF and updated the implementation to use ARRAY_DISTINCT instead.

cc @ykmr1224

@ahkcs ahkcs requested review from dai-chen and ykmr1224 November 20, 2025 18:41
Collaborator

@dai-chen dai-chen left a comment

Thanks for the changes!

@dai-chen
Collaborator

Do we need to backport?

@ahkcs
Contributor Author

ahkcs commented Nov 21, 2025

Do we need to backport?

Yes

@dai-chen dai-chen merged commit 5049a03 into opensearch-project:main Nov 24, 2025
58 of 62 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 24, 2025
* Support  eval function

Signed-off-by: Kai Huang <[email protected]>

* Updates

Signed-off-by: Kai Huang <[email protected]>

* update javadoc

Signed-off-by: Kai Huang <[email protected]>

* Update to use ARRAY_DISTINCT

Signed-off-by: Kai Huang <[email protected]>

---------

Signed-off-by: Kai Huang <[email protected]>
(cherry picked from commit 5049a03)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
LantaoJin pushed a commit that referenced this pull request Nov 25, 2025
* Support  eval function

* Updates

* update javadoc

* Update to use ARRAY_DISTINCT

---------

(cherry picked from commit 5049a03)

Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
asifabashar pushed a commit to asifabashar/sql that referenced this pull request Dec 10, 2025
* Support  eval function

Signed-off-by: Kai Huang <[email protected]>

* Updates

Signed-off-by: Kai Huang <[email protected]>

* update javadoc

Signed-off-by: Kai Huang <[email protected]>

* Update to use ARRAY_DISTINCT

Signed-off-by: Kai Huang <[email protected]>

---------

Signed-off-by: Kai Huang <[email protected]>

Labels

backport 2.19-dev, enhancement (New feature or request), PPL (Piped processing language)
