@ykmr1224
Collaborator

@ykmr1224 ykmr1224 commented Aug 21, 2025

Description

  • Implements EARLIEST and LATEST aggregate functions for PPL using ARG_MIN/ARG_MAX.
  • Uses the @timestamp field by default to determine the earliest/latest value.
  • Accepts an optional second argument specifying the timestamp field to use.
  • Pushdown for OpenSearch will be implemented in a separate PR.
  • Fixed doctest to enable Calcite.
  • Refactored test_docs.py so that Calcite-enabled tests run separately from the existing Calcite-disabled cases.

Usage

# Basic usage
source=logs | stats earliest(message), latest(response) by host

# Custom time field
source=metrics | stats latest(cpu_usage, event_time) by server
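
The intended semantics of the two queries above can be sketched in plain Python. This is only an illustration of the ARG_MIN/ARG_MAX idea (return one field's value from the row where another field is smallest or largest), not the Calcite implementation in this PR. The helper names (`arg_min`, `arg_max`, `stats_by`) and the sample rows are hypothetical, and null handling and tie-breaking are not modeled.

```python
from collections import defaultdict

# Default time field, as stated in the PR description.
DEFAULT_TIME_FIELD = "@timestamp"


def arg_min(rows, value_field, order_field):
    """Return value_field from the row whose order_field is smallest."""
    return min(rows, key=lambda r: r[order_field])[value_field]


def arg_max(rows, value_field, order_field):
    """Return value_field from the row whose order_field is largest."""
    return max(rows, key=lambda r: r[order_field])[value_field]


def earliest(field, time_field=DEFAULT_TIME_FIELD):
    """Hypothetical stand-in for PPL earliest(field[, time_field])."""
    return lambda rows: arg_min(rows, field, time_field)


def latest(field, time_field=DEFAULT_TIME_FIELD):
    """Hypothetical stand-in for PPL latest(field[, time_field])."""
    return lambda rows: arg_max(rows, field, time_field)


def stats_by(rows, by_field, aggregations):
    """Group rows by by_field and apply each named aggregation per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by_field]].append(row)
    return {
        key: {name: agg(members) for name, agg in aggregations.items()}
        for key, members in groups.items()
    }


# Made-up sample data; ISO-8601 strings in one format sort chronologically.
logs = [
    {"host": "a", "@timestamp": "2025-08-21T10:00:00Z", "message": "start", "response": 200},
    {"host": "a", "@timestamp": "2025-08-21T10:05:00Z", "message": "stop", "response": 500},
    {"host": "b", "@timestamp": "2025-08-21T09:00:00Z", "message": "boot", "response": 404},
]

# Roughly: source=logs | stats earliest(message), latest(response) by host
result = stats_by(
    logs,
    "host",
    {"earliest(message)": earliest("message"), "latest(response)": latest("response")},
)
print(result)
```

Passing a second argument, for example `latest("cpu_usage", "event_time")`, mirrors the custom time field form of the query.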

Related Issues

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ykmr1224 ykmr1224 added PPL Piped processing language feature calcite calcite migration related labels Aug 22, 2025
@ykmr1224 ykmr1224 force-pushed the earliest-agg branch 2 times, most recently from f3299c4 to a820a9c Compare August 22, 2025 17:15
@ykmr1224 ykmr1224 marked this pull request as ready for review August 25, 2025 15:35
@ykmr1224
Collaborator Author

Checking doctest failure.

Collaborator

@dai-chen dai-chen left a comment

Thanks for the changes!

@ykmr1224 ykmr1224 merged commit e2a1132 into opensearch-project:main Aug 28, 2025
23 checks passed
verifyResult(root, expectedResult);

String expectedSparkSql =
"SELECT ARG_MIN(`message`, `@timestamp`) `earliest_message`\n" + "FROM `POST`.`LOGS`";
Collaborator

Not sure if this is a bug. I only see min_by/max_by in Spark SQL.

Collaborator Author

Ah, I will address this in the next PR.

@opensearch-trigger-bot
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4100-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e2a1132018b260017e0c387122ef6d8523726c94
# Push it to GitHub
git push --set-upstream origin backport/backport-4100-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4100-to-2.19-dev.

@ykmr1224
Collaborator Author

backport blocker: #4094

ykmr1224 added a commit to ykmr1224/sql that referenced this pull request Sep 3, 2025
)

* Add earliest/latest aggregate function to PPL

Signed-off-by: Tomoyuki Morita <[email protected]>

* Reformat

Signed-off-by: Tomoyuki Morita <[email protected]>

* Add IT

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix integ test

Signed-off-by: Tomoyuki Morita <[email protected]>

* Use ARG_MAX and ARG_MIN instead

Signed-off-by: Tomoyuki Morita <[email protected]>

* Revert comment deletion

Signed-off-by: Tomoyuki Morita <[email protected]>

* Remove min_by and max_by

Signed-off-by: Tomoyuki Morita <[email protected]>

* Update doc

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix tests

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix tests

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix tests

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fit IT

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix doctest

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix doctest

Signed-off-by: Tomoyuki Morita <[email protected]>

* Remove unneeded test

Signed-off-by: Tomoyuki Morita <[email protected]>

* Delete unneeded files

Signed-off-by: Tomoyuki Morita <[email protected]>

* Minor fix

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix issue caused by conflict

Signed-off-by: Tomoyuki Morita <[email protected]>

* reformat

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix doctest issue

Signed-off-by: Tomoyuki Morita <[email protected]>

* Minor fix

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix tests

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix test_docs

Signed-off-by: Tomoyuki Morita <[email protected]>

* Update test_docs.py

Signed-off-by: Tomoyuki Morita <[email protected]>

* Refactor test_docs.py and fix

Signed-off-by: Tomoyuki Morita <[email protected]>

* Fix issues caused by merge

Signed-off-by: Tomoyuki Morita <[email protected]>

---------

Signed-off-by: Tomoyuki Morita <[email protected]>
Signed-off-by: Tomoyuki MORITA <[email protected]>
Signed-off-by: Tomoyuki Morita <[email protected]>
@ykmr1224 ykmr1224 added the backport-manually Filed a PR to backport manually. label Sep 3, 2025
penghuo pushed a commit that referenced this pull request Sep 4, 2025
* Add earliest/latest aggregate function to PPL

* Reformat

* Add IT

* Fix integ test

* Use ARG_MAX and ARG_MIN instead

* Revert comment deletion

* Remove min_by and max_by

* Update doc

* Fix tests

* Fix tests

* Fix tests

* Fit IT

* Fix doctest

* Fix doctest

* Remove unneeded test

* Delete unneeded files

* Minor fix

* Fix issue caused by conflict

* reformat

* Fix doctest issue

* Minor fix

* Fix tests

* Fix test_docs

* Update test_docs.py

* Refactor test_docs.py and fix

* Fix issues caused by merge

---------

Signed-off-by: Tomoyuki Morita <[email protected]>
Signed-off-by: Tomoyuki MORITA <[email protected]>
@ykmr1224 ykmr1224 deleted the earliest-agg branch September 4, 2025 16:46
Labels

backport 2.19-dev backport-failed backport-manually Filed a PR to backport manually. calcite calcite migration related feature PPL Piped processing language
