Update big5 ppl queries#708
Conversation
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
|
nit, could add pr description to explain what is the change? |
+1. @noCharger Please update the PR description to include high-level overview of the changes in this PR. Please also include a test run with |
| "method": "POST", | ||
| "body": { | ||
| "query": "source = {{index_name | default('big5')}} | where `@timestamp` >= '2023-01-02 00:00:00' and `@timestamp` < '2023-01-02 10:00:00' | stats count() by `process.name`, `cloud.region`, `aws.cloudwatch.log_stream` | sort - `process.name`, + `cloud.region`, + `aws.cloudwatch.log_stream` | head 10" | ||
| "query": "source = {{index_name | default('big5')}} | where `@timestamp` >= '2023-01-02 00:00:00' and `@timestamp` < '2023-01-02 10:00:00' | stats {% if distribution_version.split('.') | map('int') | list >= '3.3.0'.split('.') | map('int') | list %}bucket_nullable = false {% endif %}count() by `process.name`, `cloud.region`, `aws.cloudwatch.log_stream` | sort - `process.name`, + `cloud.region`, + `aws.cloudwatch.log_stream` | head 10" |
There was a problem hiding this comment.
Does this need a space between } and c?
{% endif %}count()
There was a problem hiding this comment.
No, the current format is correct and functional.
| "method": "POST", | ||
| "body": { | ||
| "query": "source = {{index_name | default('big5')}} | stats count() as station by `aws.cloudwatch.log_stream` | sort - station | head 500" | ||
| "query": "source = {{index_name | default('big5')}} | stats {% if distribution_version.split('.') | map('int') | list >= '3.3.0'.split('.') | map('int') | list %}bucket_nullable = false {% endif %}count() as station by `aws.cloudwatch.log_stream` | sort - station | head 500" |
There was a problem hiding this comment.
Same as comment above about spacing.
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Updated both |
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
LantaoJin
left a comment
There was a problem hiding this comment.
And this PR still miss some changes in opensearch-project/sql#4668:
terms_significant_1.ppl: correct the group-by list
sort_keyword_can_match_shortcut.ppl: correct the sort key
Can you revisit 4668 again to make sure all changes except comments are included.
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
| "method": "POST", | ||
| "body": { | ||
| "query": "source = {{index_name | default('big5')}} match(`process.name`, 'kernel') | where `@timestamp` >= '2023-01-01 00:00:00' and `@timestamp` < '2023-01-03 00:00:00' | head 10" | ||
| "query": "source = {{index_name | default('big5')}} {% if distribution_version.split('.') | map('int') | list >= '3.3.0'.split('.') | map('int') | list %}process.name=kernel{% else %}match(`process.name`, 'kernel'){% endif %} | where `@timestamp` >= '2023-01-01 00:00:00' and `@timestamp` < '2023-01-03 00:00:00' | head 10" |
There was a problem hiding this comment.
If I understand this correctly, if the version is equal or greater than 3.3.0 then use process.name, else use "`process.name`" ? Would this cause existing customer queries to break when they upgrade from lower versions to 3.3.0 and above or is it backward compatible?
There was a problem hiding this comment.
If I understand this correctly, if the version is equal or greater than 3.3.0 then use process.name, else use "
process.name" ? Would this cause existing customer queries to break when they upgrade from lower versions to 3.3.0 and above or is it backward compatible?
opensearch-project/sql#4152 corrected usage of match function
There was a problem hiding this comment.
AFAIK we are not supposed to have breaking changes in minor version updates.
@penghuo @peterzhuamazon @getsaurabh02
There was a problem hiding this comment.
Yes breaking changes is only in major release, especially API.
I think it is ok to make an API deprecated but still support until next major, but not ok to completely alter its functionalities to the point the original behavior is removed.
cc: @getsaurabh02 into the conversation here, and whether we should restore the original behavior, possibly in a patch or in 3.4.0.
Thanks.
There was a problem hiding this comment.
@peterzhuamazon @noCharger
I don’t consider this a breaking change. The reason is that using the match function within the search command is not by design. In PPL 3.2, the documentation only describes using match within the where clause.
| "method": "POST", | ||
| "body": { | ||
| "query": "source = {{index_name | default('big5')}} | stats count() as country by `aws.cloudwatch.log_stream` | sort - country | head 50" | ||
| "query": "source = {{index_name | default('big5')}} | stats {% if distribution_version.split('.') | map('int') | list >= '3.3.0'.split('.') | map('int') | list %}bucket_nullable = false {% endif %}count() as country by `aws.cloudwatch.log_stream` | sort - country | head 50" |
There was a problem hiding this comment.
is this a new param introduced in 3.3.0?
There was a problem hiding this comment.
yes, as mentioned in description section
| "method": "POST", | ||
| "body": { | ||
| "query": "source = {{index_name | default('big5')}} query_string(['message'], 'monkey jackal bear') | head 10" | ||
| "query": "source = {{index_name | default('big5')}} {% if distribution_version.split('.') | map('int') | list >= '3.3.0'.split('.') | map('int') | list %}| where query_string(['message'], 'monkey jackal bear'){% else %}query_string(['message'], 'monkey jackal bear'){% endif %} | head 10" |
There was a problem hiding this comment.
wold queries still work in 3.3.0 without the where clause?
There was a problem hiding this comment.
no, it's also introduced from opensearch-project/sql#4152
There was a problem hiding this comment.
It’s not a breaking change — we never intended to support complex WHERE clauses in the search command. As documented here and here only simple filters are supported.
The earlier support for complex conditions was unintentional, resulting from the use of logicalExpression in the search command. This has been corrected in the current release, giving us a clearer distinction and roadmap for both the commands.
@noCharger Could you please reword the description? This is essentially a rectification of the benchmark workload queries.
|
The backport to To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-3 3
# Navigate to the new working tree
pushd ../.worktrees/backport-3
# Create a new branch
git switch --create backport/backport-708-to-3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ea2c75e92f68f0a8caf72233c89c442bd7d50028
# Push it to GitHub
git push --set-upstream origin backport/backport-708-to-3
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-3Then, create a pull request where the |
|
The backport to To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2 2
# Navigate to the new working tree
pushd ../.worktrees/backport-2
# Create a new branch
git switch --create backport/backport-708-to-2
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ea2c75e92f68f0a8caf72233c89c442bd7d50028
# Push it to GitHub
git push --set-upstream origin backport/backport-708-to-2
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2Then, create a pull request where the |
* Update big5 ppl queries Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Fix json lint on s0 Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Update match() and query_string() for 3.3.0 above Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Update bin on timestamp only support on 3.4.0 Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Revised on Lantao's comments Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> --------- Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> (cherry picked from commit ea2c75e)
* Update big5 ppl queries Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Fix json lint on s0 Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Update match() and query_string() for 3.3.0 above Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Update bin on timestamp only support on 3.4.0 Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> * Revised on Lantao's comments Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> --------- Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> (cherry picked from commit ea2c75e)
* Update big5 ppl queries * Fix json lint on s0 * Update match() and query_string() for 3.3.0 above * Update bin on timestamp only support on 3.4.0 * Revised on Lantao's comments --------- (cherry picked from commit ea2c75e) Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> Co-authored-by: Louis Chu <lingzhichu.clz@gmail.com>
* Update big5 ppl queries * Fix json lint on s0 * Update match() and query_string() for 3.3.0 above * Update bin on timestamp only support on 3.4.0 * Revised on Lantao's comments --------- (cherry picked from commit ea2c75e) Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com> Co-authored-by: Louis Chu <lingzhichu.clz@gmail.com>
Description
Apply changes from opensearch-project/sql#4668 and opensearch-project/sql#4152:
statswill setbucket_nullable = falseto align with related DSL (since 3.3)auto-date-histogram,rangeandmulti-termsbucket aggregation. (since 3.4)4.multi_terms_keyword.ppl should add head 10
5. Syntax change on PPL >= 3.3.0 that
matchfunc removed andquery_stringsyntax must be in where, as a rectification of PPL queries.6. terms_significant_1 and terms_significant_2: correct the group-by list
7. sort_keyword*: correct the sort key
Test results
Backport to Branches:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.