-
Notifications
You must be signed in to change notification settings - Fork 0
Add datetime aggregation fixes #144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This comment was marked as spam.
This comment was marked as spam.
| case TIME: | ||
| return MILLIS.between(LocalTime.MIN, expr.timeValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Workaround for sesstion context issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a TODO to fix this
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/Utils.java
Show resolved
Hide resolved
opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/Utils.java
Show resolved
Hide resolved
| .shouldMatch(new ExprTimestampValue("1984-03-17 22:16:42") | ||
| .timestampValue().toEpochMilli()); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add the all-nulls case here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm planning to add ITs that cover too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Existing UT framework is very limited and can't properly cover such case. There is a test for that and it didn't detect such case:
Lines 102 to 108 in 3eba57e
| @Test | |
| void can_execute_expression_with_null_field() { | |
| assertThat() | |
| .docValues("age", null) | |
| .evaluate(ref("age", INTEGER)) | |
| .shouldMatch(null); | |
| } |
| case TIME: | ||
| state.total += MILLIS.between(LocalTime.MIN, value.timeValue()); | ||
| break; | ||
| case DOUBLE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot compute AVG for SHORT, INTEGER, FLOAT, etc...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These types are casted to DOUBLE automatically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the automatic casting of types to double be tested in AvgAggregatorTest? I think only INTEGER and DOUBLE are only in there for number values.
core/src/test/java/org/opensearch/sql/expression/aggregation/AvgAggregatorTest.java
Outdated
Show resolved
Hide resolved
| public static long extractEpochMilliFromAnyDateTimeType(ExprValue value) { | ||
| switch ((ExprCoreType)value.type()) { | ||
| case DATE: | ||
| return MILLIS.between(LocalDateTime.of(1970, 1, 1, 0, 0), value.datetimeValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong, but it seems like we could use the Instant.toEpochMilli for this call too. Something like:
value.datetimeValue().toInstant().toEpochMilli();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch!
Initially case DATE: fallen back with TIMESTAMP and DATETIME, conversion thru timestamp (Instant) returned wrong result, because timestamp changes TZ to UTC.
DATE -> MILLIS -> DATE -> MILLIS check failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. While thinking on your comment I found a mistake. Fixed in 478141e.
Nice catch! It is fixed in #145 (not merged yet on upstream - opensearch-project#1000). |
I see. Thanks. LGTM! |
I think we should add integration tests for these cases once the PR upstream is merged. |
db98434 to
435e940
Compare
* Add aggregator fixes and some useless unit tests. * Add `avg` aggregation on datetime types. * Rework in-memory `AVG`. Fix parsing value returned from the OpenSearch node. Signed-off-by: Yury-Fridlyand <[email protected]>
435e940 to
31a72d4
Compare
Done! Thank you for the idea. |
forestmvey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
MitchellGale
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation looks good.


Signed-off-by: Yury-Fridlyand [email protected]
Description
Make listed aggregations work with datetime types.
Fixes
minmaxavgNot included
var_sampvar_popstddev_sampstddev_popsum[1]Implementation details:
Presicion
Aggregation done with milliseconds precision - this follows OpenSearch approach. See code snippets: one, two.
We convert datetimes to millis and back on our own (see below), so we can scale up to nanoseconds. Such fix requires additional changes - few tests fail.
ExpressionAggregationScript::executeA callback function, called by OpenSearch node to extract a value during processing aggregation script.
This added for backward compatibility:
opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/ExpressionAggregationScript.java
Lines 53 to 55 in 272bf67
This - to extract value. Fortunately,
toEpochMilli()returns negative values for pre-Epoch timestamps, so we are able to use it for group/compare any values.opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/ExpressionAggregationScript.java
Lines 56 to 67 in 272bf67
OpenSearch accepts dates in
Jodalib types and converts to milliseconds since Epoch. We have java datetime types, to I do conversion there.https://github.com/opensearch-project/OpenSearch/blob/140e8d3e6c91519edc47be07b4cd053fdfac1769/server/src/main/java/org/opensearch/search/aggregations/support/values/ScriptDoubleValues.java#L123
OpenSearchExprValueFactoryparseTimestampandconstructTimestampOpenSearch returns aggregated values as strings, even milliseconds since Epoch. I have to extract it, so I combined these functions together.
opensearch-project-sql/opensearch/src/main/java/org/opensearch/sql/opensearch/data/value/OpenSearchExprValueFactory.java
Lines 178 to 198 in 975317c
ExprValueUtilsThese two methods used by
avgin-memory aggregation for datetime types.opensearch-project-sql/core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java
Line 190 in 272bf67
opensearch-project-sql/core/src/main/java/org/opensearch/sql/data/model/ExprValueUtils.java
Line 213 in 272bf67
AvgAggregatorAvgStatewas renamed toDoubleAvgStateDateTimeAvgStatedoes the same logic, but for datetime types.opensearch-project-sql/core/src/main/java/org/opensearch/sql/expression/aggregation/AvgAggregator.java
Lines 105 to 117 in 272bf67
AvgStateAggregatorFunctionNew signature were added.
opensearch-project-sql/core/src/main/java/org/opensearch/sql/expression/aggregation/AggregatorFunction.java
Lines 69 to 76 in 272bf67
Actually, plugin was able to do push down aggregation on datetime types, but this weren't accepted.
[1] It works on MySQL, but I don't see any reason to implement this for datetime types.
Test queries
In-memory aggregation
opensearch-project-sql/integ-test/src/test/java/org/opensearch/sql/sql/AggregationIT.java
Lines 196 to 199 in 17e1ae9
(cast required to ensure that you're working with
TIME).To try other types, use
Push Down aggregation (
metric)Push Down aggregation (
composite_buckets)Issues Resolved
opensearch-project#645
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.