Cleanup ES|QL T-Digest code duplication, add memory accounting #143662
JonasKunz merged 15 commits into elastic:main from
Pinging @elastic/es-storage-engine (Team:StorageEngine)
.../plugin/core/src/main/java/org/elasticsearch/xpack/core/analytics/mapper/EncodedTDigest.java
blocks[offset] = blockFactory.newConstantTDigestBlock(resultHolder, 1);
blocks[offset + 1] = blockFactory.newConstantBooleanBlockWith(true, 1);
try (var tempHolder = BreakingTDigestHolder.create(breaker)) {
    tempHolder.set(merger, sum, min, max);
I decided to remove it because it is redundant: the count is always the sum of the counts of all centroids in the merged TDigest.
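To illustrate why a separately passed count is redundant, here is a minimal sketch. The `Centroid` record and `totalCount` helper are hypothetical names for illustration only, not the actual `TDigest` API; the sketch only assumes that each centroid carries the number of samples merged into it.

```java
import java.util.List;

final class CentroidCountSketch {
    /** Hypothetical centroid: a mean plus the number of samples merged into it. */
    record Centroid(double mean, long count) {}

    /** Derives the total sample count by summing the per-centroid counts. */
    static long totalCount(List<Centroid> centroids) {
        long total = 0;
        for (Centroid c : centroids) {
            total += c.count();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Centroid> merged = List.of(
            new Centroid(1.0, 3),
            new Centroid(5.5, 2),
            new Centroid(9.0, 4)
        );
        // The total is fully determined by the centroids: 3 + 2 + 4.
        System.out.println(totalCount(merged)); // prints 9
    }
}
```

Since the value can always be recomputed from the centroids, storing or passing it separately only adds a second source of truth that can drift.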
.../plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/BreakingTDigestHolder.java
No reason to include everything in a single PR. I'll get back to reviewing once you move
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/TDigestArrayBlock.java
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/TDigestHolder.java
      return dateRangeToString(from, to);
  };
- case TDIGEST -> (block, offset, scratch) -> ((TDigestBlock) block).getTDigestHolder(offset);
+ case TDIGEST -> (block, offset, scratch) -> ((TDigestBlock) block).getTDigestHolder(offset, new TDigestHolder());
Maybe wrap this in a helper that only takes offset as arg.
I'd prefer not to, as that would encourage usage that is not allocation-free.
This method with the scratch parameter is a common pattern for block access, e.g. see BytesRefBlock:
https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/data/BytesRefBlock.java#L38
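As a rough illustration of the scratch-parameter pattern described above, here is a minimal sketch. `Holder`, `Block`, and `sumCounts` are hypothetical stand-ins, not the real `TDigestBlock` or `TDigestHolder` classes; the only assumption is that the accessor fills a caller-provided object instead of allocating a new one per read.

```java
final class ScratchPatternSketch {
    /** Hypothetical mutable view that the block fills in on every read. */
    static final class Holder {
        double min;
        double max;
        long count;
    }

    /** Hypothetical block that stores its values column-wise in primitive arrays. */
    static final class Block {
        private final double[] mins;
        private final double[] maxes;
        private final long[] counts;

        Block(double[] mins, double[] maxes, long[] counts) {
            this.mins = mins;
            this.maxes = maxes;
            this.counts = counts;
        }

        int size() {
            return counts.length;
        }

        /** Copies the value at offset into scratch and returns scratch; allocates nothing. */
        Holder get(int offset, Holder scratch) {
            scratch.min = mins[offset];
            scratch.max = maxes[offset];
            scratch.count = counts[offset];
            return scratch;
        }
    }

    /** Reads every position while reusing one scratch object for the whole loop. */
    static long sumCounts(Block block) {
        Holder scratch = new Holder(); // allocated once, outside the loop
        long total = 0;
        for (int i = 0; i < block.size(); i++) {
            total += block.get(i, scratch).count;
        }
        return total;
    }

    public static void main(String[] args) {
        Block block = new Block(
            new double[] { 0.1, 0.2, 0.3 },
            new double[] { 1.0, 2.0, 3.0 },
            new long[] { 2, 3, 5 }
        );
        System.out.println(sumCounts(block)); // prints 10
    }
}
```

A convenience overload that takes only the offset would have to allocate a fresh holder on each call, which is the usage the reply above wants to avoid encouraging.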
kkrik-es left a comment
Just a few minor comments, approving to get you going.
…locations

* upstream/main: (153 commits)
  ES|QL: Update docs for TOP_SNIPPETS and DECAY (elastic#143739)
  Correctly include endpoint id in log msg in AuthorizationPoller (elastic#143743)
  Bar searching or sorting on _seq_no when disabled (elastic#143600)
  Generalize `testClientCancellation` test (elastic#143586)
  JSON_EXTRACT: zero-copy byte slicing for object, array, and number extraction (elastic#143702)
  Track recycler pages in circuit breaker (elastic#143738)
  [ESQL] Enable distributed pipeline breakers for external sources via FragmentExec (elastic#143696)
  Adding 'mode' and 'codec' fields to ES monitoring template (elastic#143673)
  [ESQL] Columnar I/O and vectorized block conversion for external sources (elastic#143703)
  Fix flaky MMR diversification YAML tests (elastic#143706)
  ES|QL codegen: check builder arguments for vector support (elastic#143724)
  Add Views Security Model (elastic#141050)
  ESQL: Prevent pushdown of unmapped fields in filters and sorts (elastic#143460)
  Don't run seq_no pruning tests in release CI (elastic#143725)
  ESQL: Support intra-row field references in ROW command (elastic#140217)
  ES|QL: Remove implicit limit in FORK branches in CSV tests (elastic#143601)
  IndexRoutingTests with and without synthetic id (elastic#143566)
  Synthetic id upgrade test in serverless (elastic#142471)
  Disable "Review skipped" comments for PRs without specified labels (elastic#143728)
  Cleanup ES|QL T-Digest code duplication, add memory accounting (elastic#143662)
  ...
- byte[] encoding/decoding
- T-Digests stored in blocks allocation free by requiring a scratch input parameter (similar to how it works for BytesRefBlocks)
- BreakingTDigestHolder with full memory accounting: This follows the pattern of BreakingBytesRefBuilder and will be used for the agg-state on first/last_over_time
- TDigestState with TDigest: TDigestState is just a decorator with serialization which we don't use and don't want to use
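The memory-accounting idea can be sketched roughly as follows. `SimpleBreaker` and `BreakingHolder` are hypothetical, simplified stand-ins written for this illustration; they are not the Elasticsearch `CircuitBreaker` API or the actual `BreakingTDigestHolder`. The sketch assumes only that the holder reserves bytes before growing its buffer and releases them on close, and it counts payload bytes only, ignoring object overhead.

```java
final class BreakingHolderSketch {
    /** Hypothetical stand-in for a circuit breaker: tracks reserved bytes against a limit. */
    static final class SimpleBreaker {
        private final long limit;
        private long used;

        SimpleBreaker(long limit) {
            this.limit = limit;
        }

        /** Reserves bytes, or throws without changing state if the limit would be exceeded. */
        void reserve(long bytes) {
            if (used + bytes > limit) {
                throw new IllegalStateException("memory limit exceeded");
            }
            used += bytes;
        }

        void release(long bytes) {
            used -= bytes;
        }

        long used() {
            return used;
        }
    }

    /** Hypothetical holder that accounts for its buffer and gives the bytes back on close. */
    static final class BreakingHolder implements AutoCloseable {
        private final SimpleBreaker breaker;
        private byte[] buffer = new byte[0];
        private int length;

        BreakingHolder(SimpleBreaker breaker) {
            this.breaker = breaker;
        }

        /** Copies the encoded bytes, growing the buffer only after the extra bytes are reserved. */
        void set(byte[] encoded) {
            if (encoded.length > buffer.length) {
                breaker.reserve(encoded.length - buffer.length);
                buffer = new byte[encoded.length];
            }
            System.arraycopy(encoded, 0, buffer, 0, encoded.length);
            length = encoded.length;
        }

        int length() {
            return length;
        }

        @Override
        public void close() {
            breaker.release(buffer.length);
            buffer = new byte[0];
            length = 0;
        }
    }

    public static void main(String[] args) {
        SimpleBreaker breaker = new SimpleBreaker(16);
        try (BreakingHolder holder = new BreakingHolder(breaker)) {
            holder.set(new byte[10]);
            System.out.println(breaker.used()); // prints 10
            holder.set(new byte[4]); // fits in the existing buffer, nothing new is reserved
            System.out.println(breaker.used()); // prints 10
        }
        System.out.println(breaker.used()); // prints 0
    }
}
```

Reserving before allocating means an over-limit request fails before the buffer grows, and tying the release to `close()` lets try-with-resources return the bytes even on an exception path.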