-
Notifications
You must be signed in to change notification settings - Fork 25.8k
T digest field type docs #140478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
T digest field type docs #140478
Conversation
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
|
Pinging @elastic/core-docs (Team:Docs) |
ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
|
Conflicts: docs/reference/query-languages/esql/limitations.md
| In general, the higher this number, the more space on disk the field will use | ||
| but the more accurate the sketch approximations will be. Default is `100` | ||
| * `digest_type`, which selects the merge strategy to use with the sketch. Valid | ||
| values are `hybrid`, `avl_tree`, and `merging`. The default is `hybrid`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this, we shouldn't expose such details. I'd say we hide it entirely, and maybe add something like [execution_hint: high_accuracy] if there's an ask for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #140736 to address this. For now, I'll just pull this bit out of the docs so we can merge it, and I'll update them when I change the parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is addressed in this PR.
| indexed. | ||
|
|
||
| `tdigest` fields are supported in ES|QL; see the | ||
| [ES|QL reference](/reference/query-languages/esql.md) for details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a note here about notable aggregation functions, i.e. percentile, min, max, sum and count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| } | ||
| ``` | ||
|
|
||
| Indexing a minimal document |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: add newline after this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few wording and structural suggestions, along with a few nits :)
Let me know if anything is unclear, I'll defer to @kkrik-es for the final approval, because this smells a lot more like math than I'm comfortable with 😉
[sorry some of the comments are out of order, so the changes aren't 100% linear]
| will compute them based on the given `centroids` and `counts`, with some loss of | ||
| accuracy. | ||
|
|
||
| ::::{important} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this to ## Limitations subheading instead of an admonition like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can do that if you want, but I'm just copying how it is in the Exponential Histogram Field Docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's easier to scan and find if we use a subheading, but it's not a huge deal :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe update both pages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, updated both.
docs/reference/elasticsearch/mapping-reference/field-data-types.md
Outdated
Show resolved
Hide resolved
Co-authored-by: Liam Thompson <leemthompo@gmail.com>
…d-type-docs' into t-digest-field-type-docs
…s.md Co-authored-by: Liam Thompson <leemthompo@gmail.com>
…d-type-docs' into t-digest-field-type-docs
|
Hi @not-napoleon, I've created a changelog YAML for you. |
|
I folded the change for the digest type parameter into this PR, and marked it as an enhancement as result. |
…d-type-docs' into t-digest-field-type-docs
…d-type-docs' into t-digest-field-type-docs
| In general, the higher this number, the more space on disk the field will use | ||
| but the more accurate the sketch approximations will be. Default is `100` | ||
| * `digest_type`, which selects the merge strategy to use with the sketch. Valid | ||
| values are `default` and `high_accuracy`. The default is `default`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe explain that the high_accuracy one is slower and allocates more memory? While the default is optimized for merging speed and constant memory footprint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
…d-type-docs' into t-digest-field-type-docs
💔 Backport failed
You can use sqren/backport to manually backport by running |
Adds documentation for the T-Digest field type, and makes a small change to the Exponential Histogram field type docs for consistency. While working on this, we decided the initial way of specifying the digest type for the t-digest field was overly complex, and moved to a simpler set of choices. That code change is also included in this mostly docs update. --------- Co-authored-by: Liam Thompson <leemthompo@gmail.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Conflicts: docs/reference/elasticsearch/mapping-reference/exponential-histogram.md docs/reference/elasticsearch/toc.yml docs/reference/query-languages/esql/limitations.md
Adds documentation for the T-Digest field type, and makes a small change to the Exponential Histogram field type docs for consistency. While working on this, we decided the initial way of specifying the digest type for the t-digest field was overly complex, and moved to a simpler set of choices. That code change is also included in this mostly docs update. --------- Co-authored-by: Liam Thompson <leemthompo@gmail.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Conflicts: docs/reference/elasticsearch/mapping-reference/exponential-histogram.md docs/reference/elasticsearch/toc.yml docs/reference/query-languages/esql/limitations.md
Adds documentation for the T-Digest field type, and makes a small change to the Exponential Histogram field type docs for consistency. While working on this, we decided the initial way of specifying the digest type for the t-digest field was overly complex, and moved to a simpler set of choices. That code change is also included in this mostly docs update. --------- Co-authored-by: Liam Thompson <leemthompo@gmail.com> Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
This adds documentation for the new t-digest field type.