Expose range values of other fields to coordinating nodes #81457

dnhatn · 2021-12-07T16:50:25Z

Today, we use the range values of the @timestamp field exposed to coordinating nodes (via the cluster state) to skip shards that won't match search queries. This is important for searchable snapshots and frozen indices. There's a need to expose the range values of other fields when users filter data on them (possibly together with @timestamp). Another optimization is to track the range values of actively indexing shards (see #78776 (comment)) so we can efficiently skip those shards in the can_match phase and avoid triggering refreshes on them.
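For readers unfamiliar with the optimization, the idea of skipping shards from known min/max values can be sketched as follows. This is only an illustration, not Elasticsearch's implementation: `ShardRange`, `shards_to_query`, the shard names, and the numeric values are all hypothetical, and a shard with an unknown range is assumed to always be queried.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ShardRange:
    """Hypothetical per-shard min/max of one field, as seen by a coordinator."""
    shard: str
    min_value: Optional[int]  # None means the range is not known
    max_value: Optional[int]


def shards_to_query(ranges: List[ShardRange], gte: int, lte: int) -> List[str]:
    """Return the shards whose [min, max] may overlap the query range [gte, lte]."""
    selected = []
    for r in ranges:
        if r.min_value is None or r.max_value is None:
            # Unknown range: the shard cannot be skipped safely.
            selected.append(r.shard)
        elif r.max_value >= gte and r.min_value <= lte:
            # Ranges overlap: the shard may contain matching documents.
            selected.append(r.shard)
    return selected


# Made-up values standing in for epoch timestamps.
ranges = [
    ShardRange("frozen-2021-10", 1000, 1999),
    ShardRange("frozen-2021-11", 2000, 2999),
    ShardRange("hot-write", None, None),
]
result = shards_to_query(ranges, gte=2500, lte=3500)
```

In this sketch only `frozen-2021-10` is skipped; the shard without a known range is still queried, which is the case the second optimization above aims to improve.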


elasticmachine · 2021-12-07T16:50:28Z

Pinging @elastic/es-search (Team:Search)

javanna · 2022-04-20T14:39:27Z

I am a bit nervous about allowing range values of arbitrary fields to be stored in the cluster state, especially as the main reason would be to shortcut queries that don't filter on @timestamp. Even if the coordinating node is not able to determine whether a shard can be skipped or not, the can_match phase on the shard will be lightweight, hence hitting the frozen tier should not necessarily be seen as a problem.

a03nikki · 2022-07-28T20:33:06Z

Could we at least add the event.ingested ECS date field too?

Especially considering that our security documentation recommends it and the prebuilt detection rules often use it.

From Configure advanced rule settings (optional):

k. Timestamp override (optional): Select a source event timestamp field. When selected, the rule’s query uses the selected field, instead of the default @timestamp field, to search for alerts. This can help reduce missing alerts due to network or server outages. Specifically, if your ingest pipeline adds a timestamp when events are sent to Elasticsearch, this avoids missing alerts due to ingestion delays.

TIP: These Filebeat modules have an event.ingested timestamp field that can be used instead of the default @timestamp field: Microsoft and Google Workspace.

From Troubleshoot ingestion pipeline delay:

You can reduce the number of missed alerts due to ingestion pipeline delay by specifying the Timestamp override field value to event.ingested in advanced settings during rule creation or editing. The detection engine uses the value from the event.ingested field as the timestamp when executing the rule.

javanna · 2022-10-13T16:22:11Z

We discussed this with the team. We said that while the optimization works well for @timestamp, it relies on how the data is indexed in the shards, hence we would not see this work well for arbitrary numeric fields. Also, as mentioned above, we would not want to allow arbitrary custom index metadata to be added to the cluster state. Moreover, the optimization is currently only enabled for read-only indices, and expanding it to write indices would be complex (how would you update the metadata when a shard gets a new document written that updates the range?).
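The dependence on how data is laid out across shards can be illustrated with a small sketch. All shard names, field ranges, and the `skippable` helper are made up for illustration: the assumption is that time-partitioned shards hold disjoint @timestamp ranges, while some other numeric field spans nearly the same range on every shard.

```python
from typing import Dict, List, Tuple


def skippable(shard_ranges: Dict[str, Tuple[int, int]], gte: int, lte: int) -> List[str]:
    """Return shards whose (min, max) range cannot overlap the query range [gte, lte]."""
    return [
        name
        for name, (lo, hi) in shard_ranges.items()
        if hi < gte or lo > lte
    ]


# Hypothetical ranges: timestamps are disjoint across time-partitioned shards ...
timestamp_ranges = {"shard-a": (0, 99), "shard-b": (100, 199), "shard-c": (200, 299)}
# ... while an arbitrary numeric field covers a similar range on every shard.
other_field_ranges = {"shard-a": (0, 5000), "shard-b": (3, 4800), "shard-c": (1, 5100)}

skipped_by_time = skippable(timestamp_ranges, gte=120, lte=150)
skipped_by_other = skippable(other_field_ranges, gte=120, lte=150)
```

With these made-up values, the timestamp filter allows two of the three shards to be skipped, whereas the same filter on the other field skips none, because every shard's range overlaps the query.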

More importantly, there is common agreement that it should not be required to skip shards on the coordinating node at all times. It's a nice optimization when querying by timestamp, but the can_match phase that happens on the shards is supposed to be fast and efficient. Hitting a shard on the frozen tier is not necessarily a bug, and we should investigate deeper when this causes problems to see what the real issue is. That said, we don't see value in adding this feature, as its cost would outweigh the benefits. We will reconsider if we collect evidence that outlines why hitting shards on the frozen tier should be avoided at all costs.

dnhatn added the >enhancement and :Search/Search labels Dec 7, 2021

elasticmachine added the Team:Search label Dec 7, 2021

rylnd mentioned this issue Dec 7, 2021

[Security Solution][Alerts] Warn if rules are querying indices without efficient range configuration elastic/kibana#120687

Open

2 tasks

javanna added the team-discuss label Apr 20, 2022

javanna removed the team-discuss label Oct 13, 2022

javanna closed this as not planned Oct 13, 2022
