Expose frozen indices information on the rule health endpoint#219703

Merged
denar50 merged 1 commit into main from
security-team-12387-expose-frozen-indices-stats-on-rule-execution-summary-endpoint
May 8, 2025

Conversation

@denar50 denar50 commented Apr 30, 2025

Summary

This is a follow-up PR that exposes the frozen_indices_queried_max_count metric on the rule health endpoint.
This metric is an aggregation of frozen_indices_queried_count, which is calculated on each rule execution. Refer to #218435 for more details.
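
To make the relationship between the two metrics concrete, here is a minimal Python sketch. It assumes, based only on the metric's name, that the aggregation is a maximum over the per-execution values. The function name, the sample data, and the choice of 0 for an empty interval are hypothetical and are not taken from the Kibana code.

```python
from typing import Iterable


def frozen_indices_queried_max_count(per_execution_counts: Iterable[int]) -> int:
    """Return the largest per-execution frozen_indices_queried_count.

    Assumption: the health endpoint reports the maximum of the values
    recorded in the requested interval. Returning 0 when there were no
    executions is also an assumption made for this sketch.
    """
    return max(per_execution_counts, default=0)


# Hypothetical counts recorded by three executions of the same rule.
print(frozen_indices_queried_max_count([0, 1, 0]))  # prints 1
```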

How to test this?

  • Run Elasticsearch locally with these additional parameters to enable the frozen data tier: -E path.repo="/tmp" -E xpack.searchable.snapshot.shared_cache.size=20GB.
  • Use this tutorial (https://docs.elastic.dev/security-soution/analyst-experience-team/eng-prod/how-to/configure-local-frozen-tier) to create the snapshot repository and an ILM policy. You can disable rollover for the ILM policy and configure indices to move to the frozen phase after 0 days.
  • Create an index manually and populate it with a couple of documents.
  • Assign the ILM policy to the index you created in the previous step and wait for it to move to the frozen phase. To speed this up, shorten the ILM poll interval:
PUT /_cluster/settings
{
  "persistent": {
    "indices.lifecycle.poll_interval": "10s"
  }
}

You can confirm that the index is in the frozen phase by calling:

GET <YOUR_IDX_HERE>/_ilm/explain

In the response, phase should be frozen and step should be complete.

  • Create a rule querying the frozen index.
  • Call the rule health endpoint with:
curl -X POST --user elastic:changeme "http://localhost:5601/internal/detection_engine/health/_rule?date_start=2025-04-29T09:07:39.489Z&date_end=2025-05-01T09:08:39.489Z" \
  -H "Content-Type: application/json" \
  -H "elastic-api-version: 1" \
  -H 'kbn-xsrf: 123' \
  -H "x-elastic-internal-origin: Kibana" \
  --data '{"rule_id":"2f9780b5-7819-4685-ab8e-d817d3701d10"}'

You should see frozen_indices_queried_max_count populated with 1.
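
The ILM check in the steps above can also be scripted. The sketch below is a hedged illustration, not part of this PR: it assumes the _ilm/explain response is an object with an `indices` map keyed by index name, where each entry carries `phase` and `step` fields. Verify that shape against your Elasticsearch version. The function name and the sample response are hypothetical.

```python
import json
from typing import Any, Dict, List


def indices_not_frozen(explain_response: Dict[str, Any]) -> List[str]:
    """Return the names of indices that have not completed the frozen phase."""
    pending = []
    for name, info in explain_response.get("indices", {}).items():
        # The step only counts as done when both fields match.
        if info.get("phase") != "frozen" or info.get("step") != "complete":
            pending.append(name)
    return pending


# Hypothetical, trimmed-down response used for illustration only.
sample = json.loads("""
{
  "indices": {
    "my-test-index": {"managed": true, "phase": "frozen", "step": "complete"},
    "other-index": {"managed": true, "phase": "hot", "step": "check-rollover-ready"}
  }
}
""")
print(indices_not_frozen(sample))  # prints ['other-index']
```

An empty list means every index in the response is ready, so the rule can be created and run.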

@denar50 denar50 requested a review from a team as a code owner April 30, 2025 11:37
@denar50 denar50 requested a review from nikitaindik April 30, 2025 11:37
@denar50 denar50 added the release_note:skip, backport:skip, and Team:Detection Engine labels Apr 30, 2025
@elasticmachine

Pinging @elastic/security-detection-engine (Team:Detection Engine)

@denar50 denar50 force-pushed the security-team-12387-expose-frozen-indices-stats-on-rule-execution-summary-endpoint branch from dcf4162 to 510cf09 Compare April 30, 2025 11:42
@denar50 denar50 added the backport:version and v8.19.0 labels and removed the backport:skip label Apr 30, 2025
@denar50 denar50 force-pushed the security-team-12387-expose-frozen-indices-stats-on-rule-execution-summary-endpoint branch 3 times, most recently from ed21961 to 6543310 Compare May 5, 2025 16:13
@jkelas jkelas requested review from jkelas and removed request for nikitaindik May 7, 2025 14:27
@jkelas

jkelas commented May 8, 2025

The code looks fine, but I found an issue while testing: I couldn't get frozen_indices_queried_max_count to show 1; it was always 0. I paired with the author, @denar50, and we went through the code together. The author needs to retest in his own environment, so I passed him the details of how I set up mine. Waiting for an update from @denar50.

@denar50 denar50 force-pushed the security-team-12387-expose-frozen-indices-stats-on-rule-execution-summary-endpoint branch from 6543310 to f7610d9 Compare May 8, 2025 10:56
@elasticmachine

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

✅ unchanged

History

@jkelas jkelas left a comment

After pairing again with the author, we concluded that it didn't work the previous time because the query used an incorrect time range. With the time range corrected, the behavior is correct: the response shows "frozen_indices_queried_max_count": 1 as expected.
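
Since the earlier failure came down to the time range, one way to avoid it is to compute date_start and date_end relative to the current time instead of hardcoding them. The Python sketch below only builds the URL; the helper names are hypothetical, and the endpoint path and query parameters are copied from the curl commands in this PR. The base URL must include any Kibana base path your setup uses.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import urlencode


def to_iso_millis(moment: datetime) -> str:
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a Z suffix."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def health_rule_url(base_url: str, lookback: timedelta,
                    now: Optional[datetime] = None) -> str:
    """Build the rule health URL with date_start = now - lookback and date_end = now."""
    end = now or datetime.now(timezone.utc)
    start = end - lookback
    # safe=":" keeps the timestamps readable instead of percent-encoding colons.
    query = urlencode(
        {"date_start": to_iso_millis(start), "date_end": to_iso_millis(end)},
        safe=":",
    )
    return f"{base_url}/internal/detection_engine/health/_rule?{query}"


# A fixed "now" makes the example reproducible; omit it to use the current time.
fixed_now = datetime(2025, 5, 9, 9, 8, 39, 489000, tzinfo=timezone.utc)
print(health_rule_url("http://localhost:5601", timedelta(days=10), now=fixed_now))
```

The window has to cover the moment the rule actually ran against the frozen index; otherwise the metric comes back as 0, as observed in the first test round.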

The code looks OK.

Testing was done according to the instructions in the ticket.
My curl command looked like this:

curl -X POST --user elastic:changeme "http://localhost:5621/kbn/internal/detection_engine/health/_rule?date_start=2025-04-29T09:07:39.489Z&date_end=2025-05-09T09:08:39.489Z" \
  -H "Content-Type: application/json" \
  -H "elastic-api-version: 1" \
  -H 'kbn-xsrf: 123' \
  -H "x-elastic-internal-origin: Kibana" \
  --data '{"rule_id":"7ca6fb8f-3f5f-4fea-a577-72eccc8e001c"}' | jq

and at the end of the printout, in the last bucket, I can see this:

        "indexing_duration_ms": {
          "percentiles": {
            "50.0": 4,
            "95.0": 4,
            "99.0": 5.519999999999996,
            "99.9": 5.952000000000005
          }
        },
        "frozen_indices_queried_max_count": 1
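
Rather than scanning the printout by eye, the metric can be pulled out of the response programmatically. The Python sketch below deliberately makes no assumption about where the key sits in the response: it walks the whole JSON document and collects every value stored under the given key. The function name and the sample document are hypothetical; only the key name comes from this PR.

```python
import json
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any depth, in document order."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_key(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Hypothetical response fragment; the real response has more fields and nesting.
sample = json.loads(
    '{"outer": [{"frozen_indices_queried_max_count": 0},'
    ' {"frozen_indices_queried_max_count": 1}]}'
)
values = list(find_key(sample, "frozen_indices_queried_max_count"))
print(values)       # prints [0, 1]
print(max(values))  # prints 1
```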

@denar50 denar50 added the v9.1.0 label May 8, 2025
@denar50 denar50 merged commit 0544125 into main May 8, 2025
11 checks passed
@denar50 denar50 deleted the security-team-12387-expose-frozen-indices-stats-on-rule-execution-summary-endpoint branch May 8, 2025 15:11
@kibanamachine

Starting backport for target branches: 8.19

https://github.com/elastic/kibana/actions/runs/14909813866

@kibanamachine

💔 All backports failed

Status Branch Result
8.19 Backport failed because of merge conflicts

Manual backport

To create the backport manually run:

node scripts/backport --pr 219703

Questions ?

Please refer to the Backport tool documentation

denar50 added a commit that referenced this pull request May 8, 2025
## Summary
This is a follow up PR to expose the metric
`frozen_indices_queried_max_count` on the rule healthcheck endpoint.
This metric is an aggregation of the metric
`frozen_indices_queried_count` which is calculated upon rule execution.
Refer to [this PR](#218435) to see
more details about it.

## How to test this?
- Run Elastic locally with these additional parameters in order to
enable the frozen data tier: -E path.repo="/tmp" -E
xpack.searchable.snapshot.shared_cache.size=20GB.
- Use [this
tutorial](https://docs.elastic.dev/security-soution/analyst-experience-team/eng-prod/how-to/configure-local-frozen-tier)
to create the snapshot repository and an ILM policy. You can disable
rollover for the ILM policy and configure indices to be moved to frozen
after 0 days.
- Create an index manually and populate it with a couple of documents.
- Assign the ILM policy to the index you created in the previous step
and wait for it to be rolled to frozen. You can run this command to
speed up the process:
```
PUT /_cluster/settings
{
  "persistent": {
    "indices.lifecycle.poll_interval": "10s"
  }
}
```
You can confirm that the index is indeed in frozen by calling
```
GET <YOUR_IDX_HERE>/_ilm/explain
```
`phase` should be `frozen` and `step` should be `complete`.
- Create a rule querying the frozen index.
- Call the rule health endpoint with:
```
curl -X POST --user elastic:changeme "http://localhost:5601/internal/detection_engine/health/_rule?date_start=2025-04-29T09:07:39.489Z&date_end=2025-05-01T09:08:39.489Z" \
  -H "Content-Type: application/json" \
  -H "elastic-api-version: 1" \
  -H 'kbn-xsrf: 123' \
  -H "x-elastic-internal-origin: Kibana" \
  --data '{"rule_id":"2f9780b5-7819-4685-ab8e-d817d3701d10"}'
```
You should see `frozen_indices_queried_max_count` populated with `1`.

(cherry picked from commit 0544125)

# Conflicts:
#	x-pack/solutions/security/plugins/security_solution/common/api/detection_engine/rule_monitoring/detection_engine_health/health_endpoints.md
#	x-pack/solutions/security/plugins/security_solution/server/lib/detection_engine/rule_monitoring/logic/detection_engine_health/event_log/aggregations/types.ts
@denar50

denar50 commented May 8, 2025

💚 All backports created successfully

Status Branch Result
8.19

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

denar50 added a commit that referenced this pull request May 8, 2025
denar50 added a commit that referenced this pull request May 9, 2025
…219703) (#220540)

# Backport

This will backport the following commits from `main` to `8.19`:
- [Expose frozen indices information on the rule health endpoint
(#219703)](#219703)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

akowalska622 pushed a commit to akowalska622/kibana that referenced this pull request May 29, 2025
…c#219703)

qn895 pushed a commit to qn895/kibana that referenced this pull request Jun 3, 2025
…c#219703)

Labels

backport:version, release_note:skip, Team:Detection Engine, v8.19.0, v9.1.0

4 participants