[Alerting] Provision UIAM APIKeys for alerting rules#254211

Merged: ersin-erdal merged 63 commits into elastic:main from ersin-erdal:507-uiam-migration on Mar 18, 2026

Conversation

@ersin-erdal ersin-erdal commented Feb 20, 2026

This PR adds a background task that migrates alerting rules from legacy API keys to UIAM (Unified Identity and Access Management) API keys on Serverless. The task runs only when the alerting.rules.provisionUiamApiKeys feature flag is enabled and only in Serverless mode.

What's in scope

  • New task alerting:api_key_provisioning

    • Runs every 1 minute (configurable interval).
    • Timeout: 5 minutes.
    • Only registered and started when isServerless is true.
    • Scheduled when alerting.rules.provisionUiamApiKeys is enabled; removed when the flag is disabled.
  • Provisioning flow

    1. Find rules that have apiKey but no uiamApiKey, and where the key was not user-created (apiKeyCreatedByUser !== true).
    2. Exclude rules that already have a provisioning status of completed or skipped (via new saved object type).
    3. Call the core UIAM convert API to convert legacy API keys to UIAM format.
    4. Bulk-update rules with the new uiamApiKey.
    5. Write provisioning status per rule (completed / failed / skipped). The saved object is in the Alerting/Cases scope (it should perhaps be independent, since we want to use it for Task Manager provisioning too).
    6. If a rule update fails after a successful conversion, the newly created UIAM API key is marked for invalidation (orphan handling).
  • New saved object type
    uiam_api_keys_provisioning_status: one document per rule with entityId, entityType, status (completed | failed | skipped), optional message, and @timestamp. Used to avoid re-processing and to report status.

  • Skipped rules
    Rules are skipped (and status recorded) when:

    • They have no apiKey, or
    • They already have a uiamApiKey, or
    • apiKeyCreatedByUser === true.
  • Batching
    Rules are read in batches of 300. If more rules remain, the task reschedules itself 1 minute later.
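
The provisioning flow, skip conditions, and batching listed above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the PR's implementation: `RuleDoc`, `StatusDoc`, `classifyRule`, `provisionBatch`, and the injected `convert` / `updateRule` callbacks are hypothetical stand-ins for the rule saved objects, the core UIAM convert API, and the rule bulk update. The `'rule'` value for `entityType` is also assumed. For brevity it converts and updates one rule at a time, whereas the description says the real task works in bulk, and it does not model the exclusion of rules that already have a completed or skipped status.

```typescript
// Hypothetical, simplified shapes; the real saved objects carry more fields.
type ProvisioningStatus = 'completed' | 'failed' | 'skipped';

interface RuleDoc {
  id: string;
  apiKey?: string;
  uiamApiKey?: string;
  apiKeyCreatedByUser?: boolean;
}

interface StatusDoc {
  entityId: string;
  entityType: 'rule'; // assumed value, not confirmed by the PR description
  status: ProvisioningStatus;
  message?: string;
  '@timestamp': string;
}

interface BatchResult {
  statuses: StatusDoc[];
  orphanedUiamKeys: string[]; // converted keys whose rule update failed
  hasMore: boolean; // true when the task should reschedule itself
}

// Returns a skip reason, or undefined when the rule should be converted.
function classifyRule(rule: RuleDoc): string | undefined {
  if (!rule.apiKey) return 'rule has no apiKey';
  if (rule.uiamApiKey) return 'rule already has a uiamApiKey';
  if (rule.apiKeyCreatedByUser === true) return 'API key was created by the user';
  return undefined;
}

async function provisionBatch(
  rules: RuleDoc[],
  batchSize: number,
  convert: (apiKey: string) => Promise<string>, // stand-in for the UIAM convert API
  updateRule: (id: string, uiamApiKey: string) => Promise<void> // stand-in for the rule update
): Promise<BatchResult> {
  const result: BatchResult = {
    statuses: [],
    orphanedUiamKeys: [],
    hasMore: rules.length > batchSize,
  };
  const record = (entityId: string, status: ProvisioningStatus, message?: string) =>
    result.statuses.push({
      entityId,
      entityType: 'rule',
      status,
      message,
      '@timestamp': new Date().toISOString(),
    });

  for (const rule of rules.slice(0, batchSize)) {
    const skipReason = classifyRule(rule);
    if (skipReason !== undefined) {
      record(rule.id, 'skipped', skipReason);
      continue;
    }
    let converted: string | undefined;
    try {
      converted = await convert(rule.apiKey!);
      await updateRule(rule.id, converted);
      record(rule.id, 'completed');
    } catch (e) {
      // If conversion succeeded but the update failed, the new key is orphaned
      // and should be queued for invalidation.
      if (converted !== undefined) result.orphanedUiamKeys.push(converted);
      record(rule.id, 'failed', String(e));
    }
  }
  return result;
}
```

A caller would then persist `statuses`, queue `orphanedUiamKeys` for invalidation, and reschedule the task when `hasMore` is true.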

Out of scope / behavior

  • Non-Serverless: Task is not registered; no behavior change.
  • Flag off: Task is not scheduled; no conversion runs.
  • License: If the UIAM convert API is unavailable (e.g. the required license is missing), the task throws and Task Manager retries it.
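
The gating rules above (Serverless only, feature flag on or off) reduce to a small decision table. The sketch below is only an illustration of that table; `TaskAction` and `decideTaskAction` are hypothetical names that do not come from the PR.

```typescript
// Hypothetical names, for illustration only.
type TaskAction = 'not_registered' | 'schedule' | 'remove';

function decideTaskAction(isServerless: boolean, flagEnabled: boolean): TaskAction {
  // Outside Serverless the task is never registered, so the flag has no effect.
  if (!isServerless) return 'not_registered';
  // On Serverless, the flag decides whether the task is scheduled or removed.
  return flagEnabled ? 'schedule' : 'remove';
}
```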

To verify:

  1. Add the configs below to your serverless.dev.yml:
xpack.alerting.rules.apiKeyType: 'es'
xpack.alerting.invalidateApiKeysTask.removalDelay: '10s'
xpack.alerting.invalidateApiKeysTask.interval: '1m'
feature_flags.overrides:
  alerting.rules.provisionUiamApiKeys: false

This lets us create some rules with ES API keys.

  2. Run Elasticsearch and Kibana with the commands below:
    yarn es serverless --projectType observability
    yarn start --serverless oblt --run-examples

  3. Log in as the system_indices_superuser user.

  4. Create 4-5 rules and let each of them run at least once.

  5. Stop ES and Kibana.

  6. Update the feature flag below in your serverless.yml:

feature_flags.overrides:
  alerting.rules.provisionUiamApiKeys: true
  7. Set GET_RULES_BATCH_SIZE to 2 in uiam_api_key_provisioning_task.ts.
  8. Run Elasticsearch and Kibana with the commands below:
    yarn es serverless --projectType observability --uiam
    yarn start --serverless oblt --run-examples --uiam
  9. The task should run and update the first 2 rules with uiamApiKey. You can check the rules with:
GET /.kibana_alerting_cases_*/_search
{
  "query": {
    "match": {
      "type": {
        "query": "alert"
      }
    }
  }
}
  10. Wait 1 minute; the task should run again and update the next 2 rules.
  11. You can observe the provisioning status with:
GET /.kibana_alerting_cases_*/_search
{
  "query": {
    "match": {
      "type": {
        "query": "uiam_api_keys_provisioning_status"
      }
    }
  }
}
  12. If your rules run often (every 1m), the rule update may fail. In that case we set the provisioning status to failed and fetch the rule again in the next batch. We also mark the orphaned converted API keys for invalidation.
    You can see them with the query below:
GET /.kibana_alerting_cases_*/_search
{
  "query": {
    "match": {
      "type": {
        "query": "api_key_pending_invalidation"
      }
    }
  }
}
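
The three search requests in the steps above differ only in the saved object type they match. As a convenience, the request body could be built by a small helper; `buildTypeQuery` is a hypothetical name used only for this sketch.

```typescript
// Builds the search body used in the verification steps for a given
// saved object type (for example 'alert' for rules).
function buildTypeQuery(type: string) {
  return { query: { match: { type: { query: type } } } };
}

// Bodies for the three checks: rules, provisioning status, pending invalidations.
const verificationQueries = [
  'alert',
  'uiam_api_keys_provisioning_status',
  'api_key_pending_invalidation',
].map(buildTypeQuery);

console.log(JSON.stringify(verificationQueries, null, 2));
```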

@ersin-erdal ersin-erdal changed the title from "507 uiam migration" to "Provision UIAM APIKeys for alerting rules" Feb 20, 2026
@ersin-erdal ersin-erdal changed the title from "Provision UIAM APIKeys for alerting rules" to "[Alerting] Provision UIAM APIKeys for alerting rules" Feb 25, 2026
# Conflicts:
#	src/core/packages/saved-objects/server-internal/src/object_types/index.ts
#	x-pack/platform/plugins/shared/alerting/server/plugin.ts
@ersin-erdal ersin-erdal marked this pull request as ready for review March 10, 2026 21:37
@ersin-erdal ersin-erdal requested review from a team as code owners March 10, 2026 21:37
@ersin-erdal ersin-erdal requested a review from azasypkin March 10, 2026 21:37
@ersin-erdal ersin-erdal added the release_note:skip, backport:skip, Team:ResponseOps, and t// labels Mar 10, 2026
@elasticmachine

Pinging @elastic/response-ops (Team:ResponseOps)

@ersin-erdal ersin-erdal requested a review from darnautov March 11, 2026 16:49

@azasypkin azasypkin left a comment

Changes in x-pack/platform/plugins/shared/alerting/server/saved_objects/index.ts LGTM: a newly defined uiam_api_keys_provisioning_status SO type doesn't have any encrypted fields.

@darnautov darnautov left a comment

LGTM

Comment on lines +112 to +116
core.featureFlags
  .getBooleanValue$(PROVISION_UIAM_API_KEYS_FLAG, false)
  .subscribe((enabled: boolean) => {
    this.applyProvisioningFlag(enabled, taskManager).catch(() => {});
  });

We need to keep a reference to this subscription and unsubscribe when the plugin stops.
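
One way to act on this suggestion is to store the subscription on the instance and release it in the stop hook. The sketch below is a minimal illustration, not the change made in the PR. `ProvisioningFlagWatcher`, `BooleanFlagSource`, and `FakeFlagSource` are hypothetical, and the two small interfaces stand in for the RxJS observable and subscription so that the example runs without RxJS.

```typescript
// Minimal stand-ins so the sketch runs without RxJS; in Kibana these would be
// the observable returned by getBooleanValue$ and its subscription.
interface SubscriptionLike {
  unsubscribe(): void;
}
interface BooleanFlagSource {
  subscribe(next: (enabled: boolean) => void): SubscriptionLike;
}

// Hypothetical class showing only the subscription lifecycle.
class ProvisioningFlagWatcher {
  private flagSubscription?: SubscriptionLike;

  start(flag$: BooleanFlagSource, onChange: (enabled: boolean) => void): void {
    // Keep the reference so stop() can release it.
    this.flagSubscription = flag$.subscribe(onChange);
  }

  stop(): void {
    this.flagSubscription?.unsubscribe();
    this.flagSubscription = undefined;
  }
}

// Tiny in-memory flag source used only to exercise the class.
class FakeFlagSource implements BooleanFlagSource {
  private listeners = new Set<(enabled: boolean) => void>();

  subscribe(next: (enabled: boolean) => void): SubscriptionLike {
    this.listeners.add(next);
    return {
      unsubscribe: () => {
        this.listeners.delete(next);
      },
    };
  }

  emit(enabled: boolean): void {
    this.listeners.forEach((listener) => listener(enabled));
  }
}
```

After `stop()` is called, further flag changes no longer reach the callback, which is the behavior the review comment asks for.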

const keys = apiKeysToConvert.map(({ attributes }) => attributes.apiKey!);
const convertResponse = await context.uiamConvert(keys);
if (convertResponse === null) {
  throw new Error('License required for the UIAM convert API is not enabled');

Does this happen only when the license is missing?

Author reply:

Yes, uiam.convert returns null only when the license is missing.

  page += 1;
}
if (ruleIds.size === 0) return undefined;
return nodeTypes.function.buildNode('not', convertRuleIdsToKueryNode(Array.from(ruleIds)));

It's unlikely we'd hit the clause limit, but is it worth adding a safeguard here?

Author reply:

Done; Claude recommended using chunks there.
But I don't think we would hit the limit.
The default limit is 4096, and the project with the most rules has 3600 rules.
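
The chunking safeguard mentioned here can be illustrated with a small helper. This is a sketch only: `chunkIds` is a hypothetical name, the chunk size of 1024 is an arbitrary value chosen for the example, and how the PR turns each chunk into a query node is not shown in this conversation.

```typescript
// Splits ids into chunks no larger than `size`, so that each exclusion
// filter built from a chunk stays under the query clause limit.
function chunkIds(ids: string[], size: number): string[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error('size must be a positive integer');
  }
  const chunks: string[][] = [];
  for (let i = 0; i < ids.length; i += size) {
    chunks.push(ids.slice(i, i + size));
  }
  return chunks;
}

// Example: 3600 rule ids (the largest project size quoted above).
const exampleIds = Array.from({ length: 3600 }, (_, i) => `rule-${i}`);
const exampleChunks = chunkIds(exampleIds, 1024);
console.log(exampleChunks.map((chunk) => chunk.length));
```

With 3600 ids and a chunk size of 1024 this yields four chunks (three of 1024 ids and one of 528), each of which could be turned into its own exclusion clause.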

elasticmachine commented Mar 17, 2026

⏳ Build in-progress, with failures

@ersin-erdal ersin-erdal merged commit af7972c into elastic:main Mar 18, 2026
18 checks passed
jbudz added a commit that referenced this pull request Mar 18, 2026
Merge timing conflict between #257678 and #254211
szwarckonrad pushed a commit to szwarckonrad/kibana that referenced this pull request Mar 18, 2026
qn895 pushed a commit to qn895/kibana that referenced this pull request Mar 18, 2026
Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Tyler Smalley <tyler.smalley@elastic.co>
jeramysoucy pushed a commit to jeramysoucy/kibana that referenced this pull request Mar 26, 2026
Labels

backport:skip (This PR does not require backporting), Feature:CPS, release_note:skip (Skip the PR/issue when compiling release notes), Team:ResponseOps (Platform ResponseOps team, formerly the Cases and Alerting teams), t//, v9.4.0
