[Fleet] Create API to report status of integrations synchronization #216178

Merged: criamico merged 33 commits into elastic:main from criamico:192363_integrations_sync_status on Apr 7, 2025

@criamico
Member

@criamico criamico commented Mar 27, 2025

Closes #192363

Summary

Adds an endpoint that compares the integrations installed on the remote cluster with the integrations listed in the CCR index fleet-synced-integrations-ccr-<outputId>. Feature flag: enableSyncIntegrationsOnRemote

  • Use the CCR info API to check that the CCR index is active
  • Compare the content of the two indices and report the sync status for each integration:
GET kbn:/api/fleet/remote_synced_integrations/status

{
  "integrations": [
    {
      "package_name": "akamai",
      "package_version": "2.28.0",
      "updated_at": "2025-03-27T10:29:52.485Z",
      "sync_status": "completed"
    },
    {
      "package_name": "auth0",
      "package_version": "1.21.0",
      "updated_at": "2025-03-26T12:06:26.268Z",
      "sync_status": "failed",
      "error": "Installation status: not_installed"
    }
  ]
}

Testing

Set up a local environment with the guide added in dev_docs (preview)

  • Install some integrations on the local cluster and wait until they are synced to the remote cluster
  • From the remote cluster's Dev Tools, run
GET kbn:/api/fleet/remote_synced_integrations/status
  • To verify that custom assets are synced, choose an integration (system in this example)
  • In the package policy, select a var, open the advanced options, and add a custom mapping and a custom pipeline
Screenshot 2025-04-01 at 11 18 40
  • Run the endpoint again; the response should now include the status of the custom assets too:
{
  "integrations": [
    {
      "package_name": "akamai",
      "package_version": "2.28.0",
      "updated_at": "2025-03-27T10:29:52.485Z",
      "sync_status": "completed"
    },
    {
      "package_name": "elastic_agent",
      "package_version": "2.2.0",
      "updated_at": "2025-03-26T14:06:29.216Z",
      "sync_status": "completed"
    },
    {
      "package_name": "synthetics",
      "package_version": "1.4.1",
      "updated_at": "2025-03-26T14:06:31.909Z",
      "sync_status": "completed"
    },
    {
      "package_name": "system",
      "package_version": "1.67.3",
      "updated_at": "2025-03-28T10:08:00.602Z",
      "sync_status": "completed"
    }
  ],
  "custom_assets": {
    "component_template:logs-system.auth@custom": {
      "name": "logs-system.auth@custom",
      "type": "component_template",
      "package_name": "system",
      "package_version": "1.67.3",
      "sync_status": "completed"
    },
    "ingest_pipeline:logs-system.auth@custom": {
      "name": "logs-system.auth@custom",
      "type": "ingest_pipeline",
      "package_name": "system",
      "package_version": "1.67.3",
      "sync_status": "completed"
    }
  }
}
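
On the consuming side, the response above could be modelled roughly as follows. This is a minimal sketch and not the types shipped in Fleet: the names `IntegrationStatus`, `CustomAssetStatus`, `RemoteSyncStatusResponse`, and `pendingItems` are hypothetical, the fields are copied from the sample payload, and the three status values are the ones mentioned in this conversation.

```typescript
// Hypothetical client-side model of the status response.
type SyncStatus = "completed" | "synchronizing" | "failed";

interface IntegrationStatus {
  package_name: string;
  package_version: string;
  updated_at: string;
  sync_status: SyncStatus;
  error?: string;
}

interface CustomAssetStatus {
  name: string;
  type: "component_template" | "ingest_pipeline";
  package_name: string;
  package_version: string;
  sync_status: SyncStatus;
  error?: string;
}

interface RemoteSyncStatusResponse {
  integrations: IntegrationStatus[];
  custom_assets?: Record<string, CustomAssetStatus>;
}

// Returns a label for every integration or custom asset that is not "completed".
function pendingItems(res: RemoteSyncStatusResponse): string[] {
  const integrations = res.integrations
    .filter((i) => i.sync_status !== "completed")
    .map((i) => `${i.package_name}@${i.package_version}`);
  const assets = Object.entries(res.custom_assets ?? {})
    .filter(([, asset]) => asset.sync_status !== "completed")
    .map(([key]) => key);
  return [...integrations, ...assets];
}
```

An empty array from `pendingItems` would mean that every listed integration and custom asset reports `completed`.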

Checklist

@criamico criamico added the Team:Fleet label Mar 27, 2025
@criamico criamico self-assigned this Mar 27, 2025
@criamico
Member Author

@elasticmachine merge upstream

if (!installedIntegrations) {
return {
items: [],
error: `No integrations installed on remote`
Contributor

it's not necessarily an error if the main cluster doesn't have any integrations either
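
One way to read this comment: an empty list on the remote is only an error when the CCR index lists integrations that should have been installed. A minimal sketch under that assumption; `checkInstalled` and `InstalledCheck` are hypothetical names and the real handler's return type may differ.

```typescript
// Hypothetical result shape, loosely following the snippet under review.
interface InstalledCheck {
  items: string[];
  error?: string;
}

// `expected` stands for package names listed in the CCR index and
// `installed` for package names installed on the remote cluster.
function checkInstalled(expected: string[], installed: string[]): InstalledCheck {
  if (installed.length > 0) {
    return { items: installed };
  }
  // Nothing installed is only a problem when something was expected.
  return expected.length === 0
    ? { items: [] }
    : { items: [], error: "No integrations installed on remote" };
}
```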

Comment thread x-pack/platform/plugins/shared/fleet/dev_docs/local_setup/remote_clusters_ccr.md Outdated

try {
const installedPipelines = await getPipeline(esClient, abortController);
const installedComponentTemplates = await getComponentTemplate(esClient, abortController);
Member Author

@criamico criamico Apr 1, 2025

I went for the solution of fetching all the pipelines and component templates and keeping them in memory, instead of making a call for each one in the loop below. @juliaElastic do you think it could become a performance issue?

Contributor

@juliaElastic juliaElastic Apr 1, 2025

We should only compare pipelines and component templates that match '*@custom', like it's done in custom_assets.ts. The other assets are already installed when the package is installed, no need to compare them.
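
The suggestion could be applied by narrowing the fetched assets to names ending in `@custom` before comparing them. A minimal sketch; `onlyCustomAssets` is a hypothetical helper and not the code in custom_assets.ts.

```typescript
// Keeps only the entries whose name ends with "@custom".
function onlyCustomAssets<T>(assets: Record<string, T>): Record<string, T> {
  const result: Record<string, T> = {};
  for (const [name, asset] of Object.entries(assets)) {
    if (name.endsWith("@custom")) {
      result[name] = asset;
    }
  }
  return result;
}
```

Filtering up front keeps the in-memory maps small, which also bears on the performance question raised above.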

Object.entries(ccrCustomAssets).forEach(([ccrCustomName, ccrCustomAsset]) => {
if (ccrCustomAsset.type === 'ingest_pipeline') {
const installedAsset = installedPipelines[ccrCustomAsset?.name];
if (isEqual(installedAsset?.processors, ccrCustomAsset?.pipeline?.processors)) {
Contributor

pipelines have an optional version which we can use to compare like here

(existingPipeline.version && existingPipeline.version < customAsset.pipeline.version) ||

we should probably have a common logic to compare here and custom_assets.ts

@criamico criamico changed the title from "192363 integrations sync status" to "[Fleet] Create API to report status of integrations synchronization" Apr 1, 2025
@criamico
Member Author

criamico commented Apr 1, 2025

@elasticmachine merge upstream

@criamico criamico added the v9.1.0, release_note:skip, and backport:skip labels Apr 1, 2025
@criamico criamico marked this pull request as ready for review April 1, 2025 16:05
@criamico criamico requested a review from a team as a code owner April 1, 2025 16:05
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@criamico
Member Author

criamico commented Apr 3, 2025

@juliaElastic based on the new requirements in #217025 we should be able to query by output_id:

Create API that queries remote kibana sync status API by output ID (to be used by the UI to show status)
e.g. /api/fleet/remote_synced_integrations/{output_id}/status -> https://{remote_kibana}/api/fleet/remote_synced_integrations/status

In a previous commit it was already by output_id. Do you think we'll need to keep the general status? Otherwise I'll change it directly in this PR.

@criamico
Member Author

criamico commented Apr 4, 2025

@elasticmachine merge upstream

elasticmachine and others added 3 commits April 4, 2025 07:26
…t --include-path /api/status --include-path /api/alerting/rule/ --include-path /api/alerting/rules --include-path /api/actions --include-path /api/security/role --include-path /api/spaces --include-path /api/streams --include-path /api/fleet --include-path /api/dashboards --update'
@juliaElastic
Contributor

> @juliaElastic based on the new requirements in #217025 we should be able to query by output_id:
>
> Create API that queries remote kibana sync status API by output ID (to be used by the UI to show status)
> e.g. /api/fleet/remote_synced_integrations/{output_id}/status -> https://{remote_kibana}/api/fleet/remote_synced_integrations/status
>
> In a previous commit it was already by output_id. Do you think we'll need to keep the general status? Otherwise I'll change it directly in this PR.

We can keep it as is in the current PR, since it collects the status on the remote cluster. The new API by output_id will only call the remote API, using the remote output's Kibana URL and API key.
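
The call described here could be prepared along these lines. This is only a sketch of the idea: `RemoteOutputLike`, its field names, and `buildRemoteStatusRequest` are hypothetical, and the `ApiKey` authorization scheme is an assumption about how the remote Kibana would be authenticated. The block only builds the request description and does not send it.

```typescript
// Hypothetical view of a remote output; the real saved object may use other names.
interface RemoteOutputLike {
  id: string;
  kibanaUrl: string;
  kibanaApiKey: string;
}

interface RequestSpec {
  url: string;
  method: "GET";
  headers: Record<string, string>;
}

// Builds the request for the status endpoint on the remote Kibana.
function buildRemoteStatusRequest(output: RemoteOutputLike): RequestSpec {
  // Drop trailing slashes so the path is not doubled up.
  const base = output.kibanaUrl.replace(/\/+$/, "");
  return {
    url: `${base}/api/fleet/remote_synced_integrations/status`,
    method: "GET",
    headers: {
      Authorization: `ApiKey ${output.kibanaApiKey}`, // assumed auth scheme
      "Content-Type": "application/json",
    },
  };
}
```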

@criamico
Member Author

criamico commented Apr 4, 2025

@elasticmachine merge upstream

return { info: res.follower_indices[0] };
} catch (err) {
if (err?.body?.error?.type === 'index_not_found_exception')
throw new IndexNotFoundError(`Index not found`);
Contributor

should we return the error message instead of throwing an error?

Member Author

It's handled here: https://github.com/elastic/kibana/pull/216178/files#diff-f8de6a6d308d65b9d61c400a2fdbebe1078e67eaf036537c028476260be254f5R312-R315

I left the throw block and handled outside because we might need this function elsewhere, this is a basic utility for the ccr case.
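
The split described here, a utility that throws a typed error and a caller that turns it into a response field, might look like the following. It is a simplified sketch: `getFollowerInfo` and `getSyncStatus` are stand-ins that read from an in-memory map rather than calling Elasticsearch, and the class below is a local illustration rather than the error class used in Fleet.

```typescript
// Local illustration of a typed error.
class IndexNotFoundError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "IndexNotFoundError";
    // Keeps `instanceof` working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, IndexNotFoundError.prototype);
  }
}

type FollowerInfo = { status: string };

// Low-level utility: throws, so that each caller can decide what to do.
function getFollowerInfo(followers: Record<string, FollowerInfo>, index: string): FollowerInfo {
  const info = followers[index];
  if (!info) {
    throw new IndexNotFoundError("Index not found");
  }
  return info;
}

// Caller: converts the known error into a response field and rethrows the rest.
function getSyncStatus(
  followers: Record<string, FollowerInfo>,
  index: string
): { integrations: string[]; error?: string } {
  try {
    getFollowerInfo(followers, index);
    return { integrations: [] };
  } catch (err) {
    if (err instanceof IndexNotFoundError) {
      return { integrations: [], error: err.message };
    }
    throw err;
  }
}
```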

Contributor

Member Author

Fixed, I added some unit tests to cover these cases

};
} else if (
installedPipeline?.version &&
installedPipeline.version < ccrCustomAsset.pipeline.version
Contributor

the version comparison should be done before the equality check, so we can skip the equality check if the version is not equal
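
The ordering suggested here, a cheap version check first and a deep comparison only if it passes, can be sketched as below. `PipelineLike` and `isPipelineInSync` are hypothetical names, `jsonEqual` is a simplified stand-in for lodash's `isEqual` (it is sensitive to key order), and the `<` comparison mirrors the snippet under review.

```typescript
interface PipelineLike {
  version?: number;
  processors?: unknown[];
}

// Simplified stand-in for lodash's isEqual, adequate for JSON-like data.
function jsonEqual(a: unknown, b: unknown): boolean {
  return JSON.stringify(a) === JSON.stringify(b);
}

function isPipelineInSync(installed: PipelineLike | undefined, expected: PipelineLike): boolean {
  if (!installed) {
    return false;
  }
  // Cheap check first: an older installed version cannot be in sync.
  if (
    installed.version !== undefined &&
    expected.version !== undefined &&
    installed.version < expected.version
  ) {
    return false;
  }
  // Only now pay for the deep comparison of the processors.
  return jsonEqual(installed.processors, expected.processors);
}
```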

if (ccrCustomAsset.is_deleted === true && installedPipeline) {
return {
...result,
sync_status: 'failed' as SyncStatus.FAILED,
Contributor

should this be synchronizing (the deletion might not have happened yet) unless we know there was an error deleting?

if (ccrCustomAsset.is_deleted === true && installedCompTemplate) {
return {
...result,
sync_status: 'failed' as SyncStatus.FAILED,
Contributor

same here, should these be synchronizing unless we know there was an error deleting?
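
The status mapping proposed in these two comments could be expressed as follows, for an asset that is marked as deleted on the main cluster. It is a sketch only: `statusForDeletedAsset` is a hypothetical helper, and `deletionError` is an assumed input, since the thread does not say where a deletion error would be recorded.

```typescript
type SyncStatus = "completed" | "synchronizing" | "failed";

interface AssetStatus {
  sync_status: SyncStatus;
  error?: string;
}

// Status for an asset flagged `is_deleted` in the CCR index.
function statusForDeletedAsset(existsOnRemote: boolean, deletionError?: string): AssetStatus {
  if (!existsOnRemote) {
    // The deletion has already been applied on the remote.
    return { sync_status: "completed" };
  }
  if (deletionError) {
    // Only a known deletion error counts as a failure.
    return { sync_status: "failed", error: deletionError };
  }
  // Still present and no known error: the deletion may simply not have run yet.
  return { sync_status: "synchronizing" };
}
```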

@criamico
Member Author

criamico commented Apr 7, 2025

@elasticmachine merge upstream

@@ -255,6 +250,11 @@ const compareCustomAssets = ({
sync_status: 'failed' as SyncStatus.FAILED,
Contributor

this should be synchronizing too

return {
...ccrIntegration,
sync_status: 'failed' as SyncStatus.FAILED,
error: `Installation status: ${localIntegrationSO?.attributes.install_status}`,
Contributor

can we return the error message if install_status: install_failed from latest_install_failed_attempts or latest_executed_state?

latest_install_failed_attempts?: InstallFailedAttempt[];
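
A possible way to surface the most recent failure message is sketched below. `InstallFailedAttemptLike` is an assumed, reduced shape (a timestamp plus an error message) and not the actual `InstallFailedAttempt` type, and `installErrorMessage` is a hypothetical helper.

```typescript
// Assumed, reduced shape of a failed install attempt.
interface InstallFailedAttemptLike {
  created_at: string; // ISO 8601 timestamp
  error: { message: string };
}

function installErrorMessage(
  installStatus: string | undefined,
  attempts: InstallFailedAttemptLike[] = []
): string {
  const base = `Installation status: ${installStatus}`;
  if (installStatus !== "install_failed") {
    return base;
  }
  // ISO timestamps in the same format sort as plain strings,
  // so the last element after sorting is the newest attempt.
  const newest = [...attempts]
    .sort((a, b) => (a.created_at < b.created_at ? -1 : a.created_at > b.created_at ? 1 : 0))
    .pop();
  return newest ? `${base}. Latest error: ${newest.error.message}` : base;
}
```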

Contributor

@juliaElastic juliaElastic left a comment

LGTM, thanks for the updates

@criamico criamico enabled auto-merge (squash) April 7, 2025 16:07
@criamico criamico merged commit ab6f7c6 into elastic:main Apr 7, 2025
@elasticmachine
Contributor

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Jest Tests #7 / Templates renders empty templates correctly

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
fleet 1199 1201 +2

History

cc @criamico

criamico added a commit that referenced this pull request Apr 15, 2025
…ut ID (#217799)

Part II of #217025

## Summary

Create API that queries remote kibana sync status API by output ID. 

From the main cluster we call the remote kibana (simply using
node-fetch) and query the endpoint added in
#216178; this way the main cluster
can have the status of the synced integrations on the remote cluster.

### Testing
Note that dev_docs now has a guide for setting up the remote
clusters locally:
https://github.com/elastic/kibana/blob/main/x-pack/platform/plugins/shared/fleet/dev_docs/local_setup/remote_clusters_ccr.md

- Follow the testing steps from [this
PR](#216178)
- Install some integrations on cluster A (main) and wait 5 minutes for
`SyncIntegrationsTask` to run
- Verify that cluster B (remote) has the same integrations installed.
From dev tools, run

```
GET kbn:/api/fleet/remote_synced_integrations/status
```
- Go to Dev Tools on cluster A and run the new endpoint, where `remote_id` is
the id of the remote output configured on cluster A:
```
GET kbn:/api/fleet/remote_synced_integrations/<remote_id>/remote_status
```
The response should be the same as above


### Screenshot
On Remote cluster (Cluster B):
<img width="1183" alt="Screenshot 2025-04-10 at 15 40 46"
src="https://github.com/user-attachments/assets/60ea1c1e-9ccf-4bcf-8637-bc4079483e61"
/>

On main cluster (Cluster A):

<img width="1690" alt="Screenshot 2025-04-11 at 11 10 30"
src="https://github.com/user-attachments/assets/e72fd729-3486-41b0-9194-487233415a75"
/>



### Checklist

- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Labels

backport:skip, release_note:skip, Team:Fleet, v9.1.0

Development

Successfully merging this pull request may close these issues.

[Fleet] Report status of integration synchronization in Fleet API

4 participants