[Fleet] Upgrade managed package policies in a background task#191097

Merged
nchaulet merged 11 commits into elastic:main from nchaulet:feature-upgrade-managed-pacakge-policies-task
Aug 29, 2024

Conversation

@nchaulet
Member

@nchaulet nchaulet commented Aug 22, 2024

Summary

Resolve #188666

Upgrading managed package policies takes a long time, even after trying to optimize the code. To avoid setups that run too long, this PR moves the upgrade of managed package policies out of the Fleet setup and into a background task triggered by the setup.
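To illustrate the idea of setup triggering the upgrade instead of waiting for it, here is a minimal in-process sketch. It is not the code from this PR: `BackgroundTask`, `setup`, the policy ids, and the stand-in `upgradeManagedPackagePolicies` body are all hypothetical, and a real Kibana plugin would presumably hand the work to its task infrastructure rather than to a bare promise.

```typescript
// Hypothetical sketch: these names are illustrative, not taken from the Fleet codebase.
type TaskState = 'idle' | 'running' | 'done' | 'failed';

class BackgroundTask {
  state: TaskState = 'idle';
  private current: Promise<void> = Promise.resolve();
  private readonly work: () => Promise<void>;

  constructor(work: () => Promise<void>) {
    this.work = work;
  }

  // Start the work if it is not already running; never blocks the caller.
  trigger(): void {
    if (this.state === 'running') return;
    this.state = 'running';
    this.current = this.work().then(
      () => {
        this.state = 'done';
      },
      () => {
        this.state = 'failed';
      }
    );
  }

  // For tests or callers that explicitly want to wait for the run to settle.
  async waitForCompletion(): Promise<void> {
    await this.current;
  }
}

// Stand-in for the slow upgrade of managed package policies.
const upgraded: string[] = [];
async function upgradeManagedPackagePolicies(): Promise<void> {
  for (const id of ['policy-1', 'policy-2', 'policy-3']) {
    await new Promise((resolve) => setTimeout(resolve, 10));
    upgraded.push(id);
  }
}

const upgradeTask = new BackgroundTask(upgradeManagedPackagePolicies);

// Setup triggers the task and returns without waiting for the upgrade.
async function setup(): Promise<{ isInitialized: boolean }> {
  upgradeTask.trigger();
  return { isInitialized: true };
}
```

With this shape, the duration of setup no longer depends on how many policies need upgrading; the trade-off is that callers can no longer assume the policies are upgraded once setup returns.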

In addition to this, I still implemented a few optimizations to make that code path a little quicker:

  • Implemented a cache on getPackageInfo and getAssetsMap. The cache is scoped to the task and has a small size, so it should not have much impact on Kibana memory.
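One way to get a small cache that only lives for the duration of a task is Node's `AsyncLocalStorage`. The sketch below makes several assumptions: `runWithCache`, `fetchFromRegistry`, `MAX_ENTRIES`, the simplified `PackageInfo` shape, and the oldest-first eviction are all hypothetical and are not claimed to match what this PR implements.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical, simplified shape; the real PackageInfo type is much richer.
interface PackageInfo {
  name: string;
  version: string;
}

const MAX_ENTRIES = 20; // assumed bound, not the value used in the PR

// Holds one cache per scope (here: one run of the background task).
const cacheStorage = new AsyncLocalStorage<Map<string, PackageInfo>>();

// Run `fn` with a fresh cache that is only visible inside `fn`.
function runWithCache<T>(fn: () => Promise<T>): Promise<T> {
  return cacheStorage.run(new Map<string, PackageInfo>(), fn);
}

// Stand-in for the expensive lookup; counts how often it is called.
let registryFetches = 0;
async function fetchFromRegistry(name: string, version: string): Promise<PackageInfo> {
  registryFetches++;
  return { name, version };
}

async function getPackageInfo(name: string, version: string): Promise<PackageInfo> {
  const cache = cacheStorage.getStore(); // undefined when not inside runWithCache
  const key = `${name}-${version}`;
  const hit = cache?.get(key);
  if (hit) return hit;

  const info = await fetchFromRegistry(name, version);
  if (cache) {
    // Evict the oldest entry (Map preserves insertion order) to keep the cache small.
    if (cache.size >= MAX_ENTRIES) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(key, info);
  }
  return info;
}
```

Wrapping the task body in `runWithCache(async () => { ... })` means repeated `getPackageInfo` calls for the same package and version inside the task hit the cache, while callers outside the task see no cache at all and no signature has to change to pass a cache around.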

I captured a few APM traces of that background task with 100 synthetics package policies, and the upgrade went from ~5 minutes to ~1 minute.

Todo

  • Write some integration tests for the upgradeManagedPackagePolicies
  • Unit test cache

@nchaulet
Member Author

/ci

Member Author

Should we make those keys configurable through the Kibana config under fleet.internal.*?

Contributor

It could be useful, but is there a risk that the user adds a number too small or too big and creates some issue? If we decide to do it we should document it very clearly.

Member Author

Yes, I think if we make this configurable it will probably be under an internal keyword; that could be useful for supportability (an SDH with a scenario we did not think of).

Member Author

We can start without making this configurable; those sizes are pretty small, so it should not have a huge impact.

Member Author

There is no reason for this to change during an upgrade, and that code path was extremely slow.

Contributor

Do we have tests that cover this removal? Just making sure that we don't inadvertently break anything.

Member Author

We do not, but it seems that code path was doing nothing; it was apparently only there to support a UI that no longer exists (#190613).

Contributor

Thanks for checking!

@nchaulet nchaulet force-pushed the feature-upgrade-managed-pacakge-policies-task branch from effdca7 to 4ef0d1b Compare August 22, 2024 19:45
@nchaulet
Member Author

/ci

ignoreUnverified?: boolean;
prerelease?: boolean;
}): Promise<PackageInfo> {
const cacheResult = getPackageInfoCache(pkgName, pkgVersion);
Member Author

Curious to get your thoughts on that pattern for caching; it seems easier to use than passing packageInfo through the whole codebase.

Contributor

I agree. Currently we have to pass the packageInfo object around, which is not the best pattern. With this we could probably replace it.

@nchaulet nchaulet added release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team labels Aug 23, 2024
@nchaulet nchaulet self-assigned this Aug 23, 2024
@nchaulet nchaulet marked this pull request as ready for review August 23, 2024 13:39
@nchaulet nchaulet requested review from a team as code owners August 23, 2024 13:39
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@nchaulet
Member Author

@elasticmachine merge upstream

@nchaulet nchaulet requested review from a team and criamico August 26, 2024 17:47
@nchaulet
Member Author

@elasticmachine merge upstream

Contributor

@criamico criamico left a comment

LGTM 🚢

@kibana-ci

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #27 / Cloud Security Posture Test adding Cloud Security Posture Integrations CNVM CNVM AWS Hyperlink on PostInstallation Modal should have the correct URL

Metrics [docs]

Unknown metric groups

ESLint disabled in files

id     before  after  diff
fleet  13      12     -1

Total ESLint disabled count

id     before  after  diff
fleet  57      56     -1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @nchaulet

Contributor

@ymao1 ymao1 left a comment

response ops changes LGTM.

@nchaulet nchaulet merged commit fe0d310 into elastic:main Aug 29, 2024
@nchaulet nchaulet deleted the feature-upgrade-managed-pacakge-policies-task branch August 29, 2024 12:09
@kibanamachine kibanamachine added v8.16.0 backport:skip This PR does not require backporting labels Aug 29, 2024

Development

Successfully merging this pull request may close these issues.

[Fleet] Investigate upgrade managed package policy timing

7 participants