@pchila
Member

@pchila pchila commented Nov 27, 2025

What is the problem this PR solves?

This PR handles the new available_rollbacks field sent by elastic-agent during checkin (implemented in PR elastic/elastic-agent#11143).
The Fleet UI will consume this information to allow a manual rollback of recently upgraded agents.

How does this PR solve the problem?

This PR adds a new available_rollbacks field to both the CheckinRequest and pendingT structs: a slice of AvailableRollbacks in the former and a JSON array of objects in the latter.
In Elasticsearch, the documents in .fleet-agents will have a new upgrade.rollbacks field similar to:

...
"upgrade": {
    "rollbacks": []
},
...
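
As a rough illustration of the mapping described above, the following is a minimal sketch, not the code in this PR. The element fields (version, valid_until), the AvailableRollback element type name, and the renderUpgrade helper are assumptions made for the example; only the available_rollbacks request field and the upgrade.rollbacks document field come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AvailableRollback is a hypothetical shape for one rollback entry;
// the real field set is defined in the PR, not here.
type AvailableRollback struct {
	Version    string `json:"version"`
	ValidUntil string `json:"valid_until"`
}

// CheckinRequest models only the part of the checkin body relevant here.
type CheckinRequest struct {
	AvailableRollbacks []AvailableRollback `json:"available_rollbacks,omitempty"`
}

// renderUpgrade (hypothetical helper) parses a checkin body and returns the
// JSON fragment that would be stored under "upgrade" in the agent document.
func renderUpgrade(body []byte) (string, error) {
	var req CheckinRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	rollbacks := req.AvailableRollbacks
	if rollbacks == nil {
		// Emit an empty array rather than null when nothing is reported.
		rollbacks = []AvailableRollback{}
	}
	doc := map[string]interface{}{
		"upgrade": map[string]interface{}{"rollbacks": rollbacks},
	}
	out, err := json.Marshal(doc)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	body := []byte(`{"available_rollbacks":[{"version":"9.2.0","valid_until":"2025-12-31T00:00:00Z"}]}`)
	fragment, err := renderUpgrade(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(fragment)
}
```

In this sketch a checkin without any rollbacks yields an empty rollbacks array, matching the example document above.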

How to test this PR locally

Refer to the "How to test this PR locally" section of PR elastic/elastic-agent#11143.

Design Checklist

  • I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
  • I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
  • I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

Related issues

@pchila pchila self-assigned this Nov 27, 2025
@pchila pchila added the enhancement, Team:Elastic-Agent, backport-skip, and Team:Elastic-Agent-Control-Plane labels Nov 27, 2025
@pchila pchila linked an issue Nov 27, 2025 that may be closed by this pull request
@pchila pchila force-pushed the support-agent-manual-rollback branch from 529235c to 68867ee Compare December 1, 2025 08:28
@pchila pchila marked this pull request as ready for review December 1, 2025 13:37
@pchila pchila requested a review from a team as a code owner December 1, 2025 13:37
@pchila pchila requested a review from michalpristas December 4, 2025 06:07
@pchila pchila force-pushed the support-agent-manual-rollback branch from 68867ee to 79698cd Compare December 4, 2025 13:04
@blakerouse
Contributor

I am actually good with this PR. I don't want to override @michalpristas's comments about the available rollbacks, but if he also agrees to keep them separate then I am +1.

@michalpristas
Contributor

I'm good with it; Paolo explained the reasoning offline.

michalpristas previously approved these changes Dec 9, 2025
Member

@cmacknz cmacknz left a comment

Looks good besides my one remaining comment.

When we are done with these changes, it would be good to update the horde drones to report as having available rollbacks so that the scale tests exercise these code paths.

@pchila pchila force-pushed the support-agent-manual-rollback branch from 8a1c4fd to 7009b95 Compare December 11, 2025 06:46
cmacknz previously approved these changes Dec 11, 2025
@pchila pchila merged commit c10deba into elastic:main Dec 12, 2025
9 checks passed
Labels

backport-skip, enhancement, skip-changelog, Team:Elastic-Agent, Team:Elastic-Agent-Control-Plane

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add rollback field to actionUpgrade

5 participants