Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] 8.4 and 7.17.5/7.17.6 Windows Endpoints may wind up in a non-running state #29

Open
ferullo opened this issue Sep 22, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@ferullo
Copy link
Collaborator

ferullo commented Sep 22, 2022

8.4.0, 8.4.1, 8.4.2, 7.17.5, and 7.17.6 Windows Endpoints may wind up in a non-running state. When this happens, the Elastic Endpoint service cannot be started. Elastic Agent does not automatically detect this failure and reinstall Endpoint. To diagnose if this is happening, run the following PowerShell command as an Administrator:

PS C:\Users\user> Get-WinEvent Microsoft-Windows-CodeIntegrity/Operational | where Id -eq 3004 | where Message -match "elastic-endpoint.exe"


   ProviderName: Microsoft-Windows-CodeIntegrity

TimeCreated                      Id LevelDisplayName Message
-----------                      -- ---------------- -------
9/22/2022 10:47:35 AM          3004 Error            Windows is unable to verify the image integrity of the file \Device\HarddiskVolume3\Program Files\Elastic\Endpoint\elastic-endpo...
9/19/2022 2:10:14 PM           3004 Error            Windows is unable to verify the image integrity of the file \Device\HarddiskVolume3\Program Files\Elastic\Endpoint\elastic-endpo...

We're currently working to address the root cause, but it seems to be triggered by Elastic Agent upgrades and possibly system reboots.

Three different possible workarounds for this are below. Only of these three things is necessary:

  1. Clear out the invalid Endpoint install so it can be installed again by invoking Endpoint's uninstall command manually on the host. Once that completes restart Elastic Agent with the command c:\Program Files\Elastic\Agent\elastic-agent.exe restart. Elastic Agent will then automatically reinstall Endpoint, fixing the issue.

  2. Uninstall the Elastic and Cloud Security integration from the affected hosts then re-add the integration. This will also trigger an uninstall and reinstall of Endpoint on the host which will fix the issue. NOTE: there have been reports that uninstalling the Endpoint and Cloud Security integration may put Elastic Agent into an UNHEALTHY state. This is temporary and will go back to HEALTHY once the integration is added back.

  3. Downgrade to an unaffected Elastic Agent and Endpoint version.

@jdixon-86
Copy link

@ferullo In my environment the uninstall / reinstall resolution only fixes the issue temporarily. It seems when the device restarts it can end up stopped and degraded again in Fleet.

@ferullo
Copy link
Collaborator Author

ferullo commented Sep 30, 2022

Yeah, that's correct. Unfortunately, this issue can reoccur even after it is remediated via uninstall/reinstall. There is a fix an upcoming patch release. I'll comment back here when that is publicly available.

@ferullo
Copy link
Collaborator Author

ferullo commented Oct 19, 2022

This is fixed in 8.4.3 for the 8.4.0/8.4.1/8.4.2 versions. A fix for 7.17.5/7.17.6 is still in progress.

@ferullo
Copy link
Collaborator Author

ferullo commented Jun 20, 2023

This was fixed in 7.17.7 as well on Oct 25, 2022

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants