Skip to content

Conversation

@SimonKoetting
Copy link
Contributor

Currently, Sysmon collection is enabled by default when you add the Windows integration.
Since Version 9 this can cause the Agent to be in an unhealthy state in fleet when the Channel does not exist.
See: elastic/elastic-agent#7448

As sysmon isn't installed on Windows per default, it also shouldn't be enabled in the integration as default to improve user experience when they just add the Windows integration without changing the default values.

@SimonKoetting SimonKoetting requested review from a team as code owners May 13, 2025 06:57
@SimonKoetting SimonKoetting requested review from belimawr and rdner May 13, 2025 06:57
@elastic-vault-github-plugin-prod

🚀 Benchmarks report

Package windows 👍(6) 💚(2) 💔(1)

Expand to view
Data stream Previous EPS New EPS Diff (%) Result
forwarded 1240.69 915.75 -324.94 (-26.19%) 💔

To see the full report comment with /test benchmark fullreport

@elasticmachine
Copy link

💚 Build Succeeded

@elastic-sonarqube
Copy link

@andrewkroh andrewkroh added Integration:windows Windows Team:Security-Windows Platform Security Windows Platform team [elastic/sec-windows-platform] labels May 13, 2025
@elasticmachine
Copy link

Pinging @elastic/sec-windows-platform (Team:Security-Windows Platform)

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane Agent Data Plane team [elastic/elastic-agent-data-plane] label May 14, 2025
@elasticmachine
Copy link

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@cmacknz
Copy link
Member

cmacknz commented May 20, 2025

This feels like it is a breaking change for users who actually do use sysmon. It is less disruptive for people without sysmon to turn it off, then to turn it off unconditionally and break anyone who relies on it.

I think ideally we'd be able to detect that sysmon isn't installed and just not run the input at all, but that would require a change in agent.

An even easier way to fix this that isn't a breaking change is to stop marking the winlog input as degraded/failed when it can't find the sysmon channel.

@SimonKoetting
Copy link
Contributor Author

@cmacknz but this change would only apply for all new policies, right?
So all current installations won't notice any change, meaning also all users who already have a policy in place will get the error.
So the change is only, new users who installed sysmon would have to turn it on. All other users with a basic windows installation can just use the defaults and won't get any errors.

@cmacknz
Copy link
Member

cmacknz commented May 21, 2025

If that is how it works, then that is what we want to happen. I made an assumption that by explicitly adding a new default for a configuration value that wasn't specified before, we'd update that field value to the new explicit value everywhere (new and old installs).

If the upgrade behavior handles this properly that would be ideal, did we explicitly test the behavior on upgrade here to confirm this is how it works?

@SimonKoetting
Copy link
Contributor Author

@cmacknz, yeah, but we don't introduce a new value here. We just change the default value. Therefore, the value is already set in all policies.
Just tried that locally to double-check, as soon as I create a windows' policy without adjusting anything, the data stream sysmon_operation has
enabled: true
set.
When I manually install the new package version and update the integration policy, the value keeps being
enabled: true

Only when I create a new integration policy now, it's set to
enabled: false
when I change nothing

@cmacknz
Copy link
Member

cmacknz commented May 23, 2025

If works out to be purely a change of default for new installations and upgrades of the package work properly, then LGTM, no concerns.

@SimonKoetting
Copy link
Contributor Author

Pinging @elastic/sec-windows-platform

@bjmcnic
Copy link
Contributor

bjmcnic commented Jun 4, 2025

This is reminding me of an earlier issue in 8.16 (elastic/beats#41468). In regards to upgrades, I see that your testing confirms the integration enabled setting will be maintained. What does that mean for someone coming from an 8.16 and upgrading to 9.1? Will they suddenly start seeing thing go unhealthy because they persisted the enabled state for Sysmon even though they never had Sysmon installed, and now that's newly reported as a problem?

@SimonKoetting
Copy link
Contributor Author

Sadly, that's what's going to happen. All already existing installations/policy definitions with sysmon enabled with get the error as soon as they upgrade when they have no sysmon installed on the system but enabled in a policy.

@rdner rdner removed their request for review June 24, 2025 12:02
@botelastic
Copy link

botelastic bot commented Jul 24, 2025

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Jul 24, 2025
@andrewkroh andrewkroh added the bugfix Pull request that fixes a bug issue label Aug 7, 2025
@botelastic botelastic bot removed the Stalled label Aug 7, 2025
@botelastic
Copy link

botelastic bot commented Sep 6, 2025

Hi! We just realized that we haven't looked into this PR in a while. We're sorry! We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1. Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Sep 6, 2025
@botelastic
Copy link

botelastic bot commented Oct 6, 2025

Hi! This PR has been stale for a while and we're going to close it as part of our cleanup procedure. We appreciate your contribution and would like to apologize if we have not been able to review it, due to the current heavy load of the team. Feel free to re-open this PR if you think it should stay open and is worth rebasing. Thank you for your contribution!

@botelastic botelastic bot closed this Oct 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Pull request that fixes a bug issue Integration:windows Windows Stalled Team:Elastic-Agent-Data-Plane Agent Data Plane team [elastic/elastic-agent-data-plane] Team:Security-Windows Platform Security Windows Platform team [elastic/sec-windows-platform]

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants