[Elastic Agent] Fix invalid log level sent to endpoint #25854
michalpristas merged 6 commits into elastic:master
Pinging @elastic/agent (Team:Agent)
💚 Build Succeeded. Flaky test report: tests succeeded.
    return nil
}

f, err := os.OpenFile(s.syncPath, os.O_RDWR, 0777)
If you want to be stricter, add O_DIRECT.
Looks like O_DIRECT is not available everywhere, so I'll skip it.
It looks like the attempted fix is to add a
func (d *DiskStore) Save(in io.Reader) error {
    tmpFile := d.target + ".tmp"

    fd, err := os.OpenFile(tmpFile, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, perms)
You might want to open your file with O_DIRECT.
    errors.M(errors.MetaKeyPath, r.target))
}

fd, err := os.OpenFile(r.target, os.O_CREATE|os.O_WRONLY, perms)
You have a sharp eye. I found one more that I had failed to copy over as well.
// low spec windows environment where agent is faster
// than filesystem and what we read is different(stale)
// than what we just wrote.
type SyncOnSaveStore struct {
Not sure we really need this type/functionality. The real problem is that DiskStore did not fsync the file before closing it. Important files that are supposed to be valid even after a crash should always be synced before they are closed.
Not requiring additional wrappers/decorators like this in order to function correctly also reduces the chance of errors in the future (the less you need to know to use a type/interface correctly, the better).
@michalpristas Can you update the PR description to explain how your PR solves the issue?

Also, the PR and the issue state that the log level was not passed correctly, but with all the new fsyncs it looks like other problems may have been fixed by this change as well (a broken or invalid configuration file being passed). If so, please update the changelog entry and the PR description.
[Elastic Agent] Fix invalid log level sent to endpoint (elastic#25854)
What does this PR do?
The issue is described in #25583 and happens only on low-spec environments.
It is difficult to reproduce locally, though QA can usually reproduce it.
I believe there is a race between writing and reading files between enroll and run, where the filesystem is a bit slower than our process.
A similar issue was fixed in #24504.
This PR solves it by syncing config files after each write, so that changes are present on the filesystem immediately.
Why is it important?
Without this, the agent reports an empty log level as the desired level to endpoint.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.