Skip to content

fix: reduce log noise and filename duplication#467

Merged
fabled merged 7 commits intoopen-telemetry:mainfrom
areebahmeddd:issues/466/error-logging
May 29, 2025
Merged

fix: reduce log noise and filename duplication#467
fabled merged 7 commits intoopen-telemetry:mainfrom
areebahmeddd:issues/466/error-logging

Conversation

@areebahmeddd
Copy link
Copy Markdown
Contributor

@areebahmeddd areebahmeddd commented May 25, 2025

Signed-off-by: Areeb Ahmed areebshariff@acm.org

Summary of Changes

Changes Golang interpreter loader error handling to reduce log noise and eliminate filename duplication:

  • Change log level from error to debug for "failed to create point resolver" messages in execinfomanager
  • Remove executable path from Go interpreter error message to prevent duplication (path is already logged by execinfomanager)
  • Add function to map standard go errors (C.SYMBLIB_ERR_IOFILENOTFOUND, C.SYMBLIB_ERR_OBJFILE, C.SYMBLIB_ERR_DWARF)

Related Issue(s)

Closes #466

@areebahmeddd areebahmeddd requested review from a team as code owners May 25, 2025 10:48
@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla Bot commented May 25, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Copy Markdown
Member

@florianl florianl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While changing the visibility of the log line from Error to Debug, reduces the logged output, if the OTel eBPF profiler is not running in verbose mode, it does not resolve the underlying issue, that causes the error log line.

The log lines in #466 indicate a race condition somewhere between Go and Rust (?). As a result, symbolization will fail for this part. I would prefer, if we could fix the race condition and guarantee less symbolization issues, rather changing the visibility for issue.

@rockdaboot
Copy link
Copy Markdown
Contributor

rockdaboot commented May 26, 2025

The log lines in #466 indicate a race condition somewhere between Go and Rust (?). As a result, symbolization will fail for this part. I would prefer, if we could fix the race condition and guarantee less symbolization issues, rather changing the visibility for issue.

@florianl are you sure this isn't just an expected TOCTOU? If it is, using debug level seems fine to me. Otherwise, I agree with you that the issue should be fixed and the log should stay at error level.

@areebahmeddd You are addressing two different issues in your PR, a) simplifying the error message and b) changing the log level / user-visibility.
Since b) is debatable and needs further considerations, I suggest to restore the log level back to error in order to proceed.

@areebahmeddd
Copy link
Copy Markdown
Contributor Author

@rockdaboot updated the pr! ✅

@florianl umm not very familiar with the codebase so from my understanding status code 2 indicates symblib err file not found, so a straightforward fix would be adding a retry mechanism that waits for extracted files to become accessible before passing them to symblib?

@florianl
Copy link
Copy Markdown
Member

florianl commented May 26, 2025

are you sure this isn't just an expected TOCTOU?

If you compare the current Rust based approach with #456, you will notice that this isn't just an expected TOCTOU - otherwise you would see it with #456 as well.

[..] a straightforward fix would be adding a retry mechanism that waits for extracted files to become accessible before passing them to symblib?

#455 in combination with #456 should resolve #466.

@rockdaboot
Copy link
Copy Markdown
Contributor

#455 in combination with #456 should resolve #466.

I see, thanks for pointing this out.
Since we don't know when that will be merged, it won't hurt to improve the log message right now. Not a strong "must have", though. It's just a little detail. So I will approve, but @areebahmeddd feel free to close this PR in favor of #456.

@areebahmeddd
Copy link
Copy Markdown
Contributor Author

Anything works! : )

Copy link
Copy Markdown
Contributor

@fabled fabled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change on the error message looks good. But the status should be inspected and a equivalent Golang error should be instead embedded if possible.

Comment thread interpreter/go/go.go Outdated
Comment on lines +71 to +73
if status != C.SYMBLIB_OK {
return nil, fmt.Errorf("failed to create point resolver for '%s': %d",
exec, status)
return nil, fmt.Errorf("failed to create point resolver: %d",
status)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the status should be mapped to standard Golang errors where possible. E.g. the -2 indicating file not found, should return an error with os.ErrNotExist embedded in it. This allows the main code path to omit the error logging: its already testing for this specific error code and its not logged.

The root case for this error is indeed race condition. It happens when the earlier code path has opened the ELF file with pfelf.Open successfully, but then during processing of stack deltas or other data the process exits. This causes the above function fail as the mappings file does not exist anymore.

In other interpreters this is already handled correctly, and it does not manifest if this is converted to Go code, because then there is no more attempt to re-open the ELF file with a filename - the cached pfelf.File or the underlying ReaderAt will get used.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I've created a function to map standard go errors! Hope it's matches your requirement 🙂

@fabled
Copy link
Copy Markdown
Contributor

fabled commented May 26, 2025

#455 in combination with #456 should resolve #466.

I think #456 will take quite a bit of time. Using the Golang built in library will likely explode memory usage. We really need to implement the code in our own code using the mmapped file. The same way the Rust code does it.

I think it makes sense to merge this first. Ideally with the mapping of status to os.ErrNotExist when appropriate.

Comment thread interpreter/go/go.go Outdated
Comment thread interpreter/go/go.go Outdated
Copy link
Copy Markdown
Contributor

@fabled fabled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@fabled fabled merged commit afbda36 into open-telemetry:main May 29, 2025
25 checks passed
@areebahmeddd areebahmeddd deleted the issues/466/error-logging branch May 29, 2025 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix "failed to create point resolver for ..."

4 participants