Skip to content

feat(elfunwindinfo): allow capturing an FDE during deltas extraction#544

Closed
korniltsev wants to merge 3 commits intoopen-telemetry:mainfrom
grafana:korniltsev/fde_capture_hook
Closed

feat(elfunwindinfo): allow capturing an FDE during deltas extraction#544
korniltsev wants to merge 3 commits intoopen-telemetry:mainfrom
grafana:korniltsev/fde_capture_hook

Conversation

@korniltsev
Copy link
Copy Markdown
Contributor

This is a helper for #416

The new functionality is useful if someone knows an address in the middle of a function but does not know the range of the function and would like to find out the range of the function. (for example _PyEval_EvalFrameDefault.cold function, see the linked issue and comments)

@korniltsev korniltsev requested review from a team as code owners June 20, 2025 12:39
Copy link
Copy Markdown
Contributor

@fabled fabled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is a good API or implementation. This more or less walks the full FDE table always - even when just looking up single FDE. I'd much rather have the code provide a LookupFDE or similar call point to just do a binary search using virtual address from the .eh_frame_hdr and return the FDE (either just start/end, or a struct with the basic data).

The current split of not alwaysing using the lookup table is because the need to support .debug files. ELF executables with .eh_frame should always have the binary search table per specification.

@korniltsev
Copy link
Copy Markdown
Contributor Author

This more or less walks the full FDE table always - even when just looking up single FDE.

I am not proposing to walk the full FDEs second time. I'm actually proposing the opposite - do not walk FDEs second time at all. We already process them once for the deltas. This API allows to save a bit of information along the way during stack deltas extraction.

Feels weird and wrong to process FDEs two times.(even if the second one will be a lightweight binary search).

Please take another look at the current proposal and give it another thought and let me know if you strongly believe the current approach is not acceptable. I will take a look into your LookupFDE idea then.

@fabled
Copy link
Copy Markdown
Contributor

fabled commented Jun 20, 2025

This API allows to save a bit of information along the way during stack deltas extraction.

So I was expecting you to do another scan from the plugin needing its information. Apologies if this felt as being prejudiced. I think this is the problem of submitting a proposed API without seeing how its intended to be used.

My thinking process was:

  • the current code structure does not allow you to preconfigure the FDE walker from interpreter plugin: the deltas are parsed and loaded to eBPF first, only after that the interpreter plugins get hooks
  • the PR code does not suggest storing per-FDE information into heap. This would also not work because the stack delta extraction happens for huge ELF files with over 200k FDEs
  • having interpreter specific code in the generic code is not feasible

This lead me to the conclusion of intention to call the stack delta extraction a second time from Python plugin. But sounds like this was invalid assumption? Did I miss something? Do you have a way to configure the initial stack extraction with interpreter specific FDE interest (without putting Python specifics in the generic code)?

But even if the above was possible easily, I think the option of just capturing one FDE with one address point is not ideal. We may need to have multiple ranges depending on needs of future plugins.

Given the above I constraints, I think just implementing the simple binary search seems more generic:

  • it works in existing plugin architecture
  • you can call it multiple times making sure it is useful for potential future needs

Also, the pfelf.File is cached and mmapped. It does not need to be reopened. So the overhead is really just the one binary lookup I think the performance cost of one binary lookup is minimal, compared to the flexibility the API will allow.

@korniltsev
Copy link
Copy Markdown
Contributor Author

#545

@korniltsev korniltsev closed this Jun 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants