KernelSymbolTable helper is conceptually broken #3798
Comments
@rafaeldtinoco I am trying to understand.
There are multiple addresses for symbols with the same name. Symbols might, or might not, have the same owner. The examples above are all from 'system' and show the number of addresses each symbol has.
IIRC binary searching is only done when searching by address. When searching by name, the search should be linear. There's also a linear search after the binary search if it fails.
Both cases exist. I changed the issue description to reflect it: multiple symbols for the same address, multiple addresses for the same symbol. I'm just making it fast, parallelized, thread-safe and making sure it works. Feel free to optimize it further if/when needed. I also don't think there should be multiple flavors; the single flavor should be the optimal one and used in all cases.
With aquasecurity/libbpfgo#399 merged, I believe I can fix this issue in Tracee by giving the symbol offsets in the kprobe attachment. So this issue would be solved by #3653 only.
The whole "lazy" concept for the kallsyms file relies on the assumption that symbols have a single address only (and on the assumption that the file is mostly sorted by symbol addresses).

Checking kernel_symbols.go and the KernelSymbolTable interface, one could change GetSymbolByName to return a slice of []*KernelSymbols, and that would be an easy change for fullKernelSystemTable. The problem is that the lazy implementation relies on stopping reading the kallsyms file once a symbol is picked, or on a binary search by address that assumes the file is sorted, etc. That doesn't work well when the same symbol has more than one address.

Quickly checking kallsyms for symbols with more than one address, the generic names have a huge number of duplicates:

1693 __func__.0
1198 _entry.1
834 __func__.2
777 _entry.3
...

and the "unique kernel symbols" have 2 or 3 addresses (under certain circumstances, like when the symbol is static to a source file, or under certain compilation optimizations):

2 switch_mm
2 sw_fence_dummy_notify
2 suspend_attrs
2 suspend_attr_group
2 subsystem_id_show
2 str__i915__trace_system_name
...

There is also the case where the same address has multiple symbols:

ffffffffc0310200 b __key.22 [drm_display_helper]
...
ffffffffc0310200 b __key.17 [drm_display_helper]
ffffffffc0310200 b drm_dp_aux_dev_class [drm_display_helper]
...

So in both cases, when indexing by symbol name or by symbol address, code should account for the possibility of having multiple results.

This change makes the helper "fast enough" while allowing it to return multiple values from its maps.

Related: aquasecurity/tracee#3798
Addressed by #3802
Background
I'm working on fixing #3653, and I can create libbpfgo methods to attach kprobes at specific offsets, with:
and
creating methods:
but Tracee's "lazy ksymbols" logic won't work for symbols that have two addresses.
Reason
The whole "lazy" concept for the kallsyms file relies on the assumption that symbols have a single address only (and on the assumption that the file is mostly sorted by symbol addresses).
Checking kernel_symbols.go and the KernelSymbolTable interface, one could change GetSymbolByName to return a slice of []*KernelSymbols, and that would be an easy change for fullKernelSystemTable.
The problem is that the lazy implementation relies on stopping reading the kallsyms file once a symbol is picked, or on a binary search by address that assumes the file is sorted, etc. That doesn't work well when the same symbol has more than one address.
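To make the limitation concrete, here is a minimal Go sketch of a lookup that stops at the first match, next to one that scans the whole input. It is not Tracee's code: the KernelSymbol type, the helper names and the sample addresses are all assumptions made for illustration (only the fact that switch_mm can appear twice comes from this issue).

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// KernelSymbol is a hypothetical, simplified record (not Tracee's actual type).
type KernelSymbol struct {
	Name    string
	Type    string
	Address uint64
	Owner   string
}

// sample mimics the kallsyms line format; addresses and types are made up.
const sample = `ffffffff81000000 T _text
ffffffff8107a2b0 t switch_mm
ffffffff8107b340 t switch_mm
ffffffff81200000 T do_sys_open
`

// parseLine parses "<hex address> <type> <name> [<module>]".
func parseLine(line string) (KernelSymbol, bool) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return KernelSymbol{}, false
	}
	addr, err := strconv.ParseUint(fields[0], 16, 64)
	if err != nil {
		return KernelSymbol{}, false
	}
	owner := "system" // symbols without a module suffix belong to the kernel itself
	if len(fields) > 3 {
		owner = strings.Trim(fields[3], "[]")
	}
	return KernelSymbol{Name: fields[2], Type: fields[1], Address: addr, Owner: owner}, true
}

// firstByName mimics a lazy lookup: it stops reading at the first match,
// so any later entry with the same name is never seen.
func firstByName(data, name string) (KernelSymbol, bool) {
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		if sym, ok := parseLine(sc.Text()); ok && sym.Name == name {
			return sym, true
		}
	}
	return KernelSymbol{}, false
}

// allByName reads the whole input and returns every entry with that name.
func allByName(data, name string) []KernelSymbol {
	var out []KernelSymbol
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		if sym, ok := parseLine(sc.Text()); ok && sym.Name == name {
			out = append(out, sym)
		}
	}
	return out
}

func main() {
	first, _ := firstByName(sample, "switch_mm")
	fmt.Printf("lazy lookup: %#x\n", first.Address)
	for _, s := range allByName(sample, "switch_mm") {
		fmt.Printf("full scan:   %#x\n", s.Address)
	}
}
```

The early return is what makes the lazy variant cheap, and it is also exactly what hides the second address.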
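Quickly checking kallsyms for symbols with more than one address, the generic names have a huge number of duplicates:

1693 __func__.0
1198 _entry.1
834 __func__.2
777 _entry.3
...

and the "unique kernel symbols" have 2 or 3 addresses (under certain circumstances, like when the symbol is static to a source file, or under certain compilation optimizations):

2 switch_mm
2 sw_fence_dummy_notify
2 suspend_attrs
2 suspend_attr_group
2 subsystem_id_show
2 str__i915__trace_system_name
...

There is also the case where the same address has multiple symbols:

ffffffffc0310200 b __key.22 [drm_display_helper]
...
ffffffffc0310200 b __key.17 [drm_display_helper]
ffffffffc0310200 b drm_dp_aux_dev_class [drm_display_helper]
...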
So in both cases, when indexing by sym name, or by sym address, code should account for the possibility of having multiple results.
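One way to account for both cases is to index the parsed file in two maps whose values are slices. The Go sketch below only illustrates that idea; Symbol, SymbolTable and their methods are hypothetical names, not the KernelSymbolTable interface, and the switch_mm addresses in the sample are invented (the drm_display_helper lines are the ones quoted in this issue).

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// Symbol is a hypothetical, simplified record (not Tracee's actual type).
type Symbol struct {
	Name    string
	Type    string
	Address uint64
	Owner   string
}

// SymbolTable keeps every entry reachable from both its name and its address.
type SymbolTable struct {
	byName map[string][]Symbol
	byAddr map[uint64][]Symbol
}

// NewSymbolTable parses kallsyms-formatted text and fills both indexes.
// Malformed lines are skipped.
func NewSymbolTable(data string) *SymbolTable {
	t := &SymbolTable{
		byName: make(map[string][]Symbol),
		byAddr: make(map[uint64][]Symbol),
	}
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 3 {
			continue
		}
		addr, err := strconv.ParseUint(fields[0], 16, 64)
		if err != nil {
			continue
		}
		owner := "system" // no module suffix means the symbol is from the kernel itself
		if len(fields) > 3 {
			owner = strings.Trim(fields[3], "[]")
		}
		sym := Symbol{Name: fields[2], Type: fields[1], Address: addr, Owner: owner}
		t.byName[sym.Name] = append(t.byName[sym.Name], sym)
		t.byAddr[sym.Address] = append(t.byAddr[sym.Address], sym)
	}
	return t
}

// ByName returns all symbols with the given name (nil if there are none).
func (t *SymbolTable) ByName(name string) []Symbol { return t.byName[name] }

// ByAddr returns all symbols located at the given address (nil if there are none).
func (t *SymbolTable) ByAddr(addr uint64) []Symbol { return t.byAddr[addr] }

// sample: the switch_mm addresses are made up for illustration.
const sample = `ffffffff8107a2b0 t switch_mm
ffffffff8107b340 t switch_mm
ffffffffc0310200 b __key.22 [drm_display_helper]
ffffffffc0310200 b __key.17 [drm_display_helper]
ffffffffc0310200 b drm_dp_aux_dev_class [drm_display_helper]
`

func main() {
	tbl := NewSymbolTable(sample)
	fmt.Println("addresses for switch_mm:", len(tbl.ByName("switch_mm")))
	for _, s := range tbl.ByAddr(0xffffffffc0310200) {
		fmt.Printf("%#x %s [%s]\n", s.Address, s.Name, s.Owner)
	}
}
```

Callers then have to decide what to do with more than one result, for example filtering by owner or trying each address in turn; this sketch deliberately leaves that policy out.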