interpreters, reporter: intern symbolization strings #563

Merged: fabled merged 2 commits into open-telemetry:main from fabled:symbolization-interning on Jul 14, 2025

@fabled fabled commented Jun 27, 2025

Introduce and use libpf.String to intern strings in symbolization and mappings (using unique.Handle[string]). This reduces memory usage and allows the string interning to work better as we keep the handles long term.

fixes #404

fabled commented Jun 28, 2025

@florianl @christos68k Can you take a brief look at this when you get a moment? If the idea seems acceptable, let's agree on whether the Handle[string] should be a type with a set of helpers and pre-looked-up constants in libpf.

Some considerations are:

  • symbol strings are interned, so all modules that do not do separate source filename caching will implicitly end up using the same string instance for these filenames
  • the reporter will hold the handle in the LRU, so the interning system works correctly
  • memory usage will also decrease because Handle[T] is one pointer, whereas string is two machine words (pointer + length)
  • interning adds one more map lookup, which costs a little CPU time (this is in the slow paths, where it would not add significant CPU usage)
  • the global interning map might also have a little memory overhead, but the savings from the points above should still far outweigh it

If we go in this direction, we could revisit the interpreters to see whether this is enough to justify dropping some more LRUs, which would reduce memory overhead even further.
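
To make the size and identity points in the list above concrete, here is a minimal sketch using only the standard unique and unsafe packages (Go 1.23 or newer). The helper names handleSize, stringSize, and sameHandle are made up for illustration and are not part of this PR.

```go
package main

import (
	"fmt"
	"unique"
	"unsafe"
)

// handleSize reports the in-memory size of a unique.Handle[string].
func handleSize() uintptr {
	var h unique.Handle[string]
	return unsafe.Sizeof(h)
}

// stringSize reports the in-memory size of a string header.
func stringSize() uintptr {
	var s string
	return unsafe.Sizeof(s)
}

// sameHandle reports whether two strings intern to the same handle.
// Handles compare equal exactly when the interned values are equal.
func sameHandle(a, b string) bool {
	return unique.Make(a) == unique.Make(b)
}

func main() {
	fmt.Println("handle bytes:", handleSize())
	fmt.Println("string bytes:", stringSize())
	fmt.Println("same handle:", sameHandle("/usr/lib/libc.so.6", "/usr/lib/"+"libc.so.6"))
}
```

The sizes only describe the header that each holder stores; the string data itself is stored once per distinct value for as long as some handle keeps it alive.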

@christos68k
Member

> @florianl @christos68k Can you take a brief look at this when you get a moment? If the idea seems acceptable, let's agree on whether the Handle[string] should be a type with a set of helpers and pre-looked-up constants in libpf.

SGTM

I think it's fine to wrap all this up in libpf, maybe a separate file e.g. intern.go

fabled commented Jul 3, 2025

> I think it's fine to wrap all this up in libpf, maybe a separate file e.g. intern.go

I'll do something like libpf.String then?

Also, I think this is a nice step towards implementing #384.

@fabled fabled force-pushed the symbolization-interning branch 2 times, most recently from da3d543 to fb36ebf Compare July 5, 2025 17:43
fabled added 2 commits July 5, 2025 20:44
Use the interned unique.Handle[string] for strings in symbolization.
This reduced memory usage and allows the string interning to work
better when we pass the Handle to the reporter LRUs.
@fabled fabled force-pushed the symbolization-interning branch from fb36ebf to 91aacd8 Compare July 5, 2025 17:44
@fabled fabled marked this pull request as ready for review July 5, 2025 17:54
@fabled fabled requested review from a team as code owners July 5, 2025 17:54
Comment thread libpf/string.go
// but provides String() to be usable as printf, and also treats the default
// initializer as the empty string.
type String struct {
	value unique.Handle[string]
Contributor Author

For background, one might want to read https://go.dev/blog/unique

But to give a bit of additional context:

  • libpf.String encapsulates a unique.Handle[string], which is basically a *string. Thus libpf.String is 8 bytes, whereas a string (header) is 16 bytes (pointer + length) on 64-bit systems.
  • the overhead of the standard unique.Make is that it creates a new unique string header
  • unique also special-cases strings and clones every string it sees inside the T of Handle[T]. This also applies when T = string.
  • so, strictly speaking, this implementation does not intern the strings themselves; it interns the string headers and clones the string data
  • because of this construct, the Handle[T] needs to be kept around; otherwise the GC will collect the string header, and it will have to be recreated
  • so memory usage goes down from converting string to Handle, but goes up due to the unique package creating new string headers and the associated hash overhead; also, all strings are copied even when that is not needed (string literals, or strings already allocated at the correct length)
  • one primary optimization goal was that Handle[T] values can be tested for equality by comparing the handles directly (the pointer value), but I think we don't need this feature much

One article mentions that there is still a need for transparent string interning, e.g. a func Intern(str string) string type of function that returns a regular string header with only the data being interned. I searched for such an implementation but have not found one yet. This would lose the fast equality test, but the strings are often not compared for equality anyway.

If preferred, I could spend a bit of time to determine whether this could be implemented in a reasonable amount of time using the new Go primitives (weak, unique, etc.). The benefit would be less intrusive code changes. A minor optimization could be to have a separate Intern (use existing string data) and InternClone (clone string data if needed). Thoughts?
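
For readers following along, the wrapper type quoted at the top of this thread can be sketched roughly as below. This is an illustration under assumptions, not the code from this PR: only the names String and Intern and the field value appear in the thread, and the bodies here are guesses at one reasonable way to make the zero value behave as the empty string.

```go
package main

import (
	"fmt"
	"unique"
)

// String wraps an interned string handle. Its zero value stands for "".
type String struct {
	value unique.Handle[string]
}

// Intern returns the String for s. The empty string maps to the zero
// value so that String{} and Intern("") compare equal.
func Intern(s string) String {
	if s == "" {
		return String{}
	}
	return String{value: unique.Make(s)}
}

// String implements fmt.Stringer. The zero handle was never produced by
// unique.Make, so it is handled here instead of calling Value on it.
func (s String) String() string {
	if s == (String{}) {
		return ""
	}
	return s.value.Value()
}

func main() {
	var zero String
	name := Intern("main.go")
	fmt.Printf("%q %q %v\n", zero, name, name == Intern("main.go"))
}
```

Because the struct holds a single comparable field, values of this type can be compared with == and used as map keys, which keeps the fast equality property of the underlying handle.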

Contributor Author

It seems that transparent string interning is not feasible. The problem is that the map key contains a strong reference to the string data. The GC magic that deletes interned values depends on the Handle[T] pointer indirection, and the cleanup happens as follows:

  • all Handle[T] instances, which are the strong references to the T, disappear
  • the GC collects the T instance after all Handles are gone
  • this leaves the map with an entry whose key still holds a strong pointer to the string, but whose weak pointer to the *string is nil
  • a GC hook periodically cleans the intern map of entries whose value is nil, which deletes the map entry
  • only after the above, apparently during the next GC run, is the string freed, as the key no longer holds the strong reference

So releasing the string data appears to require two GC iterations before the memory is fully released.

It does not seem possible to have a map whose key is a weak reference. Perhaps the Go runtime will support something specialized for transparent string interning in the future. For now, the way to go would be this PR.

Comment thread process/process.go
path = strings.Clone(path)
lastPath = path
}
path = libpf.Intern(trimMappingPath(fields[5]))
Contributor Author

As noted earlier, Intern will clone strings, so local optimizations like this can just go away.
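
As a rough illustration of the simplification, the sketch below interns the path field of a maps-style line directly, without a manual strings.Clone or a last-path cache. The function mappingPath and the sample lines are invented for this example, it uses unique.Make directly instead of libpf.Intern, and it leaves out trimMappingPath, whose behavior is not shown in this thread.

```go
package main

import (
	"fmt"
	"strings"
	"unique"
)

// mappingPath interns the sixth whitespace-separated field of a line.
// Per the discussion above, interning copies the string data, so the
// returned handle does not keep the whole input line alive.
func mappingPath(line string) unique.Handle[string] {
	fields := strings.Fields(line)
	if len(fields) < 6 {
		return unique.Make("")
	}
	return unique.Make(fields[5])
}

func main() {
	text := unique.Handle[string](mappingPath("7f00-7f21 r-xp 00000000 08:01 1234 /usr/lib/libc.so.6"))
	data := mappingPath("7f21-7f30 rw-p 00021000 08:01 1234 /usr/lib/libc.so.6")
	fmt.Println(text.Value(), text == data)
}
```

Both mappings end up holding the same handle, which is the de-duplication that the removed per-call cache was approximating.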

Contributor

👍 for making it easier to understand (there was already some confusion when this was introduced).

Contributor

Can you add benchmark code / results to prove that there are no side-effects regarding CPU or memory usage?

Contributor Author

Help with this and/or ideas on how to do it are welcome. This would require a longer-running sample because the memory benefits add up over the long term. In a very brief test on my local machine, this seemed to reduce memory usage with no noticeable CPU usage regression. But the loads on the new and old profilers were not identical.

I suspect that any CPU time lost on interning is typically regained through the reduced stress on the GC.

If someone could run this for a few hours on a test server and compare the numbers over a longer period, that'd be super helpful.
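
A micro-benchmark cannot capture the long-term memory effect described above, but it can give a first impression of the per-call CPU and allocation cost of the extra lookup. The sketch below is one possible starting point, not a benchmark from this PR: samplePaths, benchClone, benchIntern, and runBenchmarks are invented names, and the synthetic paths are not representative of a real workload.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"testing"
	"unique"
)

// Sinks keep the compiler from discarding the measured calls.
var (
	sinkString string
	sinkHandle unique.Handle[string]
)

// samplePaths returns made-up mapping paths.
func samplePaths() []string {
	paths := make([]string, 64)
	for i := range paths {
		paths[i] = fmt.Sprintf("/usr/lib/libexample%d.so", i)
	}
	return paths
}

// benchClone measures cloning every path, with no de-duplication.
func benchClone(b *testing.B) {
	paths := samplePaths()
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sinkString = strings.Clone(paths[i%len(paths)])
	}
}

// benchIntern measures unique.Make while the handles are already held
// elsewhere, as a long-lived cache would hold them.
func benchIntern(b *testing.B) {
	paths := samplePaths()
	held := make([]unique.Handle[string], len(paths))
	for i, p := range paths {
		held[i] = unique.Make(p)
	}
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sinkHandle = unique.Make(paths[i%len(paths)])
	}
	runtime.KeepAlive(held)
}

// runBenchmarks runs both benchmarks outside of "go test".
func runBenchmarks() (clone, intern testing.BenchmarkResult) {
	return testing.Benchmark(benchClone), testing.Benchmark(benchIntern)
}

func main() {
	clone, intern := runBenchmarks()
	fmt.Println("clone: ", clone.String(), clone.MemString())
	fmt.Println("intern:", intern.String(), intern.MemString())
}
```

Comparing resident memory and GC statistics of two long-running profiler instances under the same load, as suggested above, would still be needed to judge the overall effect.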

Comment thread libpf/string.go
@christos68k christos68k left a comment

LGTM

@fabled fabled merged commit 14bf850 into open-telemetry:main Jul 14, 2025
27 checks passed
nsavoire added a commit to DataDog/dd-otel-host-profiler that referenced this pull request Aug 18, 2025
* backport: replace per-fileID LRU with a global LRU

open-telemetry/opentelemetry-ebpf-profiler#529

* backport: interpreters, reporter: intern symbolization strings

open-telemetry/opentelemetry-ebpf-profiler#563

* Disable Go interpreter because we are doing Go symbolization remotely.

* Update opentelemetry-ebpf-profiler with latest changes from upstream.

* Update 3rdparty licenses.

* backport: Refactor symbol caching

open-telemetry/opentelemetry-ebpf-profiler#635

* Use containerID provided by eBPF profiler when available and split by service is enabled.

* Do not collect Go labels by default
gnurizen pushed a commit to parca-dev/opentelemetry-ebpf-profiler that referenced this pull request Sep 8, 2025
gnurizen pushed a commit to parca-dev/opentelemetry-ebpf-profiler that referenced this pull request Sep 10, 2025
Development

Successfully merging this pull request may close these issues.

mappings: Consider a global cache to de-duplicate paths
