Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster and better location data #165

Merged
merged 12 commits into from
Nov 20, 2023
Merged

Faster and better location data #165

merged 12 commits into from
Nov 20, 2023

Conversation

emilk
Copy link
Collaborator

@emilk emilk commented Nov 19, 2023

Checklist

  • I have read the Contributor Guide
  • I have read and agree to the Code of Conduct
  • I have added a description of my changes and why I'd like them included in the section below

Summary

Before:
puffin-before

After:
puffin-after

Description of Changes

This PR:

  • Improve the function names for trait calls
  • Adds crate name to the file path
  • Adds line number to the location
  • Optimize profile_function

On my M1 MacBook Pro, profile_function goes from 86ns->58ns, about 50% faster. profile_scope goes from 60ns to 58ns, which is within noise levels.

The optimization is done by caching the results of the location gathering functions. This also allows us to make them more expensive (since they are now only run once), and therefor more powerful.

So where puffin used to output the location mod.rs, it now outputs cratename/…/modulename/mod.rs:124.

I also improved the function name for when calling a trait method on a type. Previously you'd only see TraitName>::function_name but now you'll see <ConreteTypeName as TraitName>::function_name.

The downside is more data being sent, but I think the improves usefulness of the location is worth it!

This is not a breaking change, so this can be released in a patch-release of puffin

@@ -718,9 +814,23 @@ macro_rules! profile_function {
};
($data:expr) => {
let _profiler_scope = if $crate::are_scopes_on() {
static mut _FUNCTION_NAME: &'static str = "";
static mut _LOCATION: &'static str = "";
static _INITITIALIZED: ::std::sync::Once = ::std::sync::Once::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to use thread local here to avoid the synchronization?

Copy link
Collaborator Author

@emilk emilk Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried using thread_local, and performance seems to be 1-2% worse, but maybe within noise, and code is much nicer. However, it means duplicating the strings for each thread, and also it won't work once we implement #167, so I think this slightly more verbose version is better for now.

This is the code I used:

thread_local! {
    pub static _FUNCTION_NAME: &'static str = $crate::current_function_name!().leak();
    pub static _LOCATION: &'static str = format!("{}:{}", $crate::current_file_name!(), line!()).leak();
}

Some($crate::ProfilerScope::new(
    _FUNCTION_NAME.with(|x| *x),
    _LOCATION.with(|x| *x),
    $data,
))

@NiklasNummelin
Copy link

Nice! I have a branch here also with some other optimizations I want to finish up ASAP https://github.com/EmbarkStudios/puffin/tree/perf-improvement-test

Copy link
Contributor

@repi repi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice longer term if the function name processing functions would be able to be const fn, then no need for caching nor dynamic allocations and string creation at runtime for it.

but likely are quite far from that being supported in Rust. so this does seem to make sense here.

I have also been toying with an opposite idea of not doing any function parsing in here, just interning the static strings in the stream (#167), and having the client/viewer deal with prettifying the rust strings instead. that feels tighter and much simpler (once one has implemented the string interning for the format) and with no impact on the code being profiled.

but could still start with this approach!

puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
puffin/src/lib.rs Outdated Show resolved Hide resolved
emilk and others added 4 commits November 20, 2023 21:20
Co-authored-by: Johan Andersson <[email protected]>
Co-authored-by: Johan Andersson <[email protected]>
Co-authored-by: Johan Andersson <[email protected]>
@emilk emilk merged commit af1fdcc into main Nov 20, 2023
5 of 6 checks passed
@emilk emilk deleted the emilk/better-location branch November 20, 2023 20:52
pub fn short_file_name(name: &str) -> &str {
// TODO: "foo/bar/src/lib.rs" -> "bar/src/lib.rs"
#[inline(never)]
pub fn short_file_name(path: &'static str) -> String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this breaking API change has been published as non-semver-breaking 0.17.1 release?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, we should hide those functions with #[doc(hidden)] as don't see a reason to have them part of the public API.

hiding them in patch release is still a breaking change, though we are pre v1 and could give ourselves some flexibility and say they were considered internal details just exposed for the macros.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We rely on them to reimplement the macros while also calling our own implementation with it next to puffin. Any way we could keep this around?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd take it for that use case the static lifetime also doesn't work?

we could revert that specific change, it was more of style/semantics, not for any concrete benefit. didn't think of that these functions were part of the public interface.

@emilk we could revert it and publish another patch version without these to get back to the previous public API

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd take it for that use case the static lifetime also doesn't work?

I think that's fine because we were calling the puffin::current_file_name!() macro. The expression it emits suddenly returns String while ProfilerScope::new() expects &str now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that also means we need to switch over to std::once::Once and format!(...).leak() here. Perhaps it is useful to see for us how much we can reuse, maybe I can actually call the puffin macro within our own macro.

Really the only thing we're doing extra in our macro is calling into https://crates.io/crates/superluminal-perf, from what I can tell (so that we don't have to annotate our functions with profiling markers for multiple tools).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emilk can you yank the breaking release? I'm going through dependency upgrades and keep forgetting to run cargo update -p puffin --precise 0.17.0 to opt out of semver-breaking changes 😅

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok will do

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok - 0.17.1 is yanked; 0.18.0 published

@TimonPost
Copy link
Member

TimonPost commented Nov 21, 2023

This went in a bit too quickly and missed being able to review it this morning. For those types of changes let's try to do at least two reviews. I'll have a look at removing unsafe block and perhaps just going for the stream-info-once solution as proposed in #167.

plasticbox pushed a commit to alt9github/puffin that referenced this pull request Nov 29, 2023
### Checklist

* [x] I have read the [Contributor Guide](../CONTRIBUTING.md)
* [x] I have read and agree to the [Code of
Conduct](../CODE_OF_CONDUCT.md)
* [x] I have added a description of my changes and why I'd like them
included in the section below

### Summary
Before:
<img width="409" alt="puffin-before"
src="https://github.com/EmbarkStudios/puffin/assets/1148717/e5722ab6-3865-4f5d-a953-672e4c1e5c86">

After:
<img width="665" alt="puffin-after"
src="https://github.com/EmbarkStudios/puffin/assets/1148717/dc829830-d2cf-400d-b5d5-9f92a892ff60">


### Description of Changes

This PR:
* Improve the function names for `trait` calls
* Adds crate name to the file path
* Adds line number to the location
* Optimize `profile_function`

On my M1 MacBook Pro, `profile_function` goes from 86ns->58ns, about 50%
faster. `profile_scope` goes from 60ns to 58ns, which is within noise
levels.

The optimization is done by caching the results of the location
gathering functions. This also allows us to make them more expensive
(since they are now only run once), and therefor more powerful.

So where puffin used to output the location `mod.rs`, it now outputs
`cratename/…/modulename/mod.rs:124`.

I also improved the function name for when calling a trait method on a
type. Previously you'd only see `TraitName>::function_name` but now
you'll see `<ConreteTypeName as TraitName>::function_name`.

The downside is more data being sent, but I think the improves
usefulness of the location is worth it!

This is _not_ a breaking change, so this can be released in a
patch-release of `puffin`

---------

Co-authored-by: Johan Andersson <[email protected]>
emilk added a commit that referenced this pull request Dec 11, 2023
…scopes

I inadvertantly added this in #165,
somehow not noticing it until now 🤦
emilk added a commit that referenced this pull request Dec 11, 2023
…scopes

I inadvertantly added this in #165,
somehow not noticing it until now 🤦
mergify bot pushed a commit that referenced this pull request Dec 11, 2023
…scopes (#175)

### Checklist

* [x] I have read the [Contributor Guide](../CONTRIBUTING.md)
* [x] I have read and agree to the [Code of
Conduct](../CODE_OF_CONDUCT.md)
* [x] I have added a description of my changes and why I'd like them
included in the section below

### Description of Changes

In #165 I inadvertently
added a `::{{closure}}` suffix to the name of all profile scopes created
with `profile_function!`, somehow not noticing it until now 🤦
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants