
Optimize the TracePoint callback #24

Merged
merged 1 commit into from
Mar 18, 2019
Conversation

casperisfine
Copy link
Contributor

@casperisfine casperisfine commented Mar 18, 2019

Context

While experimenting with Zeitwerk on an existing Rails app, one test became extremely slow (I'm not even sure how long it takes as it never completed, but I estimate it at over 15 minutes, from ~3s previously).

So I decided to profile that test (or rather a subset of what it's doing) and got this:

$ stackprof tmp/stackprof-stripe-wall-.dump 
==================================
  Mode: wall(1000)
  Samples: 6 (83.78% miss rate)
  GC: 0 (0.00%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
         5  (83.3%)           5  (83.3%)     block in <module:ExplicitNamespace>
         1  (16.7%)           1  (16.7%)     Set#each
         5  (83.3%)           0   (0.0%)     Stripe::StripeObject#metaclass
         1  (16.7%)           0   (0.0%)     Stripe::StripeObject#remove_accessors
         6 (100.0%)           0   (0.0%)     Stripe::StripeObject#initialize_from
         6 (100.0%)           0   (0.0%)     Stripe::StripeObject.construct_from
         6 (100.0%)           0   (0.0%)     block (2 levels) in <class:StripeParserTest>
         6 (100.0%)           0   (0.0%)     block in <class:StripeParserTest>

So basically what is happening is that the Stripe gem (but could be any other dependency really, I'm not particularly pointing fingers here) is defining methods on instances, which lazily creates singleton classes.

These singleton classes are spamming the TracePoint callback so much that everything slows down to a crawl.

The patch

I tried making the callback bail out as soon as possible when it's clear the class is not relevant to Zeitwerk (i.e. is a singleton_class).

Unfortunately it's still way too slow, so I'll have to figure out something else, but I suppose an optimization is always good to take.

@fxn what do you think? Also, if you have other ideas on how to solve this issue, I'm all ears.

cc @rafaelfranca @Edouard-chin

@casperisfine
Copy link
Contributor Author

Actually, I tested with next if event.self.singleton_class? and somehow it's much faster than next unless event.self.name.

It's now fast enough that the test completes more or less as fast as before Zeitwerk.

@fxn
Copy link
Owner

fxn commented Mar 18, 2019

Absolutely, computing the hash code is expensive when called this often, so better to short-circuit whatever we can, especially if the test is cheap.

What do you think about testing event.self.name.nil?? That would cover singleton classes as well as anonymous classes and modules.

Why does the patch rewrite event.self.name as name?

@casperisfine
Copy link
Contributor Author

Why does the patch rewrite event.self.name as name?

Oops, that's a leftover from the previous patch version. I'll fix that in a sec and answer your questions.

@fxn
Copy link
Owner

fxn commented Mar 18, 2019

Good!

break unless event.self.name

is equivalent, and avoids the extra nil? call, which in this case is not needed because we have either a string or nil.

@casperisfine
Copy link
Contributor Author

What do you think about testing event.self.name.nil?? That would cover singleton classes as well as anonymous classes and modules.

So that is what I was doing in the first patch version. However, it didn't make much difference performance-wise. Turns out Module#name is about 1000X slower than Class#singleton_class?.

I'm currently digging in MRI to see why exactly, but I have a few intuitions as to why.

That being said, we can probably combine both patches: first discard on singleton_class?, as it's very cheap, and then, as a second measure, discard on name.nil?, since we'll be accessing the name anyway.
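A sketch of that combined guard (hypothetical helper name, not the merged patch verbatim):

```ruby
# Cheap check first: singleton_class? is essentially a flag test, while
# Module#name has to consult (or search) the constants table. Checking
# name second also filters out anonymous classes and modules.
def relevant?(mod)
  return false if mod.singleton_class? # very cheap, discards the storm
  return false unless mod.name         # anonymous class or module
  true
end

tracer = TracePoint.new(:class) do |event|
  next unless relevant?(event.self)
  # ... actual work with event.self.name ...
end
```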

@fxn
Copy link
Owner

fxn commented Mar 18, 2019

Sounds good!

@casperisfine
Copy link
Contributor Author

Ok, so here's why Module#name is slow:

First, it instantiates a new string every time (already knew that): https://github.com/ruby/ruby/blob/28f619e301cf5f266f38fe764c4018b7fabc77cb/variable.c#L242-L250 Not the end of the world, but it means we definitely want to hold onto that name, otherwise we'll generate lots of useless allocations.

Then classname() is quite a big process: https://github.com/ruby/ruby/blob/28f619e301cf5f266f38fe764c4018b7fabc77cb/variable.c#L185-L233

Basically, the name of a Module is not one of its properties; every time you ask for it, MRI is going to look the module up in the constants table cache.

And if it's not found there, it's going to find it the hard way: https://github.com/ruby/ruby/blob/28f619e301cf5f266f38fe764c4018b7fabc77cb/variable.c#L151-L183

That probably explains why it's extra expensive for nameless singleton classes.
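A rough micro-benchmark illustrating the gap on a nameless singleton class (numbers will vary by machine and Ruby version; this is a sketch, not a rigorous measurement):

```ruby
require "benchmark"

sc = Object.new.singleton_class
n  = 10_000

Benchmark.bm(18) do |x|
  # flag check on the class object, essentially free
  x.report("singleton_class?") { n.times { sc.singleton_class? } }
  # no cached classpath, so MRI has to go looking every call
  x.report("name")             { n.times { sc.name } }
end
```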

@casperisfine
Copy link
Contributor Author

@fxn I updated the patch again.

@fxn
Copy link
Owner

fxn commented Mar 18, 2019

Just a couple of minor details:

MRI allocate

would be

MRI allocates

On the other hand, I think I'd prefer break instead of next to short-circuit. While both may result in the same behavior, I find break to be more intention-revealing here.

@casperisfine
Copy link
Contributor Author

@fxn fixed both!

@fxn
Copy link
Owner

fxn commented Mar 18, 2019

🙌

@fxn fxn merged commit a1c68e3 into fxn:master Mar 18, 2019
@casperisfine
Copy link
Contributor Author

Thanks for the fast merge ❤️

@casperisfine
Copy link
Contributor Author

Wait. I'm getting some:

LocalJumpError: break from proc-closure
  | /tmp/bundle/ruby/2.5.0/bundler/gems/zeitwerk-a1c68e3c8aa0/lib/zeitwerk/explicit_namespace.rb:62:in `block in <module:ExplicitNamespace>'

now, and I'm not sure what's going on.

@casperisfine
Copy link
Contributor Author

Never mind, I bet the test suite just never encounters these cases. I'll submit a second PR with two new test cases.

@fxn
Copy link
Owner

fxn commented Mar 19, 2019

I have just published 1.4.0. It's late here, I'll adapt the Rails integration soon.
