Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use scope tree depths to speed up nearest_common_ancestor. #51394

Merged
merged 1 commit into from
Jun 9, 2018

Conversation

nnethercote
Copy link
Contributor

This patch adds depth markings to all entries in the ScopeTree's
parent_map. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:

  • If one scope has a greater depth than the other, the deeper scope is
    moved upward until they are at equal depths.

  • Then we move the two scopes upward in lockstep until they match.

This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.

Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.

This change speeds up runs of several rust-perf benchmarks, the best by
6%.

A selection of the bigger improvements:

clap-rs-check
        avg: -2.6%      min: -6.6%      max: 0.0%
syn-check
        avg: -2.2%      min: -5.0%      max: 0.0%
style-servo-check
        avg: -2.9%?     min: -4.8%?     max: 0.0%?
cargo-check
        avg: -1.3%      min: -2.8%      max: 0.0%
sentry-cli-check
        avg: -1.0%      min: -2.1%      max: 0.0%
webrender-check
        avg: -0.9%      min: -2.0%      max: 0.0%
style-servo
        avg: -0.9%?     min: -1.8%?     max: -0.0%?
ripgrep-check
        avg: -0.7%      min: -1.8%      max: 0.1%
clap-rs
        avg: -0.9%      min: -1.6%      max: -0.2%
regex-check
        avg: -0.2%      min: -1.3%      max: 0.1%
syn
        avg: -0.6%      min: -1.3%      max: 0.1%
hyper-check
        avg: -0.5%      min: -1.1%      max: 0.0%

The idea came from multiple commenters on my blog and on Reddit. Thank you!

r? @nikomatsakis

This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:

- If one scope has a greater depth than the other, the deeper scope is
  moved upward until they are at equal depths.

- Then we move the two scopes upward in lockstep until they match.

This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.

Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.

This change speeds up runs of several rust-perf benchmarks, the best by
6%.
@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 6, 2018
@nikomatsakis
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented Jun 6, 2018

📌 Commit 5c36e01 has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 6, 2018
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 6, 2018
Use scope tree depths to speed up `nearest_common_ancestor`.

This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:

- If one scope has a greater depth than the other, the deeper scope is
  moved upward until they are at equal depths.

- Then we move the two scopes upward in lockstep until they match.

This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.

Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.

This change speeds up runs of several rust-perf benchmarks, the best by
6%.

A selection of the bigger improvements:
```
clap-rs-check
        avg: -2.6%      min: -6.6%      max: 0.0%
syn-check
        avg: -2.2%      min: -5.0%      max: 0.0%
style-servo-check
        avg: -2.9%?     min: -4.8%?     max: 0.0%?
cargo-check
        avg: -1.3%      min: -2.8%      max: 0.0%
sentry-cli-check
        avg: -1.0%      min: -2.1%      max: 0.0%
webrender-check
        avg: -0.9%      min: -2.0%      max: 0.0%
style-servo
        avg: -0.9%?     min: -1.8%?     max: -0.0%?
ripgrep-check
        avg: -0.7%      min: -1.8%      max: 0.1%
clap-rs
        avg: -0.9%      min: -1.6%      max: -0.2%
regex-check
        avg: -0.2%      min: -1.3%      max: 0.1%
syn
        avg: -0.6%      min: -1.3%      max: 0.1%
hyper-check
        avg: -0.5%      min: -1.1%      max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!

r? @nikomatsakis
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 7, 2018
Use scope tree depths to speed up `nearest_common_ancestor`.

This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:

- If one scope has a greater depth than the other, the deeper scope is
  moved upward until they are at equal depths.

- Then we move the two scopes upward in lockstep until they match.

This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.

Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.

This change speeds up runs of several rust-perf benchmarks, the best by
6%.

A selection of the bigger improvements:
```
clap-rs-check
        avg: -2.6%      min: -6.6%      max: 0.0%
syn-check
        avg: -2.2%      min: -5.0%      max: 0.0%
style-servo-check
        avg: -2.9%?     min: -4.8%?     max: 0.0%?
cargo-check
        avg: -1.3%      min: -2.8%      max: 0.0%
sentry-cli-check
        avg: -1.0%      min: -2.1%      max: 0.0%
webrender-check
        avg: -0.9%      min: -2.0%      max: 0.0%
style-servo
        avg: -0.9%?     min: -1.8%?     max: -0.0%?
ripgrep-check
        avg: -0.7%      min: -1.8%      max: 0.1%
clap-rs
        avg: -0.9%      min: -1.6%      max: -0.2%
regex-check
        avg: -0.2%      min: -1.3%      max: 0.1%
syn
        avg: -0.6%      min: -1.3%      max: 0.1%
hyper-check
        avg: -0.5%      min: -1.1%      max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!

r? @nikomatsakis
@bors
Copy link
Contributor

bors commented Jun 7, 2018

⌛ Testing commit 5c36e01 with merge a9eb1eb...

bors added a commit that referenced this pull request Jun 7, 2018
Use scope tree depths to speed up `nearest_common_ancestor`.

This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:

- If one scope has a greater depth than the other, the deeper scope is
  moved upward until they are at equal depths.

- Then we move the two scopes upward in lockstep until they match.

This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.

Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.

This change speeds up runs of several rust-perf benchmarks, the best by
6%.

A selection of the bigger improvements:
```
clap-rs-check
        avg: -2.6%      min: -6.6%      max: 0.0%
syn-check
        avg: -2.2%      min: -5.0%      max: 0.0%
style-servo-check
        avg: -2.9%?     min: -4.8%?     max: 0.0%?
cargo-check
        avg: -1.3%      min: -2.8%      max: 0.0%
sentry-cli-check
        avg: -1.0%      min: -2.1%      max: 0.0%
webrender-check
        avg: -0.9%      min: -2.0%      max: 0.0%
style-servo
        avg: -0.9%?     min: -1.8%?     max: -0.0%?
ripgrep-check
        avg: -0.7%      min: -1.8%      max: 0.1%
clap-rs
        avg: -0.9%      min: -1.6%      max: -0.2%
regex-check
        avg: -0.2%      min: -1.3%      max: 0.1%
syn
        avg: -0.6%      min: -1.3%      max: 0.1%
hyper-check
        avg: -0.5%      min: -1.1%      max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!

r? @nikomatsakis
@bors
Copy link
Contributor

bors commented Jun 7, 2018

💔 Test failed - status-travis

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jun 7, 2018
@rust-highfive
Copy link
Collaborator

The job dist-x86_64-freebsd of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
Preparing deploy
travis_fold:end:dpl.2
travis_fold:start:dpl.3
Deploying application
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rustfmt-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-analysis-nightly-x86_64-unknown-freebsd.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-std-nightly-x86_64-unknown-freebsd.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-src-nightly.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-analysis-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rustc-nightly-x86_64-unknown-freebsd.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rustfmt-nightly-x86_64-unknown-freebsd.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-nightly-x86_64-unknown-freebsd.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-src-nightly.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/cargo-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/cargo-nightly-x86_64-unknown-freebsd.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rustc-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-std-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
/home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/multipart_file_uploader.rb:81:in `abort_upload': multipart upload failed: execution expired (Aws::S3::MultipartUploadError)
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/multipart_file_uploader.rb:70:in `upload_parts'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/multipart_file_uploader.rb:44:in `upload'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/file_uploader.rb:32:in `upload'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/object.rb:252:in `upload_file'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/dpl-s3-1.9.7/lib/dpl/provider/s3.rb:99:in `block (2 levels) in upload_multithreaded'
failed to deploy

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive
Copy link
Collaborator

The job dist-x86_64-freebsd of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/cargo-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/cargo-nightly-x86_64-unknown-freebsd.tar.gz" with {:content_type=>"application/gzip", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rustc-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
uploading "a9eb1eb00a7d566d95c9d323cc6f07068824cc9b/rust-std-nightly-x86_64-unknown-freebsd.tar.xz" with {:content_type=>"application/x-xz", :acl=>"public-read"}
/home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/multipart_file_uploader.rb:81:in `abort_upload': multipart upload failed: execution expired (Aws::S3::MultipartUploadError)
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/multipart_file_uploader.rb:70:in `upload_parts'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/multipart_file_uploader.rb:44:in `upload'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/file_uploader.rb:32:in `upload'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/aws-sdk-resources-2.11.63/lib/aws-sdk-resources/services/s3/object.rb:252:in `upload_file'
 from /home/travis/.rvm/gems/ruby-2.2.7/gems/dpl-s3-1.9.7/lib/dpl/provider/s3.rb:99:in `block (2 levels) in upload_multithreaded'
failed to deploy

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@nnethercote
Copy link
Contributor Author

Looks like an infra failure.

@Mark-Simulacrum
Copy link
Member

@bors retry - network failure

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 7, 2018
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Jun 8, 2018
Use scope tree depths to speed up `nearest_common_ancestor`.

This patch adds depth markings to all entries in the `ScopeTree`'s
`parent_map`. This change increases memory usage somewhat, but permits a
much faster algorithm to be used:

- If one scope has a greater depth than the other, the deeper scope is
  moved upward until they are at equal depths.

- Then we move the two scopes upward in lockstep until they match.

This avoids the need to keep track of which scopes have already been
seen, which was the major part of the cost of the old algorithm. It also
reduces the number of child-to-parent moves (which are hash table
lookups) when the scopes start at different levels, because it never
goes past the nearest common ancestor the way the old algorithm did.

Finally, the case where one of the scopes is the root is now handled in
advance, because that is moderately common and lets us skip everything.

This change speeds up runs of several rust-perf benchmarks, the best by
6%.

A selection of the bigger improvements:
```
clap-rs-check
        avg: -2.6%      min: -6.6%      max: 0.0%
syn-check
        avg: -2.2%      min: -5.0%      max: 0.0%
style-servo-check
        avg: -2.9%?     min: -4.8%?     max: 0.0%?
cargo-check
        avg: -1.3%      min: -2.8%      max: 0.0%
sentry-cli-check
        avg: -1.0%      min: -2.1%      max: 0.0%
webrender-check
        avg: -0.9%      min: -2.0%      max: 0.0%
style-servo
        avg: -0.9%?     min: -1.8%?     max: -0.0%?
ripgrep-check
        avg: -0.7%      min: -1.8%      max: 0.1%
clap-rs
        avg: -0.9%      min: -1.6%      max: -0.2%
regex-check
        avg: -0.2%      min: -1.3%      max: 0.1%
syn
        avg: -0.6%      min: -1.3%      max: 0.1%
hyper-check
        avg: -0.5%      min: -1.1%      max: 0.0%
```
The idea came from multiple commenters on my blog and on Reddit. Thank you!

r? @nikomatsakis
bors added a commit that referenced this pull request Jun 8, 2018
Rollup of 13 pull requests

Successful merges:

 - #50143 (Add deprecation lint for duplicated `macro_export`s)
 - #51099 (Fix Issue 38777)
 - #51276 (Dedup auto traits in trait objects.)
 - #51298 (Stabilize unit tests with non-`()` return type)
 - #51360 (Suggest parentheses when a struct literal needs them)
 - #51391 (Use spans pointing at the inside of a rustdoc attribute)
 - #51394 (Use scope tree depths to speed up `nearest_common_ancestor`.)
 - #51396 (Make the size of Option<NonZero*> a documented guarantee.)
 - #51401 (Warn on `repr` without hints)
 - #51412 (Avoid useless Vec clones in pending_obligations().)
 - #51427 (compiletest: autoremove duplicate .nll.* files (#51204))
 - #51436 (Do not require stage 2 compiler for rustdoc)
 - #51437 (rustbuild: generate full list of dependencies for metadata)

Failed merges:
@bors bors merged commit 5c36e01 into rust-lang:master Jun 9, 2018
@nnethercote nnethercote deleted the NCA-depths branch June 9, 2018 02:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants