Skip to content

Conversation

@dcreager
Copy link
Member

@dcreager dcreager commented Jan 24, 2025

This is a follow-up to #15702 that hopefully claws back the 1% performance regression. Assuming it works, the trick is to iterate over the constraints vectors via mut reference (aka a single pointer), so that we're not copying BitSets into and out of the zip tuples as we iterate. We use std::mem::take as a poor-man's move constructor only at the very end, when we're ready to emplace it into the result. (C++ idioms intended! 😄)

With local testing via hyperfine, I'm seeing this be 1-3% faster than main most of the time — though a small number of runs (1 in 10, maybe?) are a wash or have main faster. Fingers crossed to see what codspeed says!

@dcreager
Copy link
Member Author

Fingers crossed to see what codspeed says!

2% faster according to codspeed! 🎉 🌮

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

// fields are sets/vecs with the same length.
let a = (a.live_bindings.iter())
.zip(a.constraints)
.zip(a.constraints.iter_mut())
Copy link
Member

@MichaReiser MichaReiser Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here would probably be useful. It wasn't immediately clear to me why we use a &mut here even with your explanation. Should we do the same in the other merge method and for `visibility_constraints?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it won't be as beneficial for visibility_constraints, since those are already smaller. But I can check! (And I will add the comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep as expected, it was a wash doing the same thing to visibility_constraints. (Those are 32-bit IDs, which are cheap to copy, and not 24-byte BitSets.)

@dcreager dcreager merged commit 5a9d71a into main Jan 24, 2025
21 checks passed
@dcreager dcreager deleted the dcreager/merge-speedup branch January 24, 2025 21:07
@carljm
Copy link
Contributor

carljm commented Jan 24, 2025

Amazing, thank you!!

dcreager added a commit that referenced this pull request Jan 27, 2025
* main:
  Run `cargo update` (#15769)
  [red-knot] Document public symbol type inferece (#15766)
  Update dawidd6/action-download-artifact action to v8 (#15760)
  Update NPM Development dependencies (#15758)
  Update pre-commit dependencies (#15756)
  Update dependency ruff to v0.9.3 (#15755)
  Update dependency mdformat-mkdocs to v4.1.2 (#15754)
  Update Rust crate uuid to v1.12.1 (#15753)
  Update Rust crate unicode-ident to v1.0.15 (#15752)
  Fix docstring in ruff_annotate_snippets (#15748)
  Update Rust crate insta to v1.42.1 (#15751)
  Update Rust crate clap to v4.5.27 (#15750)
  Add references to `trio.run_process` and `anyio.run_process` (#15761)
  [`ruff`] Do not emit diagnostic when all arguments to `zip()` are variadic (`RUF058`) (#15744)
  [red-knot] Ensure differently ordered unions are considered equivalent when they appear inside tuples inside top-level intersections (#15743)
  [red-knot] Ensure differently ordered unions and intersections are understood as equivalent even inside arbitrarily nested tuples (#15740)
  [red-knot] Promote the `all_type_pairs_are_assignable_to_their_union` property test to stable (#15739)
  [`pylint`] Do not trigger `PLR6201` on empty collections (#15732)
  Improve the file watching failure error message (#15728)
  Speed symbol state merging back up (#15731)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants