Leverage retworkx for apply_operation* inner loop #5267

Merged
mergify[bot] merged 9 commits into Qiskit:master from mtreinish:retworkx-apply-operation-dagcircuit on Nov 25, 2020
Conversation

@mtreinish
Member

Summary

This commit migrates the inner loop for apply_operation_back and
apply_operation_front to leverage retworkx's
insert_node_between_multiple PyDiGraph method. This retworkx method
performs essentially the same loop internally but executes faster.

Details and comments

Depends on Qiskit/rustworkx#181 being included in
a release.

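For readers unfamiliar with the operation, here is a minimal pure-Python sketch of the "insert a node on all incoming edges" rewiring loop that the retworkx method performs natively. The adjacency-set mini-graph and the `insert_node_between` helper are hypothetical illustrations, not Qiskit's actual DAGCircuit internals or retworkx's API:

```python
# Sketch of the edge-rewiring loop that the retworkx method fuses into
# native code. The tiny adjacency-set "DAG" is hypothetical; it only
# illustrates the semantics, not Qiskit's DAGCircuit internals.

def insert_node_between(succs, preds, new_node, reference_node):
    """Insert new_node on every incoming edge of reference_node."""
    for pred in list(preds[reference_node]):
        # remove pred -> reference_node
        succs[pred].discard(reference_node)
        preds[reference_node].discard(pred)
        # add pred -> new_node -> reference_node
        succs[pred].add(new_node)
        preds.setdefault(new_node, set()).add(pred)
        succs.setdefault(new_node, set()).add(reference_node)
        preds[reference_node].add(new_node)

# toy graph: a -> out, b -> out; insert "op" before "out"
succs = {"a": {"out"}, "b": {"out"}, "out": set()}
preds = {"a": set(), "b": set(), "out": {"a", "b"}}
insert_node_between(succs, preds, "op", "out")
assert succs["a"] == {"op"} and succs["b"] == {"op"}
assert succs["op"] == {"out"} and preds["out"] == {"op"}
```

Doing this rewiring once per predecessor in Python is exactly the per-call overhead the PR moves into retworkx's Rust internals.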
@mtreinish mtreinish requested a review from a team as a code owner October 20, 2020 22:07
@mtreinish mtreinish added the "on hold (Can not fix yet)" and "performance" labels on Oct 20, 2020
@mtreinish
Member Author

Just to show some initial numbers on this: it shows a ~2x speedup for apply_operation_back() when running circuit_to_dag() on a random_circuit() with 65 qubits and a depth of 4096. For that test it goes from this on master:

[profiler screenshot: Screenshot_2020-10-21_15-06-32]

To this with this PR applied (and retworkx patched):

[profiler screenshot: Screenshot_2020-10-21_15-06-06]

The interesting thing is that after this PR, the bottlenecks for apply_operation_back are a three-way tie between: calling __repr__ on Bit objects, which cumulatively is ~15% of the time spent (I address this in #5272); the list comprehension to get the node_ids of the output nodes (at L391), which is ~16% of the function time; and calling _check_bits, which is ~15% of the run time (and spends most of its time in Bit.__eq__ and Register.__eq__).

I'm thinking that after this and #5272, we might be able to speed up the other two bottlenecks by changing Bit.__eq__ and Register.__eq__ to do a single repr comparison, since the reprs will be pre-generated and it reduces to a single string comparison. But we'll have to benchmark to see whether that makes a big difference.
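The idea above can be illustrated in pure Python. `BitLike` is a hypothetical stand-in for Qiskit's Bit class, not its real implementation: the repr is computed once at construction, and equality becomes one cached-string comparison:

```python
# Hypothetical stand-in for the Bit/Register optimization discussed
# above: precompute the repr string once, then make __eq__ and __hash__
# a single string comparison instead of field-by-field comparisons.

class BitLike:
    __slots__ = ("register", "index", "_repr")

    def __init__(self, register, index):
        self.register = register
        self.index = index
        self._repr = f"Bit({register!r}, {index})"  # cached once

    def __repr__(self):
        return self._repr  # no per-call string formatting

    def __eq__(self, other):
        if isinstance(other, BitLike):
            return self._repr == other._repr  # one string comparison
        return NotImplemented

    def __hash__(self):
        return hash(self._repr)

a = BitLike("q", 0)
b = BitLike("q", 0)
c = BitLike("q", 1)
assert a == b and a != c
```

Whether this actually wins depends on how the cached-repr comparison benchmarks against the existing field comparisons, as the comment notes.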

@mtreinish
Member Author

mtreinish commented Oct 21, 2020

I added the __eq__ change to use _repr in #5272, as mentioned in the last comment, and it continues the process of speeding up apply_operation_back. With this PR and #5272 together, on the previous example apply_operation_back() takes ~1.7 seconds cumulatively (down from ~4.6 sec on master):

[profiler screenshot: Screenshot_2020-10-21_16-39-53]

@mtreinish mtreinish removed the "on hold (Can not fix yet)" label on Nov 12, 2020
@mtreinish
Member Author

retworkx 0.6.0 was released this morning, so this is no longer blocked.

@mtreinish mtreinish added this to the 0.17 milestone Nov 12, 2020
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Nov 16, 2020
In sabre_swap the comprehension in _score_heuristic() with the basic
heuristic starts to become a bottleneck as the size of the coupling map
increases. This probably becomes the top bottleneck in a transpilation
after Qiskit#5316, Qiskit#5183, Qiskit#5294, Qiskit#5267, and Qiskit#5272 are merged. This commit is
an attempt to mitigate that by using native numpy operations instead of
a Python comprehension inside sum(). If this is insufficient, we'll
likely have to either avoid doing a sum like this or drop down to a
lower level with Cython or Rust and operate on the array directly to
remove this bottleneck.
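The change described in that commit message can be sketched as follows. The `dist` matrix and `pairs` array here are made-up toy inputs standing in for the coupling-map distance matrix and the layout-mapped qubit pairs being scored, not the real sabre_swap data:

```python
import numpy as np

# Made-up stand-ins for the coupling-map distance matrix and the
# layout-mapped (physical qubit, physical qubit) pairs being scored.
dist = np.arange(16, dtype=float).reshape(4, 4)
pairs = np.array([[0, 1], [2, 3], [1, 2]])

# The comprehension-in-sum() form the commit replaces: one Python-level
# indexing call per pair, summed in the interpreter.
slow = sum(dist[a, b] for a, b in pairs)

# The vectorized form: one fancy-indexing gather plus a native np.sum.
fast = np.sum(dist[pairs[:, 0], pairs[:, 1]])

assert slow == fast  # 18.0 for this toy input
```

The gather-then-sum form does the whole reduction in C, which is why it helps as the coupling map (and hence the number of pairs) grows.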
@mergify mergify[bot] merged commit 4e03ad1 into Qiskit:master on Nov 25, 2020
mergify[bot] added a commit that referenced this pull request on Nov 25, 2020
* Use numpy sum instead of comprehension in _score_heuristic

In sabre_swap the comprehension in _score_heuristic() with the basic
heuristic starts to become a bottleneck as the size of the coupling map
increases. This probably becomes the top bottleneck in a transpilation
after #5316, #5183, #5294, #5267, and #5272 are merged. This commit is
an attempt to mitigate that by using native numpy operations instead of
a Python comprehension inside sum(). If this is insufficient, we'll
likely have to either avoid doing a sum like this or drop down to a
lower level with Cython or Rust and operate on the array directly to
remove this bottleneck.

* Make distance matrix a coupling map property

* Fix tests

* Fix lint

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
@mtreinish mtreinish deleted the retworkx-apply-operation-dagcircuit branch November 26, 2020 15:52
@kdk kdk added the "Changelog: None (Do not include in the GitHub Release changelog.)" label on Mar 31, 2021