Skip to content
This repository was archived by the owner on Aug 19, 2023. It is now read-only.

Add benchmarks for missing passes#1346

Merged
kdk merged 6 commits into
Qiskit:masterfrom
mtreinish:new-passes
Oct 18, 2021
Merged

Add benchmarks for missing passes#1346
kdk merged 6 commits into
Qiskit:masterfrom
mtreinish:new-passes

Conversation

@mtreinish
Copy link
Copy Markdown
Member

Summary

This commit adds runtime benchmarks to the passes and mapping_passes
benchmark modules to measure additional transpiler passes in isolation
that we didn't have coverage for previously. Primarily of importance
here are the basis translator and the optimize 1q decomposition passes
which are in the preset pass managers but didn't have standalone
benchmark coverage. One pass excluded here is the TemplateOptimization
pass which would have been good to add to the benchmark but it is
exceedingly slow and took well more than 5 min per iteration.

Details and comments

This commit adds runtime benchmarks to the passes and mapping_passes
benchmark modules to measure additional transpiler passes in isolation
that we didn't have coverage for previously. Primarily of importance
here are the basis translator and the optimize 1q decomposition passes
which are in the preset pass managers but didn't have standalone
benchmark coverage. One pass excluded here is the TemplateOptimization
pass which would have been good to add to the benchmark but it is
exceedingly slow and took well more than 5 min per iteration.
Comment thread test/benchmarks/mapping_passes.py Outdated
@mtreinish
Copy link
Copy Markdown
Member Author

I think we're going to have to cut back on the basis translator benchmark, it's too slow. It's been running for ~23min and still hasn't finished the first half (the cache warming unmeasured execution)

@mtreinish
Copy link
Copy Markdown
Member Author

Ok, I can just drop the non ibm basis from the benchmark for now to get this running nightly:

[100.00%] ··· ...pleBasisPassBenchmarks.time_basis_translator         6/9 failed
[100.00%] ··· ========== ======= ================================ =========
               n_qubits   depth            basis_gates                     
              ---------- ------- -------------------------------- ---------
                  5        1024            ['u', 'cx']              failed 
                  5        1024   ['rx', 'ry', 'rz', 'r', 'rxx']    failed 
                  5        1024   ['rz', 'x', 'sx', 'cx', 'id']    329±3ms 
                  14       1024            ['u', 'cx']              failed 
                  14       1024   ['rx', 'ry', 'rz', 'r', 'rxx']    failed 
                  14       1024   ['rz', 'x', 'sx', 'cx', 'id']    839±3ms 
                  20       1024            ['u', 'cx']              failed 
                  20       1024   ['rx', 'ry', 'rz', 'r', 'rxx']    failed 
                  20       1024   ['rz', 'x', 'sx', 'cx', 'id']    1.12±0s 
              ========== ======= ================================ =========

But, that is a huge disparity because the timeout is set to 300 sec and all the failures are timeouts. @kdk can you think of some reason it would be so much slower for those basis sets?

@mtreinish
Copy link
Copy Markdown
Member Author

@georgios-ts suggested the issue with the timeouts might be because the those basis were missing id. Testing locally adding id fixes this and the benchmark runs quite quickly:

[100.00%] ··· ...pleBasisPassBenchmarks.time_basis_translator                 ok
[100.00%] ··· ========== ======= ====================================== ============
               n_qubits   depth               basis_gates                           
              ---------- ------- -------------------------------------- ------------
                  5        1024            ['u', 'cx', 'id']             263±0.8ms  
                  5        1024   ['rx', 'ry', 'rz', 'r', 'rxx', 'id']    617±2ms   
                  5        1024      ['rz', 'x', 'sx', 'cx', 'id']       362±0.6ms  
                  14       1024            ['u', 'cx', 'id']              649±2ms   
                  14       1024   ['rx', 'ry', 'rz', 'r', 'rxx', 'id']    1.30±0s   
                  14       1024      ['rz', 'x', 'sx', 'cx', 'id']        941±2ms   
                  20       1024            ['u', 'cx', 'id']              908±10ms  
                  20       1024   ['rx', 'ry', 'rz', 'r', 'rxx', 'id']   1.73±0.01s 
                  20       1024      ['rz', 'x', 'sx', 'cx', 'id']       1.31±0.01s 
              ========== ======= ====================================== ============

I think this is a bug in the basis translator or the equivalence lib because it should be able to find an equivalent with the parameterized 1q gates for an id pretty easily.

mtreinish referenced this pull request in mtreinish/qiskit-core Oct 15, 2021
This commit adds a missing rules to the equivalence library for an
identity gate. Without this the basis translator has difficulty finding
a match if the basis does not include an id gate (see
Qiskit#1346)
@mtreinish
Copy link
Copy Markdown
Member Author

Qiskit/qiskit#7146 should fix the timeouts with the basis translator benchmarks

Copy link
Copy Markdown
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sensible to me.

Comment thread test/benchmarks/passes.py
# pylint: disable=no-member,invalid-name,missing-docstring,no-name-in-module
# pylint: disable=attribute-defined-outside-init,unsubscriptable-object
# pylint: disable=unused-wildcard-import,wildcard-import
# pylint: disable=unused-wildcard-import,wildcard-import,undefined-variable
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's pylint complaining about that requires this? By name, at least, it sounds like something we shouldn't be ignoring.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because pylint runs with the latest terra release and there are passes I'm adding benchmarks for here that are only on main. So pylint was complaining about an undefined variable for those passes.

Honestly I should just turn pylint off for the benchmarks and use flake8 and black only since it is serving a different use case than the other things being linted in this repo

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. Yeah, given the huge number of exclusions here, stopping pylint on these files is probably a good idea.

Comment thread test/benchmarks/passes.py Outdated
@kdk
Copy link
Copy Markdown
Member

kdk commented Oct 15, 2021

@georgios-ts suggested the issue with the timeouts might be because the those basis were missing id. Testing locally adding id fixes this and the benchmark runs quite quickly:

Agree the timeouts are likely an instance of Qiskit/qiskit#5539 .

I think this is a bug in the basis translator or the equivalence lib because it should be able to find an equivalent with the parameterized 1q gates for an id pretty easily.

I think this is a little more complicated than it appears at first glance because IBM hardware has a non-standard (though deprecated) definition of id (see Qiskit/qiskit#6627 and Qiskit/qiskit#6460 ). These rules had been intentionally left out to avoid a case where the transpiler might inadvertently replace an id with the "equivalent" u3 (which would be twice as long).

I agree these are logically equivalences the library should have, but perhaps after 'id' is removed from IBM's basis gates and Qiskit's definition of IGate is reconsidered (e.g. from https://qiskit.org/documentation/stubs/qiskit.circuit.library.IGate.html#qiskit.circuit.library.IGate

Identity gate corresponds to a single-qubit gate wait cycle, and should not be optimized or unrolled (it is an opaque gate).

)

@mtreinish
Copy link
Copy Markdown
Member Author

mtreinish commented Oct 15, 2021

I agree these are logically equivalences the library should have, but perhaps after 'id' is removed from IBM's basis gates and Qiskit's definition of IGate is reconsidered (e.g. from https://qiskit.org/documentation/stubs/qiskit.circuit.library.IGate.html#qiskit.circuit.library.IGate

Identity gate corresponds to a single-qubit gate wait cycle, and should not be optimized or unrolled (it is an opaque gate).

)

Why does this matter for Qiskit/qiskit#7146 though? If IBMQ backends are using IGate implicitly as a single cycle delay that's unfortunate complexity but fine, it's in the basis set so the basis translator won't do anything with it. Where this would cause issues is if IBM backend's didn't report id as a basis gate then that relationship would be broken. I'm just not sure I see how adding id == u(0,0,0) to the equivalence lib will break the weird special case that the ibmq provider is trying to support since when would that rule be triggered the basis translator should ignore id as long as it's in the basis. I assume there is test coverage for this in terra because Qiskit/qiskit#7146 passes all the tests.

Co-authored-by: Jake Lishman <jake@binhbar.com>
@kdk
Copy link
Copy Markdown
Member

kdk commented Oct 18, 2021

Where this would cause issues is if IBM backend's didn't report id as a basis gate then that relationship would be broken.

I think this is exactly it. Following Qiskit/qiskit-ibmq-provider#976 (which currently warns on id usage and replaces with the equivalent delay), the IBM backends will at some point remove id as a basis_gate, at which the transpiler should fail with an unreachable basis or unknown instruction (notwithstanding Qiskit/qiskit#5539 ) rather than implement delays that are twice as long.

equivalence lib will break the weird special case that the ibmq provider is trying to support since

Agree that this is weird and non-intuitive behavior, but it is not at the moment IBMQ provider specific. Qiskit defines IdGate as delay(durationof(SXGate()) and not UnitaryGate([[1, 0], [0, 1]]) (which I think is what most would expect, given the name). Agree this should change in the future, but without an inconsistent intermediate state.

Copy link
Copy Markdown
Member

@kdk kdk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment, otherwise LGTM.

Comment thread test/benchmarks/passes.py Outdated
Co-authored-by: Kevin Krsulich <kevin@krsulich.net>
@kdk kdk merged commit 9246382 into Qiskit:master Oct 18, 2021
@mtreinish mtreinish deleted the new-passes branch October 18, 2021 22:59
jakelishman added a commit to jakelishman/qiskit-terra that referenced this pull request Aug 1, 2023
* Add benchmarks for missing passes

This commit adds runtime benchmarks to the passes and mapping_passes
benchmark modules to measure additional transpiler passes in isolation
that we didn't have coverage for previously. Primarily of importance
here are the basis translator and the optimize 1q decomposition passes
which are in the preset pass managers but didn't have standalone
benchmark coverage. One pass excluded here is the TemplateOptimization
pass which would have been good to add to the benchmark but it is
exceedingly slow and took well more than 5 min per iteration.

* Update test/benchmarks/mapping_passes.py

* Add collect multi q block benchmark

* Update test/benchmarks/passes.py

Co-authored-by: Jake Lishman <jake@binhbar.com>

* Update test/benchmarks/passes.py

Co-authored-by: Kevin Krsulich <kevin@krsulich.net>

Co-authored-by: Jake Lishman <jake@binhbar.com>
Co-authored-by: Kevin Krsulich <kevin@krsulich.net>
jakelishman added a commit to jakelishman/qiskit-terra that referenced this pull request Aug 11, 2023
* Add benchmarks for missing passes

This commit adds runtime benchmarks to the passes and mapping_passes
benchmark modules to measure additional transpiler passes in isolation
that we didn't have coverage for previously. Primarily of importance
here are the basis translator and the optimize 1q decomposition passes
which are in the preset pass managers but didn't have standalone
benchmark coverage. One pass excluded here is the TemplateOptimization
pass which would have been good to add to the benchmark but it is
exceedingly slow and took well more than 5 min per iteration.

* Update test/benchmarks/mapping_passes.py

* Add collect multi q block benchmark

* Update test/benchmarks/passes.py

Co-authored-by: Jake Lishman <jake@binhbar.com>

* Update test/benchmarks/passes.py

Co-authored-by: Kevin Krsulich <kevin@krsulich.net>

Co-authored-by: Jake Lishman <jake@binhbar.com>
Co-authored-by: Kevin Krsulich <kevin@krsulich.net>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants