Update Sabre extended set on-the-fly#14912
Conversation
|
One or more of the following people are relevant to this code:
|
Pull Request Test Coverage Report for Build 16969145900Details
💛 - Coveralls |
2f15818 to
3e422ca
Compare
3e422ca to
6db435a
Compare
The previous Sabre extended set was just the "next N" 2q gates topologically on from the front layer, where Qiskit reliably used `N = 20` ever since its introduction. For small-width circuits (as were common when the original Sabre paper was written, and when it was first implemented in Qiskit), this could mean the extended set was reliably several layers deep. This could also be the case for star-like circuits. For the wider circuits in use now, at the 100q order of magnitude, the 20-gate limit reliably means that denser circuits cannot have their entire next layer considered by the lookahead set. This commit modifies the lookahead heuristic to be based specifically on layers. This regularises much of the structure of the heuristic with respect to circuit and target topology; we reliably "look ahead" by the same "distance" as far as routing is concerned. It comes with the additional benefits: - we can use the same `Layer` structure for both the front layer and the lookahead layers, which reduces the amount of scoring code - the lookahead score of a swap can now affect at most two gates per layer, just like the front-layer scoring, and we can do this statically without loops - we no longer risk "biasing" the lookahead heuristic in either case of long chains of dependent gates (e.g. a gate that has 10 predecessors weights the score the same as a gate with only 1) or wide circuits (some qubits have their next layer counted in the score, but others don't because the extended set reached capacity). - applying a swap to the lookahead now has a time complexity that is constant per layer, regardless of the number of gates stored in it, whereas previously it was proportional to the number of gates stored (and the implementation in the parent of this commit is proportional to the number of qubits in the circuit). This change alone is mostly a set up, which enables further computational complexity improvements by modifying the lookahead layers in place after a gate routes, rather than rebuilding them from scratch, and subsequently only updating _swap scores_ based on routing changes, rather than recalculating all from scratch.
Ever since Sabre was first introduced to Qiskit (in [1]), the extended set has been recalculated from the front layer, every time the front layer is updated. Now that the extended set is tracked as a series of layers, it is easily possible to keep the layer structure updated in the same way as the front layer, by tracking separate points in the topological iteration through the interaction graph. To speed up synchronisation of the layers during updates, a new outer tracker, `Layers`, now means that a removal can jump immediately to the correct layer, without having to test multiples. This outer structure co-operates with the individual layers to effect faster lookups and removals; rather than each layer storing its own `IndexMap`, they instead store only a `Vec`, and the outer `Layers` tracks _where_ in the `Vec` any given gate is. On removal, the layer returns which, if any, gate has moved to occupy the otherwise empty slot (since removals are by swap, not shift). This avoids all hash lookups in the layer structures; everything is simple contiguous indexed access. With the tracking done elsewhere, and with `VecMap` providing type-safe indexing, each individual `Layer` now stores its contained gates in terms of `[VirtualQubit; 2]`; this minimises the amount of data that needs to be modified on a swap, at the cost that actions that need to associate actual gates with the current physical qubits (like iterating over active swaps) now need access to the `NLayout`, rather than the layout being (implicitly) tracked by each `Layer` via `apply_swap`. This is a performance improvement; as more overhead is reduced, the cost of the hash lookups in swap application was ever more visible. [1]: fab61d2: Sabre layout and routing transpiler passes (Qiskitgh-4537)
6db435a to
38d0b74
Compare
alexanderivrii
left a comment
There was a problem hiding this comment.
So far I have taken a deeper look at layer.rs and only a superficial look at route.rs. The changes to the data structures in layer.rs are really cool, do you have any numbers on how much this improves performance?
Quick question: based on the PR summary, I expected most changes to be in layer.rs, could you clarify what is going on in routing.rs?
| /// physicals[0] == layer.other[physicals[1]]; | ||
| /// physicals[1] == layer.other[physicals[2]]; |
There was a problem hiding this comment.
| /// physicals[0] == layer.other[physicals[1]]; | |
| /// physicals[1] == layer.other[physicals[2]]; | |
| /// physicals[0] == layer.other[physicals[1]]; | |
| /// physicals[1] == layer.other[physicals[0]]; |
| /// The qubits that each stored gate in the layer is active on. For any given pair, all three | ||
| /// equalities hold: |
There was a problem hiding this comment.
That's "for any given pair in gates", right?
| /// Remove a node from the layer, given by position into the tracking vector. | ||
| /// | ||
| /// Returns the node that is now in this location in the tracking vector, if any. | ||
| #[must_use] |
There was a problem hiding this comment.
I did not know about the #[must_use] attribute before. Using it here means that something must be additional updated in the upstream code?
| if let Some(Location { layer, position }) = self.locations[node] { | ||
| if let Some(moved) = self.layers[layer as usize].remove(position, layout) { | ||
| self.locations[moved] = Some(Location { layer, position }); | ||
| } | ||
| } |
There was a problem hiding this comment.
Ok, I see why you made Layer::remove as a must_use.
Ever since Sabre was first introduced to Qiskit (in [1]), the extended set has been recalculated from the front layer, every time the front layer is updated. Now that the extended set is tracked as a series of layers, it is easily possible to keep the layer structure updated in the same way as the front layer, by tracking separate points in the topological iteration through the interaction graph.
To speed up synchronisation of the layers during updates, a new outer tracker,
Layers, now means that a removal can jump immediately to the correct layer, without having to test multiples. This outer structure co-operates with the individual layers to effect faster lookups and removals; rather than each layer storing its ownIndexMap, they instead store only aVec, and the outerLayerstracks where in theVecany given gate is. On removal, the layer returns which, if any, gate has moved to occupy the otherwise empty slot (since removals are by swap, not shift). This avoids all hash lookups in the layer structures; everything is simple contiguous indexed access.With the tracking done elsewhere, and with
VecMapproviding type-safe indexing, each individualLayernow stores its contained gates in terms of[VirtualQubit; 2]; this minimises the amount of data that needs to be modified on a swap, at the cost that actions that need to associate actual gates with the current physical qubits (like iterating over active swaps) now need access to theNLayout, rather than the layout being (implicitly) tracked by eachLayerviaapply_swap. This is a performance improvement; as more overhead is reduced, the cost of the hash lookups in swap application was ever more visible.[1]: fab61d2: Sabre layout and routing transpiler passes (gh-4537)