Skip to content

Fix potential non-determinism in SabreSwap rust code#10740

Merged
jakelishman merged 1 commit into
Qiskit:mainfrom
mtreinish:the-sabre-of-deterministic-iteration
Aug 31, 2023
Merged

Fix potential non-determinism in SabreSwap rust code#10740
jakelishman merged 1 commit into
Qiskit:mainfrom
mtreinish:the-sabre-of-deterministic-iteration

Conversation

@mtreinish
Copy link
Copy Markdown
Member

Summary

In the SabreSwap rust module there were 2 potential sources of non-determinism that could cause variation in results even with a fixed seed, the ExtendedSet struct's nodes attribute and the decremented tracking when populating the extended set. Both were caused by the same root cause iterating over a HashMap object. Iteration order on a HashMap (or a HashSet) is dependent on the internal hash seed of the hasher and iteration order is explicitly not guaranteed. In the case of the decrementer it's unlikely to have any effect even if the iteration order is not guaranteed but it is switched to using an indexmap regarldess just out of best practice. But for the nodes attribute in the ExtendedSet struct there is a potential issue there because when the total_score() method is called the nodes are iterated over, the distance is looked up for each swap and then summed. As the distances are all floating point values the iteration order could result in different values being output. In both cases the hashbrown::HashMap<K, V> is changed to be an indexmap::IndexMap<K, V, ahash::RandomState> which will have deterministic iteration order (it uses insertion order).

Details and comments

In the SabreSwap rust module there were 2 potential sources of
non-determinism that could cause variation in results even with a fixed
seed, the ExtendedSet struct's nodes attribute and the decremented
tracking when populating the extended set. Both were caused by the same
root cause iterating over a HashMap object. Iteration order on a HashMap
(or a HashSet) is dependent on the internal hash seed of the hasher and
iteration order is explicitly not guaranteed. In the case of the
decrementer it's unlikely to have any effect even if the iteration order
is not guaranteed but it is switched to using an indexmap regarldess
just out of best practice. But for the nodes attribute in the
ExtendedSet struct there is a potential issue there because when the
total_score() method is called the nodes are iterated over, the distance
is looked up for each swap and then summed. As the distances are all
floating point values the iteration order could result in different
values being output. In both cases the `hashbrown::HashMap<K, V>` is
changed to be an `indexmap::IndexMap<K, V, ahash::RandomState>` which
will have deterministic iteration order (it uses insertion order).
@mtreinish mtreinish added Changelog: None Do not include in the GitHub Release changelog. Rust This PR or issue is related to Rust code in the repository labels Aug 30, 2023
@mtreinish mtreinish requested a review from a team as a code owner August 30, 2023 17:00
@mtreinish mtreinish changed the title Fix non-determinism in SabreSwap rust code Fix potential non-determinism in SabreSwap rust code Aug 30, 2023
@qiskit-bot
Copy link
Copy Markdown
Collaborator

One or more of the the following people are requested to review this:

Copy link
Copy Markdown
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if this actually fixes the non-determinism in the Sabre trials we were seeing? The floating-point argument in ExtendedSet::total_score thing certainly sounds like something we should squash - I'm not sure why I didn't see it at the time.

At any rate, all the changes look fine and good to me.

@mtreinish
Copy link
Copy Markdown
Member Author

mtreinish commented Aug 30, 2023

It's still running now (I should have trimmed the data set) but looking at the streaming results it still looks like there are fluctuations in the output swap count as the trial counts increase. I was optimistically hoping it would be this simple, but it looks like we'll have to do more digging.

@mtreinish
Copy link
Copy Markdown
Member Author

mtreinish commented Aug 31, 2023

The floating-point argument in ExtendedSet::total_score thing certainly sounds like something we should squash - I'm not sure why I didn't see it at the time.

TBH, it's probably pretty unlikely to be an issue in practice as the distance matrix should typically be small whole numbers as generated by rustworkx's distance_matrix() is just the number of edges in the shortest path between 2 nodes. It's more a theoretical issue than a practical one unless we change how we weight edges in the connectivity graph like what #9904 does.

@jakelishman jakelishman added this pull request to the merge queue Aug 31, 2023
Merged via the queue into Qiskit:main with commit 48cfab6 Aug 31, 2023
@mtreinish mtreinish deleted the the-sabre-of-deterministic-iteration branch August 31, 2023 11:07
SameerD-phys pushed a commit to SameerD-phys/qiskit-terra that referenced this pull request Sep 7, 2023
In the SabreSwap rust module there were 2 potential sources of
non-determinism that could cause variation in results even with a fixed
seed, the ExtendedSet struct's nodes attribute and the decremented
tracking when populating the extended set. Both were caused by the same
root cause iterating over a HashMap object. Iteration order on a HashMap
(or a HashSet) is dependent on the internal hash seed of the hasher and
iteration order is explicitly not guaranteed. In the case of the
decrementer it's unlikely to have any effect even if the iteration order
is not guaranteed but it is switched to using an indexmap regarldess
just out of best practice. But for the nodes attribute in the
ExtendedSet struct there is a potential issue there because when the
total_score() method is called the nodes are iterated over, the distance
is looked up for each swap and then summed. As the distances are all
floating point values the iteration order could result in different
values being output. In both cases the `hashbrown::HashMap<K, V>` is
changed to be an `indexmap::IndexMap<K, V, ahash::RandomState>` which
will have deterministic iteration order (it uses insertion order).
ElePT pushed a commit to ElePT/qiskit that referenced this pull request Sep 26, 2023
In the SabreSwap rust module there were 2 potential sources of
non-determinism that could cause variation in results even with a fixed
seed, the ExtendedSet struct's nodes attribute and the decremented
tracking when populating the extended set. Both were caused by the same
root cause iterating over a HashMap object. Iteration order on a HashMap
(or a HashSet) is dependent on the internal hash seed of the hasher and
iteration order is explicitly not guaranteed. In the case of the
decrementer it's unlikely to have any effect even if the iteration order
is not guaranteed but it is switched to using an indexmap regarldess
just out of best practice. But for the nodes attribute in the
ExtendedSet struct there is a potential issue there because when the
total_score() method is called the nodes are iterated over, the distance
is looked up for each swap and then summed. As the distances are all
floating point values the iteration order could result in different
values being output. In both cases the `hashbrown::HashMap<K, V>` is
changed to be an `indexmap::IndexMap<K, V, ahash::RandomState>` which
will have deterministic iteration order (it uses insertion order).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Changelog: None Do not include in the GitHub Release changelog. Rust This PR or issue is related to Rust code in the repository

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants