Fixed error in plot_histogram when number_to_keep is smaller that the number of keys by iuliazidaru · Pull Request #7481 · Qiskit/qiskit

iuliazidaru · 2022-01-05T15:01:18Z

Summary

Fix issue #7461 plot_histogram fails when given number_to_keep parameter
Fixes also related issue #6692, as it affects the same code.

Details and comments

When number_to_keep parameter is used and has a value less than the number of keys, the plot didn't work.
This PR considers also the case of displaying multiple result lists.

coveralls · 2022-01-05T15:36:35Z

Pull Request Test Coverage Report for Build 2454345820

18 of 20 (90.0%) changed or added relevant lines in 1 file are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.01%) to 84.415%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/visualization/counts_visualization.py	18	20	90.0%

Totals
Change from base Build 2454202826:	0.01%
Covered Lines:	54669
Relevant Lines:	64762

💛 - Coveralls

drewrisinger

Fix spelling errors
Clarify release note fixed code example
Add test to validate results from _plot_histogram_data()

releasenotes/notes/fix_plot_histogram_number-a0a4a023dfad3c70.yaml

qiskit/visualization/counts_visualization.py

releasenotes/notes/fix_plot_histogram_number-a0a4a023dfad3c70.yaml

test/python/visualization/test_plot_histogram.py

drewrisinger · 2022-01-05T16:56:45Z

See also implementation in #6698. That one (mine) stalled out b/c lack of review, and failing docs build on CI (master branch) at the time. Your implementation is cleaner, but that one includes a test against a reference image to check the bar alignment.

The plot_histogram tests are also missing skipUnless(HAS_MATPLOTLIB) (see e.g. https://github.com/Qiskit/qiskit-terra/blob/309292946906843f3c535c0a4465de534cf2ac69/test/python/visualization/test_circuit_drawer.py#L50-L57), should be straightforward to add to the pre-existing test as well as yours.

iuliazidaru · 2022-01-07T11:05:32Z

See also implementation in #6698. T
@drewrisinger What was the problem with the build? I think I have the same issue now and I don't know how to fix it...

drewrisinger · 2022-01-07T15:18:35Z

What was the problem with the build?

Several months ago, something with CI & the docs build wasn't working, and it was making the CI show as failing. I'm fairly sure the issue was also present on the main/master branch (according to the issues at the time, I don't remember which one). I don't remember anything more than that, it's been a while. The CI log should still be present on that PR if you want to look at it for comparison.

… into issue7461

iuliazidaru · 2022-01-14T11:29:20Z

@drewrisinger Thank you for your comments. Could you check again. I added a test for the multiple datasets issue.

drewrisinger

Diff LGTM, thanks for the hard work :)

Passes CI tests ✔️

drewrisinger · 2022-01-14T15:16:24Z

test/python/visualization/test_plot_histogram.py

+        with BytesIO() as img_buffer_ref:
+            figure_ref.savefig(img_buffer_ref, format="png")
+            img_buffer_ref.seek(0)


Someone from the qiskit team will need to check how saving reference images is supposed to work (is it supposed to be saved to a file?). I didn't figure that out on my previous pass at roughly this issue.

Looks good otherwise, took me a second to figure out what this test is actually doing but looks good now.

Yeah.. that.

We use snapshot testing for this kind of things (introduced in #4544). However, it seems like test/python/visualization/test_plot_histogram.py is not using that method and fixing that is beyond this PR. Do you think it make sense to have this on hold until that is fixed?

drewrisinger · 2022-01-14T15:19:43Z

test/python/visualization/test_plot_histogram.py

+            with BytesIO() as img_buffer:
+                figure_truncated.savefig(img_buffer, format="png")
+                img_buffer.seek(0)
+                self.assertImagesAreEqual(Image.open(img_buffer_ref), Image.open(img_buffer), 0.2)


I'm not sure about this 0.2 threshold. I don't have a good reference for how big this should be.

drewrisinger · 2022-01-14T15:22:00Z

qiskit/visualization/counts_visualization.py

    all_inds = []
+    # if multiple executions, we consider number_to_keep for each execution
+    # and this may result in more than number_to_keep slots
+    multiple_exec_keys_dict = OrderedDict()


You're using an OrderedDict here. I'm not quite sure if it's relevant, but have you checked that this works if the keys are in different orders in different dictionaries?

@drewrisinger you're right. No need for OrderedDict here.

OrderedDict is essentially the default in python 3.7+ anyways.

iuliazidaru · 2022-01-21T13:44:56Z

@jakelishman Could you review this PR as well? Thank you!

1ucian0

It looks good in general. Just two potentially blocking comments:

The snapshot testing
Maybe adding a release note about the fix of #6692 ?

thanks!

1ucian0 · 2022-02-22T10:40:16Z

qiskit/visualization/counts_visualization.py

    all_inds = []
+    # if multiple executions, we consider number_to_keep for each execution
+    # and this may result in more than number_to_keep slots
+    multiple_exec_keys_dict = defaultdict()


I do not fully get why this one needs to be a defaultdict. Isn't the regular dict enough in this context?

1ucian0 · 2022-02-22T10:55:20Z

qiskit/visualization/counts_visualization.py

+                        labels_dict[key] = 1
+                        values.append(0)


this block is the same as True branch of if number_to_keep is None: I think it is equivalent to write if number_to_keep is None or key in multiple_exec_keys_dict: in line 294.

1ucian0 · 2022-02-22T11:02:04Z

test/python/visualization/test_plot_histogram.py

+if HAS_MATPLOTLIB:
+    import matplotlib as mpl


consider using lazy testers for this (see https://github.com/Qiskit/qiskit-terra/blob/main/CONTRIBUTING.md#checking-for-optionals), introduced in https://github.com/Qiskit/qiskit-terra/pull/7525/files

1ucian0 · 2022-02-22T11:04:24Z

test/python/visualization/test_plot_histogram.py

+        with BytesIO() as img_buffer_ref:
+            figure_ref.savefig(img_buffer_ref, format="png")
+            img_buffer_ref.seek(0)


Yeah.. that.

We use snapshot testing for this kind of things (introduced in #4544). However, it seems like test/python/visualization/test_plot_histogram.py is not using that method and fixing that is beyond this PR. Do you think it make sense to have this on hold until that is fixed?

jakelishman

I'm sorry I had disappeared for so long - I've been busy writing my thesis at the same time, and my notifications got all overwhelmed.

I pushed up a couple of commits to fix up some minor things (documentation, release note, etc), but I have a larger problem. I tried just to do some simple two-dataset plots, and it doesn't show the right thing:

from qiskit.visualization import plot_histogram
data = [
    {"00": 3, "01": 5, "10": 6, "11": 12},
    {"00": 5, "01": 7, "10": 6, "11": 12},
]
plot_histogram(data, number_to_keep=2)

produces this plot:

but I'd expect it to show the bars 10 and 11 in the first data set, and 01 and 11 in the second data set.

The handling in _plot_histogram_data is rather confusing to me - it loops through all the data multiple times, and applies most_common twice as well. The multiple_exec_keys_dict is populated only with the value "1", but these are never read (perhaps it should just be a set?), but even then, I don't really understand its purpose. I would really expect that the only processing that needs to be done should happen at the very top of plot_histogram, such as:

def _keep_largest_items(execution, number_to_keep):
    """Keep only the largest values in a dictionary, and sum the rest into a new key 'rest'."""
    sorted_counts = sorted(execution.items(), key=lambda p: p[1])
    rest = sum(count for key, count in sorted_counts[:-number_to_keep])
    return dict(sorted_counts[-number_to_keep:], rest=rest)

def _unify_labels(data):
    """Make all dictionaries in data have the same set of keys, using 0 for missing values."""
    data = tuple(data)
    all_labels = set().union(*(execution.keys() for execution in data))
    base = {label: 0 for label in all_labels}
    out = []
    for execution in data:
        new_execution = base.copy()
        new_execution.update(execution)
        out.append(new_execution)
    return out

def plot_histogram(...):
    # ... normal setup ...
    if isinstance(data, dict):
        data = [data]
    if number_to_keep is not None:
        data = _unify_labels(_keep_largest_items(execution, number_to_keep) for execution in data)

    # continue with normal processing

That's a much more logical approach - number_to_keep modifies the data we should consider when performing all other metrics. In this model, all the additional handling of labels in _plot_histogram_data would be totally unnecessary - it wouldn't need to take number_to_keep as an argument any more, nor return a modified set of labels.

I didn't test that code so it probably needs tweaking, but I think it's logically much simpler, and should avoid the bugs in the present version.

It would be good to have testing against some reference images for this function - then a reviewer would be able to see that the outcomes look correct just by looking at the test cases, without running their own examples (which is what we look for in a test suite).

iuliazidaru · 2022-03-04T15:07:58Z

@jakelishman Thank you for the review!
I changed an if yesterday, let me check if that affected the code or there is other issue.

iuliazidaru · 2022-03-24T12:17:18Z

@1ucian0 Please check my implementation of snapshot testing/
@jakelishman Could you review again.
Thank you!

1ucian0

LGTM! Thanks @iuliazidaru

Any further comment @jakelishman ?

jakelishman

Sorry again...

I'm not 100% convinced that the output for multiple data sets with number_to_keep is non-zero is the least surprising/confusing thing we could do (it seems to suggest that certain keys in data sets are 0 when in reality they're just accounted for in the "rest" bar), but I don't really have a suggestion for how to do it better, and I've let this PR drag on far too long. If somebody has a better way, they can open a new issue.

Thanks for sticking with us, despite my terminal slowness @iuliazidaru.

plot_histogram fails when given number_to_keep parameter Qiskit#7461

a7d973a

iuliazidaru requested review from a team, nkanazawa1989 and nonhermitian as code owners January 5, 2022 15:01

iuliazidaru changed the title ~~plot_histogram fails when given number_to_keep parameter #7461~~ plot_histogram fails when given number_to_keep parameter Jan 5, 2022

iuliazidaru mentioned this pull request Jan 5, 2022

[Visualization]: plot_histogram() mis-pairs states & labels when comparing data with number_to_keep=int #6692

Closed

Fix style

05a9767

drewrisinger suggested changes Jan 5, 2022

View reviewed changes

iuliazidaru and others added 5 commits January 6, 2022 15:02

fix review comments

877322f

remove temporary the release notes

ea6321a

add release notes

44f5daa

Reformat release notes

94ae6d2

Merge branch 'main' into issue7461

8ab7d91

iuliazidaru added 6 commits January 10, 2022 12:56

Temp - remove code from release note

5a1abe7

Merge branch 'issue7461' of https://github.com/iuliazidaru/qiskit-terra…

88dd944

… into issue7461

Add code example in release note.

b997c9a

Add code example in release note.

4dd9589

Add code example in release note.

2705650

add test for multiple executions display

fcf7025

drewrisinger previously approved these changes Jan 14, 2022

View reviewed changes

change dictionary from OderedDict to defacultdict

30583fe

iuliazidaru dismissed drewrisinger’s stale review via 30583fe January 19, 2022 09:15

1ucian0 self-assigned this Feb 14, 2022

1ucian0 changed the title ~~plot_histogram fails when given number_to_keep parameter~~ Fixed error in plot_histogram when number_to_keep is smaller that the number of keys Feb 22, 2022

1ucian0 requested changes Feb 22, 2022

View reviewed changes

iuliazidaru and others added 7 commits February 23, 2022 17:30

Merge branch 'main' into issue7461

993d686

Merge branch 'main' into issue7461

f636ac5

small fixes

dbbc4f2

Correct use of optional tests

b8fb075

Reword release note

6337b6c

Improve documentation of number_to_keep

25545c8

Fix crash on distance measures

847c6a6

jakelishman requested changes Mar 4, 2022

View reviewed changes

Merge branch 'main' into issue7461

79b7e01

iuliazidaru and others added 3 commits March 4, 2022 17:28

Revert if.

e4ade96

Merge branch 'main' into issue7461

bfef156

refactor code & add test

235a9d6

1ucian0 reviewed Jun 7, 2022

View reviewed changes

1ucian0 approved these changes Jun 7, 2022

View reviewed changes

jakelishman approved these changes Jun 7, 2022

View reviewed changes

jakelishman added Changelog: Fixed Add a "Fixed" entry in the GitHub Release changelog. automerge labels Jun 7, 2022

Merge branch 'main' into issue7461

d54af7b

mergify bot merged commit 1658bb4 into Qiskit:main Jun 7, 2022

jakelishman mentioned this pull request Mar 9, 2023

visual testing needs an update (and maybe some regressions?) #9760

Closed

4 tasks

Conversation

iuliazidaru commented Jan 5, 2022 • edited by 1ucian0 Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Details and comments

Uh oh!

coveralls commented Jan 5, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Test Coverage Report for Build 2454345820

💛 - Coveralls

Uh oh!

drewrisinger left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

drewrisinger commented Jan 5, 2022

Uh oh!

iuliazidaru commented Jan 7, 2022

Uh oh!

drewrisinger commented Jan 7, 2022

Uh oh!

iuliazidaru commented Jan 14, 2022

Uh oh!

drewrisinger left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

iuliazidaru commented Jan 21, 2022

Uh oh!

1ucian0 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jakelishman left a comment

Choose a reason for hiding this comment

Uh oh!

iuliazidaru commented Mar 4, 2022

Uh oh!

iuliazidaru commented Mar 24, 2022

Uh oh!

1ucian0 left a comment

Choose a reason for hiding this comment

Uh oh!

jakelishman left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

iuliazidaru commented Jan 5, 2022 •

edited by 1ucian0

Loading

coveralls commented Jan 5, 2022 •

edited

Loading

drewrisinger left a comment •

edited

Loading