-
Notifications
You must be signed in to change notification settings - Fork 134
Update computation of probability #424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 16 commits
8414ada
26b5b82
7a62da0
482378c
fa3537a
0f7a504
4c66e3a
1ddc994
e225ff1
3211c4d
3f64364
1e8750f
cd7238a
d05a3b0
f9a664a
10b00a0
ac5b44a
a6570a6
bcb62c5
b74c942
b85c159
eaa97de
a9a43c1
5484f99
b95d833
e534cea
c2dfbbe
3f61e3f
7e0efec
4290bc4
aca3d6b
f28c924
4ee4383
62f7d82
194568b
08efb3d
11f6c58
947f952
b7c6c06
5351a81
1ceee7e
7c8b142
8c12a8d
93276b7
9f3e946
da4bf5d
9fe8770
8291aa8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -481,6 +481,87 @@ def _population_error(self, counts_dict) -> Tuple[float, float]: | |
| return p_mean, np.sqrt(p_var) | ||
|
|
||
|
|
||
| class DirichletProbability(Probability): | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| r"""Compute probability and variance from count dictionary. | ||
|
|
||
| This node is a subtype of :py:class:`~qiskit_experiments.data_processing.nodes.Probability`, | ||
| in which variance is computed based on a binomial distribution at the risk of zero variance | ||
| at probability at either zero or one. | ||
|
|
||
| This node avoid such singularity by assuming Dirichlet distribution with Bayes update | ||
| taking a prior distribution. Namely, a mean value is replaced by a mode that represents | ||
| the most likely value to be sampled | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
|
|
||
| .. math:: | ||
|
|
||
| p = \frac{\alpha_i - 1}{\alpha_0 + K} | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
|
|
||
| where :math:`\alpha_i = f_i + \theta_i`, :math:`\alpha_0 = \sum_{i=1}^K \alpha_i`, | ||
| :math:`\theta_i` is the prior distribution and :math:`f_i` is | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| the count of :math:`i`-th element of :math:`K` dimensional vector. | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| The variance is computed by | ||
|
|
||
| .. math:: | ||
|
|
||
| v = \frac{E[x] (1 - E[x])}{\alpha_0 + 1} | ||
|
|
||
| where :math:`E[x] = \alpha_i / \alpha_0` is the mean value of the outcome of interest. | ||
| With a finite prior, this node always returns finite variance. | ||
| This saves us from the risk of unexpected zero division throughout the stack. | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| """ | ||
|
|
||
| def __init__( | ||
| self, | ||
| outcome: str = "1", | ||
| prior: Union[Dict[str, float], float] = 0.5, | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| validate: bool = True, | ||
| ): | ||
| """Initialize a counts to probability data conversion. | ||
|
|
||
| Args: | ||
| outcome: The bitstring for which to compute the probability which defaults to "1". | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After getting to here I think the whole distribution stuff in the into is a bit confusion now. The Dirichlet distribution if for computing a multi-variate probability distribution, so for example converting observed counts into an estimate of a probability distribution over all possible outcomes (2^n). If it is for a binary estimate this should be a beta distribution (which is the special case of 2-outcome dirichlet distribution). The way the description of this class is worded it should be something that converts a full counts dict into a full probabilities dict + std error dict since the bayesian update is for a full distribution on all outcomes, not a distribution for a single outcome. The difference between estimating individual probabilities as 2-outcome distributions vs all probabilities as 2^n outcome distribution will result in slight differences in variances and what they mean.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree the documentation for the option is confusing, i.e. it uses multivariable distribution but distribution defaults to 2-outcome. I updated the documentation and removed the default value in 194568b |
||
| prior: Prior distribution. This can be float value or dictionary keyed on | ||
| outcome bitstring. If n-bit label is provided in ``outcome``, the | ||
| dimension of the prior distribution, i.e. dictionary length, should be :math:`2^n`. | ||
| If a float value is applied, this applies a flat prior with the provided value. | ||
| By default, this assumes flat prior of 0.5 corresponding to the Jeffery's prior. | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| validate: If set to False the DataAction will not validate its input. | ||
|
|
||
| Raises: | ||
| DataProcessorError: When dimension of prior and expected parameter vector | ||
| doesn't match. | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| """ | ||
| self._dim = 2 ** len(outcome) | ||
| self._prior = prior | ||
| super().__init__(outcome=outcome, validate=validate) | ||
|
|
||
| if isinstance(prior, dict) and self._dim != len(prior): | ||
| raise DataProcessorError( | ||
| "Dimension of probability density function and prior distribution doesn't match." | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| ) | ||
|
|
||
| def _population_error(self, counts_dict) -> Tuple[float, float]: | ||
|
nkanazawa1989 marked this conversation as resolved.
Outdated
|
||
| """Helper method""" | ||
| shots = sum(counts_dict.values()) | ||
| freq = counts_dict.get(self._outcome, 0.0) | ||
|
|
||
| if isinstance(self._prior, dict): | ||
| alpha_i = freq + self._prior[self._outcome] | ||
| alpha_0 = sum([v + self._prior[k] for k, v in counts_dict.items()]) | ||
| else: | ||
| alpha_i = freq + self._prior | ||
| alpha_0 = shots + self._prior * self._dim | ||
|
|
||
| p_mean = alpha_i / alpha_0 | ||
| p_var = p_mean * (1 - p_mean) / (alpha_0 + 1) | ||
| mode = (alpha_i - 1) / (alpha_0 + self._dim) | ||
|
|
||
| # If outcome count is zero, mode becomes tiny negative value with prior < 1 | ||
| mode = max(0.0, mode) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm a bit confused by the variable name. I though mode is the value that appears most often in the data. Since we have counts I would naively expect this to be 0 or 1. Should this be named
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just referred to the definition here. According to the definition, alpha_i > 0 and |
||
|
|
||
| return mode, np.sqrt(p_var) | ||
|
|
||
|
|
||
| class BasisExpectationValue(DataAction): | ||
| """Compute expectation value of measured basis from probability. | ||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -199,7 +199,8 @@ def _default_options(cls): | |
| """Return the default analysis options.""" | ||
| default_options = super()._default_options() | ||
| default_options.data_processor = dp.DataProcessor( | ||
| input_key="counts", data_actions=[dp.Probability("1"), dp.BasisExpectationValue()] | ||
| input_key="counts", | ||
| data_actions=[dp.DirichletProbability("1"), dp.BasisExpectationValue()], | ||
| ) | ||
| default_options.curve_plotter = "mpl_multiv_canvas" | ||
| default_options.xlabel = "Flat top width" | ||
|
|
@@ -266,21 +267,23 @@ def _generate_fit_guesses( | |
|
|
||
| guesses = defaultdict(list) | ||
| for control in (0, 1): | ||
| # start from Z oscillation | ||
| x_data = self._data(series_name=f"x|c={control}") | ||
| y_data = self._data(series_name=f"y|c={control}") | ||
| z_data = self._data(series_name=f"z|c={control}") | ||
|
|
||
| omega_xyz = [] | ||
| for data in (x_data, y_data, z_data): | ||
| ymin, ymax = np.percentile(data.y, [10, 90]) | ||
| if ymax - ymin < 0.2: | ||
| # oscillation amplitude might be almost zero, | ||
| # then exclude from average because of lower SNR | ||
| continue | ||
|
Comment on lines
+276
to
+280
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What prompted this change?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Previously, I let the FFT module estimate a frequency regardless of the signal amplitude. In some test cases, especially when the signal is really weak, the guess module picks random frequency (because computed probability has been changed), and it hurts frequency guess of averaged oscillation frequency measured in x, y, z basis. So I decided to ignore the guess when SNR is likely low, i.e. it may pick some artifact. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I like it! I wonder if using the actual SNR (min-max)/(sqrt(variance)) might be nice in case someone tries to fit a very small/off resonant rotation with this at some point, so that they can average a lot and still do it? Might be a separate PR
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That is nice idea. What is the variance here? Is it the variance of y values |
||
| fft_freq = curve.guess.frequency(data.x, data.y) | ||
| # oscillation amplitude might be almost zero, then exclude from average | ||
| if fft_freq > 0: | ||
| omega_xyz.append(fft_freq) | ||
| omega_xyz.append(fft_freq) | ||
| if omega_xyz: | ||
| omega = 2 * np.pi * np.average(omega_xyz) | ||
| else: | ||
| omega = 0.0 | ||
| omega = 1e-3 | ||
|
|
||
| zmin, zmax = np.percentile(z_data.y, [10, 90]) | ||
| theta = np.arccos(np.sqrt((zmax - zmin) / 2)) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,8 +22,8 @@ | |
| from qiskit.quantum_info import Clifford | ||
| from qiskit.circuit import Gate | ||
|
|
||
| import qiskit_experiments.data_processing as dp | ||
| from qiskit_experiments.framework import BaseExperiment, ParallelExperiment, Options | ||
| from qiskit_experiments.curve_analysis.data_processing import probability | ||
| from .rb_analysis import RBAnalysis | ||
| from .clifford_utils import CliffordUtils | ||
| from .rb_utils import RBUtils | ||
|
|
@@ -88,7 +88,12 @@ def __init__( | |
|
|
||
| # Set configurable options | ||
| self.set_experiment_options(lengths=list(lengths), num_samples=num_samples) | ||
| self.set_analysis_options(data_processor=probability(outcome="0" * self.num_qubits)) | ||
| self.set_analysis_options( | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we use the get function of the data processor library?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I checked but we cannot use the getter. This is a custom processor for RB, because outcome depends on the number of qubits, and it counts for label "0" instead of "1". Of course we can modify the getter, but this should be done in the separate PR. |
||
| data_processor=dp.DataProcessor( | ||
| input_key="counts", | ||
| data_actions=[dp.DirichletProbability(outcome="0" * self.num_qubits)], | ||
| ) | ||
| ) | ||
|
|
||
| # Set fixed options | ||
| self._full_sampling = full_sampling | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.