Update CompositeAnalysis initialization by chriseclectic · Pull Request #633 · qiskit-community/qiskit-experiments

chriseclectic · 2022-01-26T17:09:34Z

Summary

Update CompositeAnalysis to be initialized with a list of analysis class objects.

Details and comments

This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments.

For experiments like HEAT and Tphi, which use a subclass of composite analysis with fixed component experiments and analyses, the intent is that the component analyses can be hard-coded into the analysis class, such as:

class AnalysisClass(CompositeAnalysis):

    def __init__(self):
        super().__init__([SpecificAnalysis1(), SpecificAnalysis2()])

    def _run_analysis(self, experiment_data):
        super()._run_analysis(experiment_data)
        # Do extra analysis that depends on component analysis results
        # ...
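The same pattern can be sketched without the library. Everything below is a stand-in written for illustration only: `SketchAnalysis`, `SketchCompositeAnalysis`, `FixedPairAnalysis`, and the list-based data are assumptions, not the qiskit-experiments API.

```python
"""Library-free sketch of a composite analysis that owns its component analyses."""


class SketchAnalysis:
    """Hypothetical stand-in for a component analysis class."""

    def __init__(self, name):
        self.name = name

    def run(self, data):
        # A real analysis would fit the data; here we only average it.
        return {"analysis": self.name, "mean": sum(data) / len(data)}


class SketchCompositeAnalysis:
    """Hypothetical stand-in that holds one analysis object per component."""

    def __init__(self, analyses):
        self._analyses = list(analyses)

    def component_analysis(self, index):
        return self._analyses[index]

    def run(self, component_data):
        # Pair each component's data with the analysis stored at the same position.
        return [
            analysis.run(data)
            for analysis, data in zip(self._analyses, component_data)
        ]


class FixedPairAnalysis(SketchCompositeAnalysis):
    """Subclass with hard-coded components, following the pattern above."""

    def __init__(self):
        super().__init__([SketchAnalysis("first"), SketchAnalysis("second")])

    def run(self, component_data):
        results = super().run(component_data)
        # Extra analysis that depends on the component results.
        difference = results[0]["mean"] - results[1]["mean"]
        return results + [{"analysis": "difference", "value": difference}]


# No experiment object is needed to analyze previously collected data.
loaded_data = [[1.0, 3.0], [0.5, 1.5]]
results = FixedPairAnalysis().run(loaded_data)
```

The point of the sketch is that `FixedPairAnalysis()` takes no arguments, so it can be constructed and run on loaded data without rebuilding the experiment that produced it.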

nkanazawa1989

Thanks Chris, I think this is a reasonable direction: we no longer need to get the analysis from the experiment, so the analysis and the experiment data become self-contained for running post-processing.

I also like the new approach to setting options.

(before)
composite_exp.component_experiment(0).analysis.set_options(...)

(now)
composite_exp.component_analysis(0).set_options(...)

However, it seems we still need the experiment just to initialize the container, which I don't think needs to live in the analysis.

nkanazawa1989 · 2022-01-27T05:23:56Z

+        # IDs for each child experiment which can change when re-running analysis
+        # if replace_results=False, so that we update the correct child data
+        # for each component experiment
+        component_index = experiment_data.metadata.get("component_child_index", [])


Can this be self.component_child_index of the composite analysis, since it's stateful now? It is likely only used when running analysis and doesn't need to be kept statically in the metadata.

It is really a property of the experiment data, not the analysis. It is required if you re-run analysis to update existing results for containers that could contain other child data not related to the composite experiment (if someone manually added child data for some reason).
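The role of the stored index list can be sketched as follows. This is a simplified illustration in which a plain list stands in for the container's child data; only the metadata key `component_child_index` comes from the diff above, and the helper name `component_children` is hypothetical.

```python
def component_children(child_data, metadata):
    """Return only the children that belong to the composite experiment."""
    index = metadata.get("component_child_index", [])
    return [child_data[i] for i in index]


# A container where someone manually added an unrelated child at position 0.
child_data = ["unrelated child", "component 0 data", "component 1 data"]
metadata = {"component_child_index": [1, 2]}

selected = component_children(child_data, metadata)
```

Because the indices are stored with the data, re-running analysis updates the right children even when the container holds other entries.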

nkanazawa1989 · 2022-01-27T05:35:37Z

+            start_index = len(experiment_data.child_data())
+            component_index = []
+            for i, sub_exp in enumerate(experiment.component_experiment()):
+                sub_data = sub_exp._initialize_experiment_data()


Can this be called before the analysis is called? For example, we could insert a hook method that is called before job execution. Then we could completely decouple the experiment from the experiment data. I think initialization of the container should be part of the composite experiment rather than the analysis.

I agree with @nkanazawa1989

This is done by experiment.run. This block is only called if you try to run analysis on data that was not initialized via experiment.run, such as if you loaded job IDs, manually added composite experiment jobs, and then re-ran analysis.

Fair enough.

Also, I should point out that this block of code isn't something added in this PR. I just moved it and added an extra warning.
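The fallback path described in this thread can be sketched as follows. The function mirrors the shape of the diff hunk using plain lists and dicts; `ensure_component_children`, the placeholder child objects, and the warning text are assumptions for illustration, not the PR's code.

```python
import warnings


def ensure_component_children(child_data, metadata, num_components):
    """Create placeholder children only if none were registered before."""
    if "component_child_index" in metadata:
        # Normal path: the experiment already initialized the children.
        return metadata["component_child_index"]

    # Fallback path: data was assembled by hand, so initialize it now.
    warnings.warn(
        "Component data was not initialized by the experiment; creating it now.",
        UserWarning,
    )
    start_index = len(child_data)
    component_index = []
    for i in range(num_components):
        child_data.append({"component": i})  # placeholder for a child container
        component_index.append(start_index + i)
    metadata["component_child_index"] = component_index
    return component_index
```

Calling the function a second time on the same container takes the normal path, so the children are created once and no further warning is raised.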

nkanazawa1989 · 2022-01-27T05:42:07Z

+                experiment_data.add_child_data(sub_data)
+                component_index.append(start_index + i)
+            experiment_data.metadata["component_child_index"] = component_index
+


Probably

Suggested change

else:
    warnings.warn("Child experiment data have been already initialized.", UserWarning)

This is likely a singleton, so reaching this branch is evidence that the user might be doing something unexpected.

The child experiment data should already have been initialized

nkanazawa1989 · 2022-01-27T05:45:51Z

+        for sub_exp_data, sub_analysis in zip(component_exp_data, self._analyses):
+            # Since copy for replace result is handled at the parent level
+            # we always run with replace result on component analysis
+            sub_analysis.run(sub_exp_data, replace_results=True)


Can you write a unit test to check this? I feel the mapping between experiment and analysis is no longer as obvious as before, and we need to guarantee that the new mechanism works correctly. Something like this could be worth testing (or something more complicated, such as a parallel experiment nested in a batch experiment and vice versa).

exp1 = T1(...)
exp2 = StandardRB(...)
exp = BatchExperiment([exp1, exp2])
self.assertExperimentDone(exp)

This should succeed, since this line cannot be modified by users without a hack, but someone may update the logic in the future.
https://github.com/Qiskit/qiskit-experiments/pull/633/files#diff-dfb1550e8c0375a7cf9c4a2e0f5ce96f30b17125796008307d1e9793af7444a7R45

There are already a lot of composite experiment tests, but I think they all use the same component experiment. It would be good to add some tests that mix different experiments, like you suggest.
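A test of this kind could look roughly like the sketch below. It uses small stand-in classes instead of real experiments, so `MeanAnalysis`, `MaxAnalysis`, and `SketchCompositeAnalysis` are hypothetical; the idea is only that each component must be handled by its own analysis, including when composites are nested.

```python
import unittest


class MeanAnalysis:
    """Hypothetical component analysis that averages its data."""

    def run(self, data):
        return ("mean", sum(data) / len(data))


class MaxAnalysis:
    """Hypothetical component analysis that takes the maximum."""

    def run(self, data):
        return ("max", max(data))


class SketchCompositeAnalysis:
    """Hypothetical composite that pairs analyses with data by position."""

    def __init__(self, analyses):
        self._analyses = list(analyses)

    def run(self, component_data):
        return [a.run(d) for a, d in zip(self._analyses, component_data)]


class TestMixedComponents(unittest.TestCase):
    def test_each_component_uses_its_own_analysis(self):
        composite = SketchCompositeAnalysis([MeanAnalysis(), MaxAnalysis()])
        results = composite.run([[1, 2, 3], [4, 9, 5]])
        self.assertEqual(results, [("mean", 2.0), ("max", 9)])

    def test_nested_composite(self):
        inner = SketchCompositeAnalysis([MaxAnalysis(), MeanAnalysis()])
        outer = SketchCompositeAnalysis([MeanAnalysis(), inner])
        results = outer.run([[2, 4], [[7, 1], [10, 20]]])
        self.assertEqual(results, [("mean", 3.0), [("max", 7), ("mean", 15.0)]])
```

Using two different analysis types makes a mismatched pairing visible, which a test built from identical components would not catch.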

yaelbh

I'm not familiar with the HEAT experiment or other experiments that are the targets of this PR, except for Tphi, which I've just reviewed (#355). The code of Tphi, using the current interface, is nice and clean, and I don't see how it will benefit from this PR. The PR touches the code of composite analysis. This code has been unstable recently because of several bugs that were not discovered in time due to a lack of proper testing and documentation. Now, finally, the code looks stable, and we've been able to use it in monitoring. I'm concerned about the intention to modify it again, potentially reintroducing instability. I'd do it only:

After work on extensive testing and documentation is complete (if not done yet).
If there is a good reason to prefer the suggested solution over alternatives that don't modify the core classes (as in #355).

Some comments and questions:

1. The sub-experiments initialize an instance of the analysis class. So we have two instances of analyses: one in the sub-experiment and another initialized by the composite analysis. This duality is certainly an unhealthy situation. What will be the relation between the two? Are they going to be synchronized? What happens if I change the sub-experiment analysis, say from T1 to P1? There will be even more than two for grandchildren.
2. "This removes the dependence on having the full composite experiment stored in the experiment data" - can you please provide more details? What does it mean and what's the issue with this?
3. Is the code robust with nested trees? By the way (not related to this PR though), is the handling of replace_results=False robust with nested trees?

nkanazawa1989 · 2022-01-27T15:50:02Z

Question1:
This is not the case because CompositeExperiment doesn't take analysis classes. It only takes experiments, and populates the composite analysis with the ones from the child experiments. Even if you instantiate a composite analysis, there is no race issue because you cannot instantiate a composite experiment with it. Since it has a setter method, you can explicitly override the analysis; in this case, you know you have a custom analysis instead of the child defaults.

Question2:
I understand this is a design issue and doesn't block implementing any experiment. However, the complexity of the mechanism adds overhead for the maintainers, and also confuses developers who want to hack the codebase.

Question3:
This is likely robust, but I've already commented on this in #633 (comment)

(Added)
The user benefit from this API change should be easy access to analysis options.

chriseclectic · 2022-01-27T15:57:25Z

@Yael The main issue this PR aims to address is that CompositeAnalysis is currently fundamentally different from all other analysis classes because it doesn't separate experiments and analysis. It can't be run directly on loaded data because it doesn't contain any information about the actual analysis to be performed; that is all contained in the composite experiment. This PR aims to fix that so that you can initialize a composite analysis object independently and analyze composite data. The actual running of the analysis and everything else is the same as before.

yaelbh · 2022-01-27T16:21:42Z

Thanks @chriseclectic for the answer. I now understand the goal of the PR and agree that it's important. With this understanding, I'd like to review it a bit more, is it OK if I review only on Sunday? I've already started the weekend.

@nkanazawa1989 I don't understand your answer to Question 1. If we do (copied from above)

def __init__(self):
    super().__init__([SpecificAnalysis1(), SpecificAnalysis2()])

and we also have an instantiation of SpecificAnalysis1 in the constructor of SpecificExperiment1, then we end up with two instantiations.


Adds a separate helper function to the class to return the list of component experiment data containing the marginalized data. This can be used by subclasses if necessary to obtain marginalized data containers without running analysis.
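What marginalizing data for a component means can be sketched in a few lines. This is an illustrative helper, not the function added by the commit: `marginalize_counts` is a hypothetical name, and it indexes bit positions from the left of the string, which is a convention chosen for the sketch and differs from Qiskit's little-endian bit ordering.

```python
from collections import Counter


def marginalize_counts(counts, keep):
    """Sum counts over all bit positions except those listed in `keep`."""
    marginal = Counter()
    for bitstring, count in counts.items():
        key = "".join(bitstring[i] for i in keep)
        marginal[key] += count
    return dict(marginal)


# Joint counts from two components measured together.
counts = {"00": 40, "01": 10, "10": 30, "11": 20}
first = marginalize_counts(counts, keep=[0])
second = marginalize_counts(counts, keep=[1])
```

Each component analysis then sees only the counts for its own bits, as if the component had been run alone.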

nkanazawa1989 · 2022-01-27T17:00:20Z

For example, this is the interface of the batch experiment:
https://github.com/Qiskit/qiskit-experiments/blob/10803ad838cce93d39a6960492aa3e3d650b56cd/qiskit_experiments/framework/composite/batch_experiment.py#L43
As you can see, you cannot set your composite analysis instance here. It is created automatically from the child experiments' analyses. However, if you explicitly do

my_analysis = CompositeAnalysis([my_sub_analysis1, my_sub_analysis2])
exp1 = ExpX(analysis=exp1analysis)
exp2 = ExpY(analysis=exp2analysis)

batch = BatchExperiment([exp1, exp2])
batch.analysis = my_analysis

you don't expect to be creating a mixture of (exp1analysis, exp2analysis) and (my_sub_analysis1, my_sub_analysis2). From the user's perspective nothing has changed except for easier access to analysis options.

chriseclectic · 2022-01-27T17:00:48Z

@Yael sounds good. On subclassing: if you hard-code the analysis objects in the analysis subclass, you can do post-run analysis of data like HeatAnalysis().run(loaded_data) without having to reconstruct the original experiment to get the analysis classes (which is currently required if you subclass the current composite experiment / analysis).

But like you say, this means there may now be two copies of the analysis classes if your experiment subclass also includes them. Maybe we need to make the base CompositeExperiment not expose its components, so that only the user-facing subclasses that explicitly want to expose them (Batch and Parallel in this case) have the component_experiment method.

For some of these other subclass experiments, if they are supposed to look like a single experiment to the end user (with running as a composite being an implementation detail they don't need to know about), it would be nice if all the relevant experiment/analysis options and results/figures could be set and accessed through the main objects, not the components.

nkanazawa1989

It seems the new tests cover the points we must consider. Thanks.

yaelbh

At some point one of us (us developers, not a user) will write

exp.component_experiment(0).analysis.set_options(...)

instead of

exp.component_analysis(0).set_options(...)

(this can happen for a batch or a parallel experiment where also in the future we expect to keep component_experiment active).

I tried to find a better solution, but each solution that I could think of has its own cons.

nkanazawa1989 · 2022-01-31T05:12:57Z

        """
        self._experiments = experiments
        self._num_experiments = len(experiments)
+        analysis = CompositeAnalysis([exp.analysis for exp in self._experiments])


I agree with @yaelbh's concern. This will probably fix the problem.

Suggested change

-        analysis = CompositeAnalysis([exp.analysis for exp in self._experiments])
+        analyses = []
+        for exp in self._experiments:
+            analyses.append(exp.analysis)
+            exp.analysis = None

After that you will no longer be able to run analysis on a sub-experiment independently of its being part of a composite experiment.

chriseclectic · 2022-02-01T18:06:42Z

@yaelbh Currently exp.component_experiment(0).analysis.set_options(...) is equivalent to exp.component_analysis(0).set_options(...), since the current initialization ensures that they are references to the same analysis objects (this was one of the new test cases I added).

Where this connection will break is if you assign a new analysis object to the component experiment, like
component_exp.analysis = new_analysis, which will not update the analysis object in the CompositeAnalysis class. I couldn't think of any straightforward way to avoid this, but I would hope it's not something that should be done often. If you really want to change the analysis object entirely, you should probably initialize a new CompositeAnalysis object or change it via the CompositeAnalysis object directly.
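The reference sharing described here, and the way reassignment breaks it, can be demonstrated with plain Python objects. The classes and the `threshold` option below are hypothetical stand-ins, not the library's classes or options.

```python
class SketchAnalysis:
    """Hypothetical analysis object with mutable options."""

    def __init__(self):
        self.options = {}

    def set_options(self, **fields):
        self.options.update(fields)


class SketchExperiment:
    """Hypothetical experiment that owns a default analysis."""

    def __init__(self):
        self.analysis = SketchAnalysis()


class SketchCompositeAnalysis:
    """Hypothetical composite that stores the analysis objects it is given."""

    def __init__(self, analyses):
        self._analyses = list(analyses)

    def component_analysis(self, index):
        return self._analyses[index]


experiments = [SketchExperiment(), SketchExperiment()]
# Built from the experiments' own analysis objects, so references are shared.
composite = SketchCompositeAnalysis([exp.analysis for exp in experiments])

# Setting options through the experiment is visible through the composite.
experiments[0].analysis.set_options(threshold=0.5)
shared_before = composite.component_analysis(0) is experiments[0].analysis

# Assigning a new object to the experiment does not touch the composite's list.
experiments[0].analysis = SketchAnalysis()
shared_after = composite.component_analysis(0) is experiments[0].analysis
```

After the reassignment, the composite still holds the original object with `threshold` set, while the experiment holds a fresh object with empty options.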

This commit updates the logic of HeatAnalysis constructor according to qiskit-community#633 (now we can instantiate composite analysis with component analyses). Also BatchHeatHelper class is removed and replaced with decorators.

* Update CompositeAnalysis initialization
  Update CompositeAnalysis to be initialized with a list of analysis class objects. This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments.
* Restructure CompositeAnalysis._run_analysis
  Adds a separate helper function to the class to return the list of component experiment data containing the marginalized data. This can be used by subclasses if necessary to obtain marginalized data containers without running analysis.
* Add additional tests

chriseclectic added the Changelog: API Change Include in the "Changed" section of the changelog label Jan 26, 2022

chriseclectic requested a review from nkanazawa1989 January 26, 2022 17:09

nkanazawa1989 suggested changes Jan 27, 2022


yaelbh requested changes Jan 27, 2022


chriseclectic added 3 commits January 27, 2022 11:29

Restruct CompositeAnalysis._run_analysis

9878819

Adds a separate helper function to the class to return the list of component experiment data containing the marginalized data. This can be used by subclasses if necessary to obtain marginalized data containers without running analysis.

Add additional tests

14806e3

chriseclectic force-pushed the comp-analysis branch from 90c9af9 to 14806e3 Compare January 27, 2022 16:42

nkanazawa1989 approved these changes Jan 27, 2022


yaelbh approved these changes Jan 30, 2022


nkanazawa1989 reviewed Jan 31, 2022


Merge branch 'main' into comp-analysis

c69ea9a

chriseclectic merged commit 2cba61a into qiskit-community:main Feb 1, 2022


