
Update CompositeAnalysis initialization#633

Merged
chriseclectic merged 4 commits into
qiskit-community:mainfrom
chriseclectic:comp-analysis
Feb 1, 2022

Conversation

@chriseclectic
Collaborator

Summary

Update CompositeAnalysis to be initialized with a list of analysis class objects.

Details and comments

This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments.

Experiments such as HEAT and Tphi use a subclass of composite analysis with fixed component experiments and analyses. For these, the intent is that the component analyses can be hard-coded into the analysis class, for example:

class AnalysisClass(CompositeAnalysis):

    def __init__(self):
        super().__init__([SpecificAnalysis1(), SpecificAnalysis2()])

    def _run_analysis(self, experiment_data):
        super()._run_analysis(experiment_data)
        # Do extra analysis that depends on component analysis results
        # ...
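
To make the pattern concrete, here is a minimal self-contained sketch. It does not use the qiskit-experiments classes: `BaseAnalysis`, `MeanAnalysis`, `MaxAnalysis` and `RatioAnalysis` are hypothetical stand-ins, the local `CompositeAnalysis` is a toy version, and plain lists play the role of experiment data. It only illustrates that a subclass with hard-coded components can be run on loaded data without any experiment object.

```python
class BaseAnalysis:
    """Hypothetical stand-in for an analysis base class (not the library's)."""

    def run(self, data):
        return self._run_analysis(data)

    def _run_analysis(self, data):
        raise NotImplementedError


class CompositeAnalysis(BaseAnalysis):
    """Toy composite analysis that owns its component analyses."""

    def __init__(self, analyses):
        self._analyses = list(analyses)

    def _run_analysis(self, data):
        # `data` is a list of per-component datasets, one per analysis.
        return [a.run(d) for a, d in zip(self._analyses, data)]


class MeanAnalysis(BaseAnalysis):
    def _run_analysis(self, data):
        return sum(data) / len(data)


class MaxAnalysis(BaseAnalysis):
    def _run_analysis(self, data):
        return max(data)


class RatioAnalysis(CompositeAnalysis):
    """Components are fixed, so no experiment is needed to re-run analysis."""

    def __init__(self):
        super().__init__([MeanAnalysis(), MaxAnalysis()])

    def _run_analysis(self, data):
        mean, peak = super()._run_analysis(data)
        # Extra analysis that depends on the component results.
        return {"mean": mean, "max": peak, "ratio": mean / peak}


# "Loaded" data: one dataset per component, in component order.
loaded_data = [[1.0, 2.0, 3.0], [2.0, 4.0]]
result = RatioAnalysis().run(loaded_data)
```

The point of the design is that `RatioAnalysis()` takes no arguments, so re-running it on stored data does not require reconstructing the original experiment.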

@chriseclectic chriseclectic added the Changelog: API Change Include in the "Changed" section of the changelog label Jan 26, 2022
Collaborator

@nkanazawa1989 nkanazawa1989 left a comment

Thanks Chris, I think this is a reasonable direction: we no longer need to get the analysis from the experiment, so the analysis and the experiment data become self-contained for running post-processing.

I also like the new approach to setting options.

(before)
composite_exp.component_experiment(0).analysis.set_options(...)

(now)
composite_exp.component_analysis(0).set_options(...)

However, it seems we still need the experiment just to initialize the container, which I don't think needs to live in the analysis.

# IDs for each child experiment which can change when re-running analysis
# if replace_results=False, so that we update the correct child data
# for each component experiment
component_index = experiment_data.metadata.get("component_child_index", [])
Collaborator

Can this be self.component_child_index on the composite analysis, since it's stateful now? It is likely only used when running analysis, so there is no need to keep it statically in the metadata.

Collaborator Author

It is really a property of the experiment data, not the analysis. It is required when you re-run analysis to update existing results in containers that could also hold other child data unrelated to the composite experiment (for example, if someone manually added child data for some reason).
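
A rough sketch of why the index belongs with the data, using a hypothetical `DataContainer` stand-in rather than the real experiment data class. The helper `component_data` and the warning text are assumptions made for illustration; only the `component_child_index` metadata key and the `add_child_data` / `child_data` method names come from the snippets in this thread.

```python
import warnings


class DataContainer:
    """Hypothetical stand-in for an experiment data container."""

    def __init__(self):
        self.metadata = {}
        self._children = []

    def add_child_data(self, child):
        self._children.append(child)

    def child_data(self, index=None):
        return self._children if index is None else self._children[index]


def component_data(container, num_components):
    """Return the child containers that belong to the composite experiment."""
    index = container.metadata.get("component_child_index", [])
    if not index:
        # Data was not initialized by an experiment run: create children now.
        warnings.warn("Initializing component child data during analysis.")
        start = len(container.child_data())
        for i in range(num_components):
            container.add_child_data(DataContainer())
            index.append(start + i)
        container.metadata["component_child_index"] = index
    return [container.child_data(i) for i in index]


parent = DataContainer()
unrelated = DataContainer()
parent.add_child_data(unrelated)  # manually added, not a component

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    first = component_data(parent, 2)

# Re-running finds the stored index and reuses the same children.
second = component_data(parent, 2)
```

Because the index is stored with the data, a second call skips the unrelated child at position 0 and returns the same two component containers instead of creating new ones.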

start_index = len(experiment_data.child_data())
component_index = []
for i, sub_exp in enumerate(experiment.component_experiment()):
    sub_data = sub_exp._initialize_experiment_data()
Collaborator

Can this be called before analysis is called? For example, we could insert a hook method that is called before job execution. Then we could completely decouple the experiment from the experiment data. I think initialization of the container should be part of the composite experiment rather than the analysis.

Collaborator

I agree with @nkanazawa1989

Collaborator Author

@chriseclectic chriseclectic Jan 27, 2022

This is done by experiment.run. This block is only called if you try to run analysis on data that was not initialized via experiment.run, for example if you loaded job IDs, manually added the composite experiment jobs, and then re-ran analysis.

Collaborator

Fair enough.

Collaborator Author

Also, I should point out that this block of code isn't something added in this PR; I just moved it and added an extra warning.

    experiment_data.add_child_data(sub_data)
    component_index.append(start_index + i)
experiment_data.metadata["component_child_index"] = component_index

Collaborator

Probably

Suggested change
else:
    warnings.warn("Child experiment data have been already initialized.", UserWarning)

This is likely a singleton, so reaching this branch suggests the user might be doing something unexpected.

Collaborator Author

The child experiment data should already have been initialized

for sub_exp_data, sub_analysis in zip(component_exp_data, self._analyses):
    # Since copy for replace result is handled at the parent level
    # we always run with replace result on component analysis
    sub_analysis.run(sub_exp_data, replace_results=True)
Collaborator

Can you write a unit test to check this? I feel the mapping between experiment and analysis is no longer as obvious as before, and we need to guarantee that the new mechanism works correctly. Something like this could be worth testing (or something more complicated, such as a parallel experiment nested in a batch and vice versa).

exp1 = T1(...)
exp2 = StandardRB(...)

exp = BatchExperiment([exp1, exp2])
self.assertExperimentDone(exp)

This should succeed, since this line cannot be modified by users without a hack, but someone may update the logic in the future.
https://github.com/Qiskit/qiskit-experiments/pull/633/files#diff-dfb1550e8c0375a7cf9c4a2e0f5ce96f30b17125796008307d1e9793af7444a7R45

Collaborator Author

There are already a lot of composite experiment tests, but I think they all use the same component experiment. It would be good to add some tests that mix different experiments, like you suggest.
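
As a rough idea of what such a test could check, here is a self-contained sketch with toy classes. `ScaleAnalysis` and `CountAnalysis` are hypothetical, and the local `CompositeAnalysis` is a stand-in for the real one. Each component analysis gives a distinguishable result, so a mix-up in the analysis-to-data mapping would fail the assertion, including in the nested case.

```python
import unittest


class ScaleAnalysis:
    """Hypothetical component analysis: multiplies every value by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def run(self, data):
        return [self.factor * x for x in data]


class CountAnalysis:
    """Hypothetical component analysis: counts the data points."""

    def run(self, data):
        return len(data)


class CompositeAnalysis:
    """Stand-in composite: the i-th analysis runs on the i-th child data."""

    def __init__(self, analyses):
        self._analyses = list(analyses)

    def run(self, component_data):
        return [a.run(d) for a, d in zip(self._analyses, component_data)]


class TestMixedComposite(unittest.TestCase):
    def test_mixed_components_keep_order(self):
        analysis = CompositeAnalysis([ScaleAnalysis(2), CountAnalysis()])
        self.assertEqual(analysis.run([[1, 2], [7, 8, 9]]), [[2, 4], 3])

    def test_nested_composite(self):
        inner = CompositeAnalysis([CountAnalysis(), ScaleAnalysis(10)])
        outer = CompositeAnalysis([ScaleAnalysis(2), inner])
        data = [[1, 2], [[7, 8, 9], [1]]]
        self.assertEqual(outer.run(data), [[2, 4], [3, [10]]])


# Run the two tests programmatically so the sketch works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMixedComposite)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

A real test would use different experiment classes and a backend instead of these toy analyses, but the structure of the check would be similar.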

Collaborator

@yaelbh yaelbh left a comment

I'm not familiar with the HEAT experiment or other experiments that are the targets of this PR, except for Tphi, which I've just reviewed (#355). The code of Tphi, using the current interface, is nice and clean, and I don't see how it will benefit from this PR. The PR touches the composite analysis code. This code has been unstable recently because of several bugs that were not discovered in time due to a lack of proper testing and documentation. Now, finally, the code looks stable, and we've been able to use it in monitoring. I'm concerned about the intention to modify it again, potentially reintroducing instability. I'd do it only:

  • After work on extensive testing and documentation is complete (if not done yet).
  • If there is a good reason to prefer the suggested solution over alternatives that don't modify the core classes (as in #355).

Some comments and questions:

  1. The sub-experiments initialize an instance of the analysis class. So we have two instances of analyses, one in the sub-experiment and the other one is initialized by the composite analysis. This duality is certainly an unhealthy situation. What will be the relation between the two? Are they going to be synchronized? What happens if I change the sub-experiment analysis, say from T1 to P1? There will be even more than two for grandchildren.
  2. "This removes the dependence on having the full composite experiment stored in the experiment data" - can you please provide more details? What does it mean and what's the issue with this?
  3. Is the code robust with nested trees? By the way (not related to this PR though) is the handling of replace_results=False robust with nested trees?

@nkanazawa1989
Collaborator

nkanazawa1989 commented Jan 27, 2022

Question1:
This is not the case because CompositeExperiment doesn't take analysis classes. It only takes experiments, and populates the composite analysis with the ones from the child experiments. Even if you instantiate a composite analysis, there is no race issue because you cannot instantiate a composite experiment with it. Since it has a setter method you can explicitly override the analysis; in that case, you know you have a custom analysis instead of the child defaults.

Question2:
I understand this is a design issue and doesn't block implementing any experiment. However, the complexity of the mechanism adds overhead for the maintainers, and also confuses developers who want to hack on the codebase.

Question3:
This is likely robust but I've already commented on this #633 (comment)

(Added)
The user benefit from this API change should be easy access to analysis options.

@chriseclectic
Collaborator Author

@Yael The main issue this PR aims to address is that CompositeAnalysis is currently fundamentally different from all other analysis classes because it doesn't separate experiments and analysis. It can't be run directly on loaded data because it doesn't contain any information about the actual analysis to be performed; that is all contained in the composite experiment. This PR aims to fix that so that you can initialize a composite analysis object independently and analyze composite data. The actual running of analysis and everything else is the same as before.

@yaelbh
Collaborator

yaelbh commented Jan 27, 2022

Thanks @chriseclectic for the answer. I now understand the goal of the PR and agree that it's important. With this understanding, I'd like to review it a bit more, is it OK if I review only on Sunday? I've already started the weekend.

@nkanazawa1989 I don't understand your answer to Question 1. If we do (copied from above)

def __init__(self):
    super().__init__([SpecificAnalysis1(), SpecificAnalysis2()])

and we also have an instantiation of SpecificAnalysis1 in the constructor of SpecificExperiment1, then we end up with two instantiations.

Update CompositeAnalysis to be initialized with a list of analysis class objects. This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments.
Adds a separate helper function to the class to return the list of component experiment data containing the marginalized data. This can be used by subclasses if necessary to obtain marginalized data containers without running analysis.
@nkanazawa1989
Collaborator

nkanazawa1989 commented Jan 27, 2022

For example this is the interface of batch experiment
https://github.com/Qiskit/qiskit-experiments/blob/10803ad838cce93d39a6960492aa3e3d650b56cd/qiskit_experiments/framework/composite/batch_experiment.py#L43
As you can see, you cannot set your composite analysis instance here. It is created automatically from the child experiments' analyses. However, if you explicitly do

my_analysis = CompositeAnalysis([my_sub_analysis1, my_sub_analysis2])
exp1 = ExpX(analysis=exp1analysis)
exp2 = ExpY(analysis=exp2analysis)

batch = BatchExperiment([exp1, exp2])
batch.analysis = my_analysis

you don't expect to end up with a mixture of (exp1analysis, exp2analysis) and (my_sub_analysis1, my_sub_analysis2). From the user's perspective nothing has changed except for easier access to analysis options.

@chriseclectic
Collaborator Author

@Yael sounds good. On subclassing: if you hard-code the analysis objects in the analysis subclass, you can do post-run analysis of data like HeatAnalysis().run(loaded_data) without having to reconstruct the original experiment to get the analysis classes (which is currently required if you subclass the current composite experiment / analysis).

But like you say, this means there may now be two copies of the analysis classes if your experiment subclass also includes them. Maybe we need to make the base CompositeExperiment not expose its components, so that only the user-facing subclasses that explicitly want to expose them (Batch and Parallel in this case) have the component_experiment method.

For some of these other subclass experiments, if they are supposed to look like a single experiment to the end user (and running as a composite is an implementation detail they don't need to know about), it would be nice if all the relevant experiment/analysis options and results/figures could be set and accessed through the main objects, not the components.

Collaborator

@nkanazawa1989 nkanazawa1989 left a comment

Seems like the new tests cover the points we need to consider. Thanks.

Collaborator

@yaelbh yaelbh left a comment

At some point one of us (us developers, not a user) will write

exp.component_experiment(0).analysis.set_options(...)

instead of

exp.component_analysis(0).set_options(...)

(this can happen for a batch or a parallel experiment where also in the future we expect to keep component_experiment active).

I tried to find a better solution, but each solution that I could think of has its own cons.

"""
self._experiments = experiments
self._num_experiments = len(experiments)
analysis = CompositeAnalysis([exp.analysis for exp in self._experiments])
Collaborator

I agree with @yaelbh's concern. This will probably fix the problem.

Suggested change
analysis = CompositeAnalysis([exp.analysis for exp in self._experiments])
analyses = []
for exp in self._experiments:
    analyses.append(exp.analysis)
    exp.analysis = None

Collaborator

After that you will no longer be able to run analysis on a sub-experiment independently of it being part of a composite experiment.

@chriseclectic
Collaborator Author

@yaelbh Currently exp.component_experiment(0).analysis.set_options(...) is equivalent to exp.component_analysis(0).set_options(...), since the current initialization ensures that they are references to the same analysis objects (this was one of the new test cases I added).

Where this connection will break is if you assign a new analysis object to the component experiment, like
component_exp.analysis = new_analysis, which will not update the analysis object in the CompositeAnalysis class. I couldn't think of any straightforward way to avoid this, but I would hope it's not something that should be done often, and if you really want to change the analysis object entirely you should probably initialize a new CompositeAnalysis object or change it via the CompositeAnalysis object directly.
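
A small sketch of both behaviours, using toy classes rather than the library's. `Analysis` and `Experiment` are hypothetical stand-ins, and the local `BatchExperiment` and `CompositeAnalysis` only mimic the constructor line shown in the diff. Options set through either path land on the same object until the component experiment's analysis is reassigned.

```python
class Analysis:
    """Hypothetical analysis with a simple options dictionary."""

    def __init__(self):
        self.options = {}

    def set_options(self, **fields):
        self.options.update(fields)


class Experiment:
    """Hypothetical experiment holding a reference to its analysis."""

    def __init__(self, analysis):
        self.analysis = analysis


class CompositeAnalysis:
    def __init__(self, analyses):
        self._analyses = list(analyses)

    def component_analysis(self, index):
        return self._analyses[index]


class BatchExperiment:
    def __init__(self, experiments):
        self._experiments = experiments
        # Same references as the component experiments hold, not copies.
        self.analysis = CompositeAnalysis([exp.analysis for exp in experiments])

    def component_experiment(self, index):
        return self._experiments[index]

    def component_analysis(self, index):
        return self.analysis.component_analysis(index)


exp = BatchExperiment([Experiment(Analysis()), Experiment(Analysis())])

# Both access paths reach the same object, so the option is visible on both.
exp.component_experiment(0).analysis.set_options(plot=False)
shared_before = exp.component_analysis(0).options == {"plot": False}

# Reassigning the component experiment's analysis breaks the link:
# the composite analysis still holds the old object.
exp.component_experiment(0).analysis = Analysis()
exp.component_experiment(0).analysis.set_options(plot=True)
still_shared = exp.component_analysis(0) is exp.component_experiment(0).analysis
```

After the reassignment, `exp.component_analysis(0)` still reports `plot=False`, which is the silent divergence discussed above.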

@chriseclectic chriseclectic merged commit 2cba61a into qiskit-community:main Feb 1, 2022
nkanazawa1989 added a commit to nkanazawa1989/qiskit-experiments that referenced this pull request Feb 2, 2022
This commit updates the logic of HeatAnalysis constructor according to qiskit-community#633 (now we can instantiate composite analysis with component analyses). Also BatchHeatHelper class is removed and replaced with decorators.
paco-ri pushed a commit to paco-ri/qiskit-experiments that referenced this pull request Jul 11, 2022
* Update CompositeAnalysis initialization

Update CompositeAnalysis to be initialized with a list of analysis class objects. This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments.

* Restructure CompositeAnalysis._run_analysis

Adds a separate helper function to the class to return the list of component experiment data containing the marginalized data. This can be used by subclasses if necessary to obtain marginalized data containers without running analysis.

* Add additional tests

Labels

Changelog: API Change Include in the "Changed" section of the changelog


3 participants