Update CompositeAnalysis initialization#633
Conversation
nkanazawa1989
left a comment
There was a problem hiding this comment.
Thanks Chris I think this is reasonable direction, i.e. we don't need to get analysis from experiment, thus analysis and experiment data will become self-contained for running post-processing.
I also like new approach to set option method.
(before)
composite_exp.component_experiment(0).analysis.set_options(...)
(now)
composite_exp.component_analysis(0).set_option(...)However seems like we still need experiment just for initialization of container, which I think not necessary existing in analysis.
| # IDs for each child experiment which can change when re-running analysis | ||
| # if replace_results=False, so that we update the correct child data | ||
| # for each component experiment | ||
| component_index = experiment_data.metadata.get("component_child_index", []) |
There was a problem hiding this comment.
Can this be self.component_child_index of composite analysis since its stateful now? This is likely only used by running analysis and no need to be statically kept in the metadata.
There was a problem hiding this comment.
It is a really property of the experiment data, not the analysis, and is required if you re-run analysis to update existing results for containers that could contain other child data not related to the composite experiment (if someone manually added child data for some reason)
| start_index = len(experiment_data.child_data()) | ||
| component_index = [] | ||
| for i, sub_exp in enumerate(experiment.component_experiment()): | ||
| sub_data = sub_exp._initialize_experiment_data() |
There was a problem hiding this comment.
Can this be called before analysis is called? For example we can insert hook method which might be called before job execution. Then we can completely decouple experiment from the experiment data. I think initialization of container should be a part of composite experiment rather than analysis.
There was a problem hiding this comment.
This is done by experiment.run. This block here is only called if you try and run analysis on a data that was not initialized via an experiment.run, such as if you loaded job ids and manually added composite experiment jobs and then re-run analysis.
There was a problem hiding this comment.
Also I should point out this block code isn't something added in this PR, I just moved it, and added an extra warning.
| experiment_data.add_child_data(sub_data) | ||
| component_index.append(start_index + i) | ||
| experiment_data.metadata["component_child_index"] = component_index | ||
|
|
There was a problem hiding this comment.
Probably
| else: | |
| warnings.warn("Child experiment data have been already initialized.", UserWarning) |
likely this is singleton and this is evidence of user might be doing something unexpected.
There was a problem hiding this comment.
The child experiment data should already have been initialized
| for sub_exp_data, sub_analysis in zip(component_exp_data, self._analyses): | ||
| # Since copy for replace result is handled at the parent level | ||
| # we always run with replace result on component analysis | ||
| sub_analysis.run(sub_exp_data, replace_results=True) |
There was a problem hiding this comment.
Can you write unittest to check this? I feel the mapping of experiment and analysis is no longer obvious as before and we need to guarantee new mechanism works correctly. Something like this could be worth testing (or more complicated, such as nested parallel into batch and vise versa).
exp1 = T1(...)
exp2 = StandardRB(...)
exp = BatchExperiment([exp1, exp2])
self.assertExperimentDone(exp)This should success since this line cannot be modified by users without hack, but someone may update the logic in future.
https://github.com/Qiskit/qiskit-experiments/pull/633/files#diff-dfb1550e8c0375a7cf9c4a2e0f5ce96f30b17125796008307d1e9793af7444a7R45
There was a problem hiding this comment.
There are already a lot of composite experiment tests but i think they all use the same component experiment, it would be good to add some tests that mix different experiments like you suggest
yaelbh
left a comment
There was a problem hiding this comment.
I'm not familiar with the HEAT experiment or other experiments that are the targets of this PR, except for Tphi, which I've just reviewed (#355). The code of Tphi, using the current interface, is nice and clean, and I don't see how it will benefit from this PR. The PR touches code of composite analysis. This code has been unstable recently, bacuse of several bugs, that were not discovered in time due to lack of proper testing and documentation. Now, finally, the code looks stable, and we've been able to use it in monitoring. I'm concerned with the intention to modify it again, potentially returning unstability. I'd do it only:
- After work on extensive testing and documentation is complete (if not done yet).
- If there is a good reason to prefer the suggested solution over alternatives that don't modify the core classes (as in #355).
Some comments and questions:
- The sub-experiments initialize an instance of the analysis class. So we have two instances of analyses, one in the sub-experiment and the other one is initialized by the composite analysis. This duality is certainly an unhealthy situation. What will be the relation between the two? Are they going to be synchronized? What happens if I change the sub-experiment analysis, say from T1 to P1? There will be even more than two for grandchildren.
- "This removes the dependence on having the full composite experiment stored in the experiment data" - can you please provide more details? What does it mean and what's the issue with this?
- Is the code robust with nested trees? By the way (not related to this PR though) is the handling of replace_results=False robust with nested trees?
|
Question1: Question2: Question3: (Added) |
|
@Yael The main issue this PR is aiming to address is that CompositeAnalysis is currently fundamentally different from all other analysis because it doesn't separate experiments and analysis. It can't be run directly on loaded data because it doesn't contain any information about the actually analysis to be performed, that is all contained in the composite experiment. This PR aims to fix that so that you can initialize a composite analysis object independently and analyze composite data. The actual running on analysis and everything else is the same as before. |
|
Thanks @chriseclectic for the answer. I now understand the goal of the PR and agree that it's important. With this understanding, I'd like to review it a bit more, is it OK if I review only on Sunday? I've already started the weekend. @nkanazawa1989 I don't understand your answer to Question 1. If we do (copied from above) and we also have an instantiation of |
Update CompositeAnalysis to be initialized with a list of analysis class objects. This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments.
Adds a separate helper function to the class to return the list of component experiment data containing the marginalized data. This can be used by subclasses if necessary to obtain marginalized data containers without running analysis.
90c9af9 to
14806e3
Compare
|
For example this is the interface of batch experiment my_analysis = CompositeAnalysis([my_sub_analysis1, my_sub_analysis2])
exp1 = ExpX(analysis=exp1analysis)
exp2 = ExpY(analysis=exp2analysis)
batch = BatchExperiment([exp1, exp2])
batch.analysis = my_analysisyou don't expect you are creating the mixture of (exp1analysis, exp2analysis) and (my_sub_analysis1, my_sub_analysis2). From user's perspective there is nothing changed except for easy access to analysis options. |
|
@Yael sounds good. On subclassing if you hardcode the analysis objects in the analysis subclass you are able to do post-run analysis of data like But like you say this means there now may be two copies of analysis classes if your experiment subclass also included it. Maybe we need to make the base For some of these other subclass experiments if to the end user they are supposed to just look like a single experiment (and running as composite is an implementation detail they don't need to know about), it would be nice if all the relevant experiment/analysis option and results/figures can be set and accessed through the main objects, not the components. |
nkanazawa1989
left a comment
There was a problem hiding this comment.
Seems like new tests cover the point we must consider. Thanks.
yaelbh
left a comment
There was a problem hiding this comment.
At some point one of us (us developers, not a user) will write
exp.component_experiment(0).analysis.set_options(...)
instead of
exp.component_analysis(0).set_options(...)
(this can happen for a batch or a parallel experiment where also in the future we expect to keep component_experiment active).
I tried to find a better solution, but each solution that I could think of has its own cons.
| """ | ||
| self._experiments = experiments | ||
| self._num_experiments = len(experiments) | ||
| analysis = CompositeAnalysis([exp.analysis for exp in self._experiments]) |
There was a problem hiding this comment.
I agree with @yaelbh 's concern. Probably this will fix the problem.
| analysis = CompositeAnalysis([exp.analysis for exp in self._experiments]) | |
| analyses = [] | |
| for exp in self._experiments: | |
| analyses.append(exp.analysis) | |
| exp.analysis = None |
There was a problem hiding this comment.
After that you won't be able anymore to run analysis on a sub-experiment, independently from it being part of a composite experiment.
|
@yaelbh Currently Where his connection will break is if you assign a new analysis object to the component experiment like |
This commit updates the logic of HeatAnalysis constructor according to qiskit-community#633 (now we can instantiate composite analysis with component analyses). Also BatchHeatHelper class is removed and replaced with decorators.
* Update CompositeAnalysis initialization Update CompositeAnalysis to be initialized with a list of analysis class objects. This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments. * Restructure CompositeAnalysis._run_analysis Adds a separate helper function to the class to return the list of component experiment data containing the marginalized data. This can be used by subclasses if necessary to obtain marginalized data containers without running analysis. * Add additional tests
Summary
Update CompositeAnalysis to be initialized with a list of analysis class objects.
Details and comments
This removes the dependence on having the full composite experiment stored in the experiment data to perform composite analysis and makes CompositeAnalysis more natural to subclass for specific cases of component experiments.
For experiments like HEAT and Tphi that use a subclass of composite analysis with fixed component experiments and analysis this is intended so that they can be hard-coded into the analysis class such as: