Experiment data and analysis result classes#113
Conversation
|
What happened to |
|
In There are tests in |
I moved it to |
Good catch! It's fixed in e3a2d31 |
There was a problem hiding this comment.
The future-based execution management is really cool. I think this will improve the usability of experiment.
Sorry about my review is mainly questions. I'm still not clear the relationship (or purpose) among analysis result, experiment (stored) data, and service. All of them seem to have quite similar interface. And stored data sounds bit strange to me because it has function to store the data in the database.
Depending on our standpoint (from remote db vs from local package) the method name would be different. For example, ExperimentServiceV1.create_experiment seems to create an entry in the database thus experiment should be completed. However if we call this from local software it sounds like we are going to start new experiment. Perhaps those can be renamed more specifically, such as create_experiment_entry.
|
|
||
| def __init__( | ||
| self, | ||
| result_data: Dict, |
There was a problem hiding this comment.
Do you plan to define a template for this in another PR?
There was a problem hiding this comment.
The database accepts a free-form dictionary. However, I have noticed there are common keywords used in qiskit-experiments (like success), so maybe it's better to define a template specific to qiskit-experiments?
There was a problem hiding this comment.
I guess so. A dict subclass is also good for type annotation.
|
|
||
| Args: | ||
| result_data: Analysis result data. | ||
| result_type: Analysis result type. |
There was a problem hiding this comment.
Is this really a result type rather than type of analysis?
There was a problem hiding this comment.
The name also came from results db, which only has the notion of "analysis result" and not "analysis". Currently all outputs of an Analysis class have the same type, so we can say it's the analysis type. Although last time we talked about "minimal analysis result" - not sure how that changes things.
There was a problem hiding this comment.
I see. So this is not the field we can set arbitrary value?
There was a problem hiding this comment.
No, result_data is the one that allows arbitrary value
| if result.job_id not in self._jobs: | ||
| self._jobs[result.job_id] = None | ||
| for i in range(len(result.results)): | ||
| data = result.data(i) |
There was a problem hiding this comment.
I don't think raw data is necessary if we store memory data. @eggerdj
There was a problem hiding this comment.
My feel here is that we should not be throwing away any information when adding to the StoredData from the Result. If any changes to the data format are made I would assume that the user could at least control them.
| else: | ||
| fig_name = ( | ||
| f"figure_{self.experiment_id[:8]}_" | ||
| f"{datetime.now().isoformat()}_{len(self._figures)}" |
There was a problem hiding this comment.
I don't think we can retrieve the figure from the service with this name.
There was a problem hiding this comment.
I added the timestamp to a) keep the name unique and b) I thought it'd be nice to know when the figure was created. If you're worry about not being able to remember the name, figure_names gives back all the names.
There was a problem hiding this comment.
Sounds like we need to retrieve all figures anyways because we cannot guess what is drawn from the timestamp. Can you add some mechanism to pass image tag as I wrote in another comment?
There was a problem hiding this comment.
Analysis can give the figure a more meaningful name instead of using the default timestamp, if the concern is about finding the right figure. As I mentioned in my other comment, the db doesn't support tags for figures right now. Another alternative is to implement tags support locally in this class. For example, since metadata allows arbitrary value, I can keep track of the tags assigned to each figure there. That would be a separate PR though since it's a bit of work.
Co-authored-by: Naoki Kanazawa <nkanazawa1989@gmail.com>
|
Thanks @nkanazawa1989 for reviewing this! After reading your comments, I decided to make some name changes and hope that'd clear things up a bit. I renamed the subpackage to Let me know if the new names are even more confusing 😅
I can make that change, but it requires changes on the |
|
Thanks Jessie! I replied to some comments but almost all issues I found are addressed. I do like new module names! :) Just one more point: Can you describe the relationship between service, analysis result, and experiment data in the module docstring? This would help third party providers to use this framework. Especially, relationship between experimental data and analysis is very IBM-ish (sometime we update a part of analysis result, e.g. quality, without retrieving full experiment -- this is expected use case but based on our experience). |
| "quality": "Good", | ||
| "verified": False, | ||
| } | ||
| result = DbAnalysisResult(result_data={"foo": "bar"}, tags=["tag1", "tag2"], **attrs) |
There was a problem hiding this comment.
Didn't you mean to use DbAnalaysisResultV1 here instead ?
There was a problem hiding this comment.
DbAnalysisResultV1 is imported as DbAnalysisResult. The idea is when we have V2 we only need to change the import statement and not all the reference of V1 to V2 (assuming interface remains compatible).
eliarbel
left a comment
There was a problem hiding this comment.
Overall looks good to me, especially for the stage we at now, needing it to be merged and serve as a basis for PR #115 and then testing (which I think should and would be extensive). I'm still missing to see how ExperimentService connects with IBMExperimentService in provider, but that's probably a hole in my understanding of the bigger picture.
|
|
||
| return result_key | ||
|
|
||
| def analysis_result( |
There was a problem hiding this comment.
should probably named analysis_results as it may return multiple analysis results
| figure_names: Optional[Union[List[str], str]] = None, | ||
| overwrite: bool = False, | ||
| save_figure: Optional[bool] = None, | ||
| service: Optional["DatabaseServiceV1"] = None, |
There was a problem hiding this comment.
@jyu00 can you elaborate on the use-case that justifies having a service parameter here and in the other add_* and save methods ? What would be the reason not to use the default experiment service ? If using a different experiment service is needed, wouldn't the service setter suffice ?
There was a problem hiding this comment.
This came from when we planned on having a local database. The idea was that one may want to save the experiment in the cloud DB, except when there's a large amount of data, or temporary data, which they only want to save locally. For example, I create an experiment with auto-save turned on (in case I forget to save). Then I run different kinds of analysis and save the results in the local db. In the end I compare all these results and save the "good" one in the cloud (e.g. to share with others).
Since we don't have a local database, and it's unlikely that anyone would interact with the ExperimentData interface directly, we can also remove it to keep the interface simple.
There was a problem hiding this comment.
Thanks for the info. Yes, I think we should clean this param until it'll be needed (it's in multiple methods)
Added in e916c5f. |
mriedem
left a comment
There was a problem hiding this comment.
Looks good, I left several nits and questions inline. There are plenty of things that could be done in a follow up for additional functionality, but maybe some of those things are dependent on the actual IBM Quantum service implementation since this just seems to be the base class framework?
The only thing that I'd probably -1 on is the global vs public typo but that could also be fixed in a small follow up since I'm sure you're eager to get this moving.
| :class:`DbExperimentDataV1` is the main class that defines the structure of | ||
| experiment data, which consists of the following:: | ||
|
|
||
| * Results from circuit execution, which is called ``data`` in this class. |
| example, to retrieve a saved analysis result. | ||
|
|
||
| Currently only IBM Quantum provides this database service, but we plan on having | ||
| a local database service in the near future. |
There was a problem hiding this comment.
| a local database service in the near future. | |
| a local database service in the future. |
nit but I'd avoid saying "near" in these docs since time and priorities can be funny when docs are originally written, "near" turns into like a year from now. 😄
There was a problem hiding this comment.
Ha, I specifically said "near future" to make us look better lol. But I guess we should avoid any commitment in an open source project. Updated in 2e01e94.
|
|
||
| An experiment database service allows you to store experiment data and metadata | ||
| in a database. An experiment can have one or more jobs, analysis results, | ||
| and figures. |
There was a problem hiding this comment.
There is now also a generic "files" (aka artifacts) interface for uploading files associated with an experiment, e.g. configuration files, input parameter files and other post-experiment metadata collection. Support for that in Qiskit can always be added later though.
There was a problem hiding this comment.
It's not implemented in qiskit-ibmq-provider. When I asked, the response suggested it was for internal use only?
There was a problem hiding this comment.
It's not implemented in qiskit-ibmq-provider.
It's relatively new and I wasn't sure if qiskit-ibmq-provider is frozen for the experiment service while this repo is being developed.
When I asked, the response suggested it was for internal use only?
Blake has use cases for it being used in Qiskit publicly as well but I don't know all the details.
There was a problem hiding this comment.
Sounds like I'll need to follow up with Blake :)
| ) -> str: | ||
| """Create a new experiment in the database. | ||
|
|
||
| Args: |
There was a problem hiding this comment.
There is also the visibility parameter which will default to "private" for Qiskit experiments but options include: private, project, group, hub, public.
Those control who, besides the owner of the experiment (whoever created it), can view/list the experiment.
It's called share level in the IQX UI and is a similar concept to jobs but not exactly the same implementation.
There was a problem hiding this comment.
I'm trying to avoid IBMQ-specific fields, especially those that cannot be implemented by a local DB. It is implemented on the ibmq-provider side.
There was a problem hiding this comment.
It is implemented on the
ibmq-providerside.
So does qiskit-experiments (this PR I guess) define the interface for working with experiments in Qiskit and then you have different services that implement that interface, and for IBM Quantum that would be the code in qiskit-ibmq-provider? It wasn't clear to me how that's going to work (and I didn't read the large design doc 😄 ).
I was under the impression that the experiment service code in qiskit-ibmq-provider was essentially going to be frozen and be transitioned to this repo.
There was a problem hiding this comment.
qiskit-experiments defines the code to run the actual experiments. It can use IBM Quantum to run circuits and save results, but it doesn't have to. Qiskit is designed to work with different providers (IBM, Honeywell, Aer, etc), and there is no dependency on IBM Quantum.
This PR essentially defines how it would interact with an "experiment database service", if one is available. IBM Quantum is just the only one providing this service right now.
| ) -> None: | ||
| """Update an existing experiment. | ||
|
|
||
| Args: |
There was a problem hiding this comment.
You can also update visibility if you own the experiment, e.g. to make it public or share at some level in the provider.
| from qiskit_experiments.database_service.exceptions import DbExperimentDataError | ||
|
|
||
|
|
||
| class TestDbAnalysisResult(QiskitTestCase): |
There was a problem hiding this comment.
If these are all just unit tests you can ignore, but if at some point there are integration tests that are using the real service then it'd be good to tag experiments with something that lets us know they are from Qiskit CI testing for cleanup on the server side in case test case cleanup fails for some reason.
There was a problem hiding this comment.
These are all just unit tests. qiskit-experiments does not have a dependency on qiskit-ibmq-provider. There will be new integration tests in qiskit-ibmq-provider.
| values = { | ||
| "result_data": {"foo": "bar"}, | ||
| "result_type": "some_type", | ||
| "device_components": ["Q1", "Q1"], |
There was a problem hiding this comment.
Did you mean to have Q1 in here twice?
| """Test stored data attributes.""" | ||
| attrs = { | ||
| "job_ids": ["job1"], | ||
| "share_level": "global", |
There was a problem hiding this comment.
| "share_level": "global", | |
| "share_level": "public", |
| } | ||
| exp_data = DbExperimentData( | ||
| backend=self.backend, | ||
| experiment_type="qiskit_test", |
There was a problem hiding this comment.
OK this is probably sufficient to let us know if these were ever part of an integration test that we could cleanup orphans on the real service.
|
|
||
| service = self._set_mock_service() | ||
| exp_data = DbExperimentData(backend=self.backend, experiment_type="qiskit_test") | ||
| exp_data.add_figures(figure, save_figure=True) |
There was a problem hiding this comment.
Does this actually write an svg file to disk? If so it'd be good to have a self.addCleanup call to remove the generated file if any (for local testing anyway).
There was a problem hiding this comment.
It doesn't. It uses a mock which does nothing other than letting me verify the proper methods are called.
There was a problem hiding this comment.
@mriedem Thanks for reviewing this. A lot of your comments in database_service.py are implemented in qiskit-ibmq-provider (see this PR). The database service interface defined here is supposed to be generic, and I try to avoid anything that's IBMQ only (not completely successful, but I try), especially things that cannot be implemented in a local database.
|
|
||
| An experiment database service allows you to store experiment data and metadata | ||
| in a database. An experiment can have one or more jobs, analysis results, | ||
| and figures. |
There was a problem hiding this comment.
It's not implemented in qiskit-ibmq-provider. When I asked, the response suggested it was for internal use only?
| ) -> str: | ||
| """Create a new experiment in the database. | ||
|
|
||
| Args: |
There was a problem hiding this comment.
I'm trying to avoid IBMQ-specific fields, especially those that cannot be implemented by a local DB. It is implemented on the ibmq-provider side.
| A dictionary containing the retrieved experiment data. | ||
|
|
||
| Raises: | ||
| ExperimentEntryNotFound: If the experiment does not exist. |
There was a problem hiding this comment.
I don't think it's wrong to raise ExperimentEntryNotFound for an experiment you don't have access to. Because as far as you're concerned, the experiment doesn't exist.
| device_components: Target device components, such as qubits. | ||
| tags: Tags to be associated with the analysis result. | ||
| quality: Quality of this analysis. | ||
| verified: Whether the result quality has been verified. |
There was a problem hiding this comment.
That how results DB users use it today. But there is no requirement that says a human has to verify something to set this flag - again trying to keep things generic.
| class DbAnalysisResult: | ||
| """Base common type for all versioned AnalysisResult abstract classes. | ||
|
|
||
| Note this class should not be inherited from directly, it is intended |
There was a problem hiding this comment.
Because the comment is telling people not to do MyCustomClass(DbAnalysisResult), and nothing in abc would prevent that.
| """Add the experiment figure. | ||
|
|
||
| Args: | ||
| figures (str or bytes or pyplot.Figure or list): Names of the figure |
| """ | ||
| if isinstance(result_key, int): | ||
| result_key = self._analysis_results.keys()[result_key] | ||
| elif result_key not in self._analysis_results: |
| """Return the class name and version.""" | ||
| return self._source | ||
|
|
||
| def __str__(self): |
There was a problem hiding this comment.
It was intentional since I didn't want the output to be a page long lol. Having said that, backend and tags seem important enough to be included, so I added them in 2e01e94.
| values = { | ||
| "result_data": {"foo": "bar"}, | ||
| "result_type": "some_type", | ||
| "device_components": ["Q1", "Q1"], |
| :class:`DbExperimentDataV1` is the main class that defines the structure of | ||
| experiment data, which consists of the following:: | ||
|
|
||
| * Results from circuit execution, which is called ``data`` in this class. |
|
@nkanazawa1989 @eliarbel @yaelbh I think this is ready to be merged. A few things not yet addressed in this PR:
1 & 2 will take a bit of effort, and 3 I need to think a bit more about how it integrates with the analysis classes. |
nkanazawa1989
left a comment
There was a problem hiding this comment.
Thanks Jessie! This LGTM :) Looking forward to using experiment service!
|
I'll update #115 |
* move from terra * run black * convert service exception to log * fix lint * log post processing failure * remove data index * fix test can package typo * remove experiment_class and result_class * fix lint * doc update Co-authored-by: Naoki Kanazawa <nkanazawa1989@gmail.com> * rename classes * review comments * fix lint * review comments * fix lint * add db service to doc * fix doc * remove extra service keyword * fix type hint * review comments Co-authored-by: Yael Ben-Haim <yaelbh@il.ibm.com> Co-authored-by: Naoki Kanazawa <nkanazawa1989@gmail.com>
* move from terra * run black * convert service exception to log * fix lint * log post processing failure * remove data index * fix test can package typo * remove experiment_class and result_class * fix lint * doc update Co-authored-by: Naoki Kanazawa <nkanazawa1989@gmail.com> * rename classes * review comments * fix lint * review comments * fix lint * add db service to doc * fix doc * remove extra service keyword * fix type hint * review comments Co-authored-by: Yael Ben-Haim <yaelbh@il.ibm.com> Co-authored-by: Naoki Kanazawa <nkanazawa1989@gmail.com>

Summary
This is a replacement of Qiskit/qiskit#5499. Most of the code remains the same, but I renamed
ExperimentDatatoStoredDatasince qiskit-experiments has its ownExperimentDataclass. The attributes inStoredDatacontinue to reference "experiment" (e.g.experiment_type), since the DB service is still called experiment service.Details and comments