
Add storage import and export features to command line and web API #1082

Merged
merged 79 commits into Epistimio:develop from import-export-cli-and-webapi on Aug 14, 2023

Conversation

notoraptor
Collaborator

Description

Hi @bouthilx! This is a pull request to add export (dump) and import (load) features to Oríon, available via new command lines and new web API entries. The new features use storage objects instead of the underlying databases.

This should fix #953

NB: important remarks:

  • I had to add a few modifications to BaseStorageProtocol
  • Overall, import/export may be slower when using storage instead of the database to import/export the whole data set. That's because we must make a storage call for each piece of data (benchmark or experiment) we manage, hence one database call per manipulation, whereas with the database object we could group queries and read/write many records at once (see the sketch below).
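A minimal sketch of the difference (the function and collection names here are illustrative, not Oríon's actual API):

```python
# Hypothetical comparison of per-item storage calls vs. one batched query.

def export_via_storage(storage, experiment_names):
    # One storage call, hence one database round trip, per experiment.
    return [storage.fetch_experiment(name) for name in experiment_names]

def export_via_database(database, experiment_names):
    # A single query fetches all matching experiments at once.
    return database.read("experiments", {"name": {"$in": experiment_names}})
```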

Changes

  • Add a new module storage_backup that provides two functions: dump_database for export and load_database for import.
  • Add new command lines using these functions: orion db dump and orion db load.
  • Add new web API entries: /dump for export, /load for import, and /import-status to monitor imports (a usage sketch follows below).
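
A rough usage sketch of the new endpoints (illustrative only; the server address, route parameters, and response field names are assumptions):

```python
import requests

BASE = "http://localhost:8000"  # hypothetical address of a running Oríon web API

# Export: download a dump of the storage.
response = requests.get(f"{BASE}/dump")
with open("dump.pkl", "wb") as f:
    f.write(response.content)

# Import: POST the dump file, then poll /import-status to monitor progress.
with open("dump.pkl", "rb") as f:
    task = requests.post(f"{BASE}/load", files={"file": f}).json()
# The "task" field name is an assumption for this sketch.
print(requests.get(f"{BASE}/import-status/{task['task']}").json())
```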

Checklist

Tests

  • I added corresponding tests for bug fixes and new features. If possible, the tests fail without the changes
  • All new and existing tests are passing ($ tox -e py38; replace 38 with your Python version if necessary)

Documentation

  • I have updated the relevant documentation related to my changes

Quality

  • I have read the CONTRIBUTING doc
  • My commit messages follow this format
  • My code follows the style guidelines ($ tox -e lint)

docs/src/user/storage.rst (review thread; outdated, resolved)
@bouthilx
Member

The *.pickled files are PickledDB databases, right? They should not be included in the repo; rather, they should be generated automatically by the test fixtures.
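
For reference, such a file could be generated on the fly by a fixture along these lines (a sketch assuming the PickledDB import path, host argument, and write() signature shown; the sample document is made up):

```python
import pytest

from orion.core.io.database.pickleddb import PickledDB

@pytest.fixture
def pickled_db(tmp_path):
    """Build a PickledDB file per test instead of committing *.pickled files."""
    db = PickledDB(host=str(tmp_path / "test.pkl"))
    # Illustrative content only; real fixtures would insert full experiment documents.
    db.write("experiments", {"name": "test_single_exp", "version": 1})
    return db
```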

notoraptor added 16 commits May 12, 2023 08:33
Add a lock to ImportTask to prevent concurrent execution when updating task info.

Encapsulate ImportTask to make sure lock is used when necessary.
Try to fix failing CI tests. Tests seem to pass for Python 3.8 but not for Python 3.7.

- In Python 3.8, we can clear previous logging handlers before setting a new stream simply by using the new `force` argument of logging.basicConfig
- In Python 3.7, we must clear previous handlers manually before setting a new stream.
- Now that storage_resource logging depends on the root logger, we must configure client logging in unit tests (using caplog) to ensure the expected logging messages are printed in the import sub-process.
- Using caplog, we can rewrite the logging unit test with a simulated client instead of launching a real sub-process server.
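
The version-dependent reset amounts to something like this (a sketch; `new_stream` stands for whatever stream the sub-process should log to):

```python
import logging
import sys

def reset_root_logging(new_stream, level=logging.INFO):
    if sys.version_info >= (3, 8):
        # Python 3.8+: force=True removes existing handlers automatically.
        logging.basicConfig(stream=new_stream, level=level, force=True)
    else:
        # Python 3.7: remove existing handlers manually first.
        root = logging.getLogger()
        for handler in list(root.handlers):
            root.removeHandler(handler)
        logging.basicConfig(stream=new_stream, level=level)
```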
Clean dumped file if an error occurs when dumping, **except** if the file already existed and *no* overwrite was specified.
Clean dumped files only if no error occurs in storage_resource (files should have been deleted if an error occurred)
notoraptor force-pushed the import-export-cli-and-webapi branch from a8406e7 to e6b23fc on May 12, 2023 13:46
Work on temporary file when dumping and move it to output file only if no error occurred

- Move function _gen_host_file() from orion/serving/storage_resource to orion/core/utils and rename to generate_temporary_file().
three_experiments_branch_same_name_trials_benchmarks, capsys, testing_helpers
):
"""Test dump with overwrite argument"""
with tempfile.TemporaryDirectory() as tmp_dir:
Member:

You can avoid this line by using the fixture tmp_path: https://docs.pytest.org/en/7.1.x/how-to/tmp_path.html
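
For illustration, the suggested fixture makes the test look roughly like this (a sketch; the test name and body are illustrative):

```python
def test_dump_overwrite(tmp_path, capsys, testing_helpers):
    """Test dump with overwrite argument."""
    # pytest provides tmp_path as a fresh per-test directory and cleans it up,
    # so no explicit tempfile.TemporaryDirectory() context is needed.
    output = tmp_path / "dump.pkl"
    ...
```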

Collaborator (Author):

Done (here and for all comments below): 91a10d9

three_experiments_branch_same_name_trials_benchmarks, capsys, testing_helpers
):
"""Test dump to a specified output file"""
with tempfile.TemporaryDirectory() as tmp_dir:
Member:

Same here

):
"""Test how dumped file is cleaned if dump fails."""

with tempfile.TemporaryDirectory() as tmp_dir:
Member:

idem

three_experiments_branch_same_name_trials_benchmarks, capsys
):
"""Test dump unknown experiment"""
with tempfile.TemporaryDirectory() as tmp_dir:
Member:

idem

algo_state,
):
"""Test dump experiment test_single_exp"""
with tempfile.TemporaryDirectory() as tmp_dir:
Member:

idem

@Delaunay Delaunay merged commit f485f62 into Epistimio:develop Aug 14, 2023
@notoraptor notoraptor deleted the import-export-cli-and-webapi branch August 15, 2023 14:12
NeilGirdhar pushed a commit to NeilGirdhar/orion that referenced this pull request Oct 22, 2023
Add storage import and export features to command line and web API (Epistimio#1082)

* Add files for scripts `orion db dump` and `orion db load`

* Write orion db dump

* Write orion db load for parameter --exp

* Simplify code to get an experiment from its name

* Wrap destination database into a storage.
Move dump logic into a function.

* dump:
- add and use CLI basic arguments
- use both version and name to find experiment to dump

* Rewrite load

* Hardcode collection list in dump

* Raise runtime error if resolve is 'bump' for benchmarks.

* - Add tests for orion db dump
- Raise DatabaseError when an error occurs in the script
- If many experiments are found for a given name, pick the experiment with the lowest version

* - Add tests for orion db load
- Raise DatabaseError when errors occur in the script
- If many experiments are found for a given name, pick the experiment with the lowest version

* Reformat code

* Move function dump_database() into new module orion.core.worker.storage_backup

* Add entry /dump to Web API.

* [web api] Add download suffix to dumped file

* Use one module for both import/export web API endpoints.

* [web api] Receive a POST request to import data

* Add function load_database into module storage_backup and move import logic into this function.

* Rename param `experiment` to `name` for function dump_database() so it has the same param names as load_database()

* Check conflicts before making import.

* [Web API]

Make load entry launch a process to manage import

Capture import progress messages printed by log

Add entry import-status to get import progress messages and status

* [web api] Allow following import progress using a callback in the backend

* Add documentation for web API.

* Add documentation for command line.

* Add tests for web API /dump, /load and /import-status

* Fix tests.

* Move main functions to top of module storage_backup

* For dump from pickledb, use the db object directly instead of locking the database first (the database is always locked before any read/write/remove operation)

* Use storage instead of database for export.
- NB: With this new interface, dumping the whole database is slower, because we must dump experiments one by one, making many more database calls.

Add heartbeat field to LockedAlgorithmState

Update BaseStorageProtocol interface:
- Allow setting the initial algo state in create_experiment()
- Allow setting the algo state in initialize_algorithm_lock(), and rename it to write_algorithm_lock()

Update test_db_dump
- With the new dumping interface, only algos related to available experiments are dumped; dumped data no longer contains algos related to unknown experiments
- Check dumped algo state for an experiment

* Update doc in storage_resource
Update test for storage_resource

* For load (from pickledb file), use the db object directly instead of locking the database first (the database is always locked before any read/write/remove operation)

* Add benchmarks to tested databases for dump/load.

* Use storage instead of database to import.
BaseStorageProtocol:
- add new function delete_benchmark() and implement it in child class Legacy.

* Check progress callback messages in test_db_load:test_load_overwrite()

* Fix pylint

* Try to set logging level in running import task

* Update docs/src/user/storage.rst

Co-authored-by: Xavier Bouthillier <[email protected]>

* Allow not specifying a resolve strategy. If none is specified, an exception is raised as soon as a conflict is detected (a sketch follows below).
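
A minimal sketch of that behaviour (names are illustrative, not Oríon's actual API; ignore/overwrite/bump are the strategies discussed in this PR):

```python
def check_conflict(dst_names, experiment, resolve=None):
    # Decide what to do with an experiment that may already exist in dst.
    if experiment["name"] not in dst_names:
        return "insert"
    if resolve is None:
        raise RuntimeError(
            f"Conflict for experiment {experiment['name']!r}: no resolve "
            "strategy specified (expected ignore, overwrite, or bump)."
        )
    return resolve
```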

* Log the storage object instead of `storage._db` in the `dump` and `load` command lines.

* Storage export: add an option `--force` to explicitly overwrite the dumped file if it already exists.

* Import/export: if not specified, get the latest instead of the oldest version of a specific experiment

* Use NamedTemporaryFile to generate temporary file in storage_resource.

* Rewrite docstring for class ImportTask in Numpy style.

* Fix a function name

* Rename test fixture used to test storage export.

* Add comment in test_dump_unknown_experiment() to explain why output file is created in any case.

* Remove unused logger in module storage_resource.

* Write generic checking functions for dump unit tests that also verify the expected number of child experiments or trials.

Discovered a corner case: imported/exported experiments keep old `refers` links.
Currently, refers links are removed if only one specified experiment is dumped.

* Update TODO

* Regenerate experiment parent links in dst according to src when dumping, using correct dst IDs.

* Regenerate experiment parent links and experiment-to-trial links in dst according to src when loading, using correct dst IDs.
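
The link regeneration above boils down to remapping source IDs to destination IDs, roughly (an illustrative sketch, not Oríon's actual code):

```python
def copy_with_links(src_experiments, create_in_dst, update_in_dst):
    # First pass: copy experiments; the destination assigns fresh IDs.
    id_map = {}  # source experiment ID -> destination experiment ID
    for exp in src_experiments:
        id_map[exp["_id"]] = create_in_dst(exp)
    # Second pass: rewrite parent links using the destination IDs.
    for exp in src_experiments:
        parent = exp.get("refers", {}).get("parent_id")
        if parent is not None:
            update_in_dst(id_map[exp["_id"]], parent_id=id_map[parent])
```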

* Factorize tests in test_db_load and do not use pre-computed PKL files anymore (use pkl_* fixtures instead)

* Refactor code for test_storage_resource.

* Use test_helpers for test_db_dump

* Remove useless tests/functional/commands/__init__.py
Add a test to check default storage content (we have more algos than expected)

* Check new data are ignored or indeed written when importing with ignore or overwrite resolution.

* Set and check trial links.

* Add tests/functional/conftest

* Regenerate links to root experiments in imported/exported experiments.

* Add common function to write experiment in a destination storage, to use for both dump and load features.

* Add a lock to ImportTask to prevent concurrent execution when updating task info.
Encapsulate ImportTask to make sure lock is used when necessary.
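
The locking pattern described here is essentially (a sketch; field names are illustrative, not Oríon's actual class):

```python
import threading

class ImportTask:
    """Sketch of lock-guarded task-info updates."""

    def __init__(self):
        self._lock = threading.Lock()
        self.messages = []
        self.finished = False

    def add_message(self, message):
        with self._lock:  # serialize concurrent updates to task info
            self.messages.append(message)

    def finish(self):
        with self._lock:
            self.finished = True
```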

* Use Orion TreeNode to manage experiment and trial links.

* Try to fix failing CI tests. Tests seem to pass for Python 3.8 but not for Python 3.7.
- In Python 3.8, we can clear previous logging handlers before setting a new stream simply by using the new `force` argument of logging.basicConfig
- In Python 3.7, we must clear previous handlers manually before setting a new stream.

* Correctly update algorithm when setting deterministic experiment ID in tests/functional/commands/conftest

* Remove irrelevant calls to `logger.setLevel`

* Remove irrelevant calls to `logger.setLevel` that were added in this PR.

* [web api] make sure to transfer main root log level into import sub-process

* [test_storage_resource]
- Now that storage_resource logging depends on the root logger, we must configure client logging in unit tests (using caplog) to ensure the expected logging messages are printed in the import sub-process.
- Using caplog, we can rewrite the logging unit test with a simulated client instead of launching a real sub-process server.

* Clean dumped file if an error occurs when dumping, **except** if the file already existed and *no* overwrite was specified.

* Remove useless pylint comment in `orion.core.cli.db.load`

* Move module `orion.core.worker.storage_backup` into `orion.storage.backup`

* Remove fixed TODO

* Remove fixed TODO in test_db_load

* Clean dumped files only if no error occurs in storage_resource (files should have been deleted if an error occurred)

* Test dump using a temporary output path for most unit tests except `test_dump_default`

* - Work on a temporary file when dumping and move it to the output file only if no error occurred
- Move function _gen_host_file() from orion/serving/storage_resource to orion/core/utils and rename it to generate_temporary_file().
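
That flow amounts to the usual write-then-move pattern (a sketch; helper names are illustrative):

```python
import os
import shutil
import tempfile

def safe_dump(write_dump, output_path):
    # Dump into a temporary file first; publish it only on success.
    fd, tmp_path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        write_dump(tmp_path)
    except Exception:
        os.remove(tmp_path)  # never leave a partial dump behind
        raise
    shutil.move(tmp_path, output_path)
```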

* Use pytest fixture `tmp_path` instead of manually-created temp dir to test db dump.

---------

Co-authored-by: Xavier Bouthillier <[email protected]>
Co-authored-by: Setepenre <[email protected]>
NeilGirdhar pushed a commit to NeilGirdhar/orion that referenced this pull request Nov 16, 2023
Development

Successfully merging this pull request may close these issues.

Support storage checkpointing and restore
3 participants