-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Remove MongoDB dependency from core tests #582
Conversation
src/orion/core/__init__.py
Outdated
@@ -107,7 +107,7 @@ def define_database_config(config): | |||
database_config.add_option( | |||
"type", | |||
option_type=str, | |||
default="MongoDB", | |||
default="PickledDB", | |||
env_var="ORION_DB_TYPE", | |||
help=( | |||
"Type of database. Builtin backends are ``mongodb``, " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will need to change the host below as well. You should set it to default host for pickleddb: https://github.com/Epistimio/orion/blob/develop/src/orion/core/io/database/pickleddb.py#L25
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Shouldn't the default here depend on the actual value of the type
option? I mean, if type
is set to mongodb
then host
should be localhost
by default, not PickledDB's DEFAULT_HOST
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally yes, but Configuration
does not support conditional defaults like this for the moment. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default could be None or "", though, and then the storage's constructor can set it to its default if it sees the host argument is null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
@@ -12,7 +12,7 @@ def execute(command, assert_code=0): | |||
assert returncode == assert_code | |||
|
|||
|
|||
def test_no_exp(clean_db, capsys): | |||
def test_no_exp(pdatabase, capsys): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_no_exp(pdatabase, capsys): | |
def test_no_exp(setup_pickleddb_database, capsys): |
@@ -12,7 +12,7 @@ def execute(command, assert_code=0): | |||
assert returncode == assert_code | |||
|
|||
|
|||
def test_no_exp(clean_db, capsys): | |||
def test_no_exp(pdatabase, capsys): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_no_exp(pdatabase, capsys): | |
def test_no_exp(setup_pickleddb_database, capsys): |
|
||
assert len(exp) == 1 | ||
exp = exp[0] | ||
assert "_id" in exp | ||
|
||
trials = list(database.trials.find({"experiment": exp["_id"]})) | ||
trials = list(storage._fetch_trials({"experiment": exp["_id"]})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trials = list(storage._fetch_trials({"experiment": exp["_id"]})) | |
trials = list(storage.fetch_trials(uid=exp["_id"])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On pourrait aussi faire storage.fetch_trials(experiment=exp)
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pas si c'est un dict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, true. But why does fetch_experiments
return dicts rather than Experiment
objects? fetch_trials
does return Trial
objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it is inconsistent. It is because we need to make more complex operations after fetching the experiment config in order to instantiate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see. What kind of operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We fetch the experiment config, then compare the config from DB with the config passed by user and detect if there is any modification. If there is we branch and create a new experiment that we register in the database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are planning to divide the experiment class into 2, one being a data container and the other being the client object giving access to operations with the database (like selecting trials, creating trials). When this is done we could make fetch_experiment returns the experiment data containers, just like it is done for the trials.
@@ -139,10 +140,12 @@ def test_update_and_produce(producer, database, random_dt): | |||
assert producer.naive_algorithm.algorithm._suggested == possible_values | |||
|
|||
|
|||
def test_register_new_trials(producer, database, random_dt): | |||
def test_register_new_trials(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not storage fixture instead of get_storage()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... there's a weird bug about this. If I use the storage
fixture, the test fails, because the storage is empty, a bit as if the fixture was run a second time after hacked_exp
, but I tried to check and I don't think it is? Either way I suspect fixture shenanigans and I didn't feel like debugging that, but I could look into it if you want, after everything else is fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, let's see if time permits, otherwise we keep it that way.
query = {"status": {"$ne": "completed"}, "experiment": producer.experiment.id} | ||
database.trials.remove(query) | ||
trials_in_db_before = database.trials.count({"experiment": producer.experiment.id}) | ||
storage = get_storage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
assert trials_in_db_before == 3 | ||
|
||
producer.update() | ||
|
||
assert len(produce_lies(producer)) == 0 | ||
|
||
|
||
def test_lies_generation(producer, database, random_dt): | ||
def test_lies_generation(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
|
||
|
||
def test_register_lies(producer, database, random_dt): | ||
def test_register_lies(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
|
||
|
||
def test_register_duplicate_lies(producer, database, random_dt): | ||
def test_register_duplicate_lies(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
assert len(lying_trials) == 5 | ||
|
||
|
||
def test_register_duplicate_lies_with_different_results(producer, database, random_dt): | ||
def test_register_duplicate_lies_with_different_results(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
def test_naive_algo_not_trained_when_all_trials_completed( | ||
producer, database, random_dt | ||
): | ||
def test_naive_algo_not_trained_when_all_trials_completed(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
@@ -336,28 +338,27 @@ def test_naive_algo_not_trained_when_all_trials_completed( | |||
assert len(producer.naive_algorithm.algorithm._points) == 3 | |||
|
|||
|
|||
def test_naive_algo_trained_on_all_non_completed_trials(producer, database, random_dt): | |||
def test_naive_algo_trained_on_all_non_completed_trials(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
@@ -395,10 +396,12 @@ def test_naive_algo_is_discared(producer, database, monkeypatch): | |||
assert len(producer.naive_algorithm.algorithm._points) == (3 + 5) | |||
|
|||
|
|||
def test_concurent_producers(producer, database, random_dt): | |||
def test_concurent_producers(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
} | ||
|
||
|
||
def test_duplicate_within_pool(producer, random_dt): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
"""Test that an algo suggesting multiple points can have a few registered even | ||
if one of them is a duplicate with db. | ||
""" | ||
trials_in_db_before = database.trials.count() | ||
new_trials_in_db_before = database.trials.count({"status": "new"}) | ||
storage = get_storage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
"""Test that producer stops producing when algo is done.""" | ||
storage = get_storage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem
parent_id = None | ||
update = {"refers.root_id": root_id, "refers.parent_id": parent_id} | ||
db.write("experiments", update, query) | ||
def hacked_exp(with_user_dendi, random_dt, storage): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I see what the issue is. You do not need storage
if you use the contextmanager OrionState because the latter is creating a new storage object. But since you are returning the exp in this fixture, you don't have access directly to the storage in the tests. A solution would be to avoid OrionState and do storage.create_experiment(dendi_exp_config)
with storage._db.write('trials', dendi_base_trials)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh, okay, that makes sense. I will do what you describe so that the fixtures are coherent. Thanks!
Description
Progress towards #571
Changes
The tests have been fixed in the following files:
I added the
storage
andpdatabase
fixtures inconftest.py
to provide the empty PickledDB storage (sincesetup_pickleddb_database
does not yield it).test_conflicts.py::TestExperimentNameConflict.test_conflict_exp_w_child_as_parent
still fails, possibly because of_id
(was a string in MongoDB, not in PickledDB).test_producer.py
is giving me problems, namelyorion.core.utils.exceptions.NoConfigurationError: Experiment supernaedo2-dendi does not exist in DB and space was not defined
.