Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Remove MongoDB dependency from core tests #582

Merged
merged 26 commits into from
Apr 1, 2021

Conversation

breuleux
Copy link
Contributor

@breuleux breuleux commented Mar 30, 2021

Description

Progress towards #571

Changes

The tests have been fixed in the following files:

  • tests/functional/branching/test_branching.py
  • tests/functional/client/test_cli_client.py
  • tests/functional/commands/test_db_rm.py
  • tests/functional/commands/test_db_set.py
  • tests/functional/commands/test_info_command.py
  • tests/functional/commands/test_insert_command.py
  • tests/functional/commands/test_list_command.py
  • tests/functional/commands/test_setup_command.py
  • tests/functional/commands/test_status_command.py
  • tests/functional/demo/test_demo.py
  • tests/functional/parsing/test_parsing_base.py
  • tests/functional/parsing/test_parsing_hunt.py
  • tests/functional/parsing/test_parsing_insert.py
  • tests/unittests/core/cli/test_checks.py
  • tests/unittests/core/evc/test_resolutions.py
  • tests/unittests/core/evc/test_conflicts.py
  • tests/unittests/core/worker/test_consumer.py
  • tests/unittests/core/worker/test_experiment.py
  • tests/unittests/core/worker/test_producer.py
  • tests/unittests/core/worker/test_trial.py
  • tests/unittests/core/worker/test_trial_pacemaker.py
  • tests/unittests/core/worker/test_trials_history.py
  • tests/unittests/core/test_branch_config.py
  • tests/unittests/core/test_insert.py

I added the storage and pdatabase fixtures in conftest.py to provide the empty PickledDB storage (since setup_pickleddb_database does not yield it).

  • test_conflicts.py::TestExperimentNameConflict.test_conflict_exp_w_child_as_parent still fails, possibly because of _id (was a string in MongoDB, not in PickledDB).
  • test_producer.py is giving me problems, namely orion.core.utils.exceptions.NoConfigurationError: Experiment supernaedo2-dendi does not exist in DB and space was not defined.

@@ -107,7 +107,7 @@ def define_database_config(config):
database_config.add_option(
"type",
option_type=str,
default="MongoDB",
default="PickledDB",
env_var="ORION_DB_TYPE",
help=(
"Type of database. Builtin backends are ``mongodb``, "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will need to change the host below as well. You should set it to default host for pickleddb: https://github.com/Epistimio/orion/blob/develop/src/orion/core/io/database/pickleddb.py#L25

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Shouldn't the default here depend on the actual value of the type option? I mean, if type is set to mongodb then host should be localhost by default, not PickledDB's DEFAULT_HOST.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally yes, but Configuration does not support conditional defaults like this for the moment. :(

Copy link
Contributor Author

@breuleux breuleux Mar 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default could be None or "", though, and then the storage's constructor can set it to its default if it sees the host argument is null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

@@ -12,7 +12,7 @@ def execute(command, assert_code=0):
assert returncode == assert_code


def test_no_exp(clean_db, capsys):
def test_no_exp(pdatabase, capsys):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_no_exp(pdatabase, capsys):
def test_no_exp(setup_pickleddb_database, capsys):

@@ -12,7 +12,7 @@ def execute(command, assert_code=0):
assert returncode == assert_code


def test_no_exp(clean_db, capsys):
def test_no_exp(pdatabase, capsys):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_no_exp(pdatabase, capsys):
def test_no_exp(setup_pickleddb_database, capsys):


assert len(exp) == 1
exp = exp[0]
assert "_id" in exp

trials = list(database.trials.find({"experiment": exp["_id"]}))
trials = list(storage._fetch_trials({"experiment": exp["_id"]}))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
trials = list(storage._fetch_trials({"experiment": exp["_id"]}))
trials = list(storage.fetch_trials(uid=exp["_id"]))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On pourrait aussi faire storage.fetch_trials(experiment=exp), right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pas si c'est un dict.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, true. But why does fetch_experiments return dicts rather than Experiment objects? fetch_trials does return Trial objects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it is inconsistent. It is because we need to make more complex operations after fetching the experiment config in order to instantiate it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see. What kind of operations?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We fetch the experiment config, then compare the config from DB with the config passed by user and detect if there is any modification. If there is we branch and create a new experiment that we register in the database.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are planning to divide the experiment class into 2, one being a data container and the other being the client object giving access to operations with the database (like selecting trials, creating trials). When this is done we could make fetch_experiment returns the experiment data containers, just like it is done for the trials.

@@ -139,10 +140,12 @@ def test_update_and_produce(producer, database, random_dt):
assert producer.naive_algorithm.algorithm._suggested == possible_values


def test_register_new_trials(producer, database, random_dt):
def test_register_new_trials(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not storage fixture instead of get_storage()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... there's a weird bug about this. If I use the storage fixture, the test fails, because the storage is empty, a bit as if the fixture was run a second time after hacked_exp, but I tried to check and I don't think it is? Either way I suspect fixture shenanigans and I didn't feel like debugging that, but I could look into it if you want, after everything else is fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, let's see if time permits, otherwise we keep it that way.

query = {"status": {"$ne": "completed"}, "experiment": producer.experiment.id}
database.trials.remove(query)
trials_in_db_before = database.trials.count({"experiment": producer.experiment.id})
storage = get_storage()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

assert trials_in_db_before == 3

producer.update()

assert len(produce_lies(producer)) == 0


def test_lies_generation(producer, database, random_dt):
def test_lies_generation(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem



def test_register_lies(producer, database, random_dt):
def test_register_lies(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem



def test_register_duplicate_lies(producer, database, random_dt):
def test_register_duplicate_lies(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

assert len(lying_trials) == 5


def test_register_duplicate_lies_with_different_results(producer, database, random_dt):
def test_register_duplicate_lies_with_different_results(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

def test_naive_algo_not_trained_when_all_trials_completed(
producer, database, random_dt
):
def test_naive_algo_not_trained_when_all_trials_completed(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

@@ -336,28 +338,27 @@ def test_naive_algo_not_trained_when_all_trials_completed(
assert len(producer.naive_algorithm.algorithm._points) == 3


def test_naive_algo_trained_on_all_non_completed_trials(producer, database, random_dt):
def test_naive_algo_trained_on_all_non_completed_trials(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

@@ -395,10 +396,12 @@ def test_naive_algo_is_discared(producer, database, monkeypatch):
assert len(producer.naive_algorithm.algorithm._points) == (3 + 5)


def test_concurent_producers(producer, database, random_dt):
def test_concurent_producers(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

}


def test_duplicate_within_pool(producer, random_dt):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

"""Test that an algo suggesting multiple points can have a few registered even
if one of them is a duplicate with db.
"""
trials_in_db_before = database.trials.count()
new_trials_in_db_before = database.trials.count({"status": "new"})
storage = get_storage()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

"""Test that producer stops producing when algo is done."""
storage = get_storage()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem

parent_id = None
update = {"refers.root_id": root_id, "refers.parent_id": parent_id}
db.write("experiments", update, query)
def hacked_exp(with_user_dendi, random_dt, storage):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see what the issue is. You do not need storage if you use the contextmanager OrionState because the latter is creating a new storage object. But since you are returning the exp in this fixture, you don't have access directly to the storage in the tests. A solution would be to avoid OrionState and do storage.create_experiment(dendi_exp_config) with storage._db.write('trials', dendi_base_trials).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, okay, that makes sense. I will do what you describe so that the fixtures are coherent. Thanks!

@bouthilx bouthilx merged commit 13b6bd8 into Epistimio:develop Apr 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants