Use EVC tree trials in producer #347

Merged 2 commits on Mar 3, 2020

bouthilx (Member) commented Mar 3, 2020

Use EVC tree trials in producer

Why:

It seems the EVC tree was not used inside the producer, which made the EVC
somewhat useless beyond traceability between experiments.

Add a check to avoid duplicates across EVC

Why:

The check against duplicates relies on the trial id, which is a hash
based on the experiment id and the params. The problem is that trials of
different experiments in the EVC have different experiment ids, and thus
different hashes for the same parameters. To avoid duplicates between two
experiments in the EVC, we need to rely on a check before the
registration of the trial. This can only work if there are no parallel
workers on the 2 different experiments at the same time. This should be
the case since experiments are typically run sequentially, and
parallelism occurs within a single experiment.
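
To illustrate the problem described above, here is a minimal sketch. It is not Oríon's actual hashing code: `trial_id`, `params_hash`, the use of MD5 over JSON, and the experiment names are all assumptions made for the example.

```python
import hashlib
import json


def trial_id(experiment_id, params):
    """Hypothetical trial id: hash of the experiment id together with the params."""
    payload = json.dumps({"experiment": experiment_id, "params": params}, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def params_hash(params):
    """Hypothetical hash based on the params only, ignoring the experiment."""
    payload = json.dumps(params, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


params = {"lr": 0.01, "epochs": 10}

# Same params, two experiments of the same EVC tree: the ids differ,
# so an id-based duplicate check cannot see that the trial already exists.
parent_id = trial_id("parent-exp", params)
child_id = trial_id("child-exp", params)

# A hash over the params alone is identical whichever experiment owns the trial.
shared_hash = params_hash(params)
```

With these definitions, `parent_id != child_id`, while `params_hash` returns the same value for the same parameters in either experiment.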

How:

Keep track of the hashes of trials based on params only (including fidelity)
inside the producer, and verify whether a new trial already exists before
registering it. DuplicateKeyError may still be raised during
registration if 2 workers try to register the same trial simultaneously.
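
The approach above could look roughly like the following sketch. Everything here is a stand-in rather than the code in this PR: `Producer`, `InMemoryStorage`, `params_hash`, and the locally defined `DuplicateKeyError` are hypothetical names used only to show the check-before-register idea.

```python
import hashlib
import json


class DuplicateKeyError(Exception):
    """Stand-in for the storage error raised when a trial id already exists."""


def params_hash(params):
    """Hash based on the params only (fidelity included as a regular param)."""
    return hashlib.md5(json.dumps(params, sort_keys=True).encode("utf-8")).hexdigest()


class InMemoryStorage:
    """Toy storage keyed by a hash of (experiment id, params)."""

    def __init__(self):
        self.trials = {}

    def register(self, experiment_id, params):
        payload = json.dumps([experiment_id, params], sort_keys=True)
        key = hashlib.md5(payload.encode("utf-8")).hexdigest()
        if key in self.trials:
            raise DuplicateKeyError(key)
        self.trials[key] = params
        return key


class Producer:
    """Registers suggested trials, skipping params already seen in the EVC tree."""

    def __init__(self, experiment_id, storage, evc_tree_params):
        self.experiment_id = experiment_id
        self.storage = storage
        # Seed the set with the trials of every experiment in the EVC tree.
        self.seen = {params_hash(p) for p in evc_tree_params}

    def produce(self, suggestions):
        registered = []
        for params in suggestions:
            key = params_hash(params)
            if key in self.seen:
                continue  # duplicate across the EVC tree, do not register
            try:
                self.storage.register(self.experiment_id, params)
            except DuplicateKeyError:
                pass  # another worker registered the same trial first
            else:
                registered.append(params)
            self.seen.add(key)
        return registered


storage = InMemoryStorage()
producer = Producer("child-exp", storage, evc_tree_params=[{"lr": 0.01, "fidelity": 1}])
new_trials = producer.produce([{"lr": 0.01, "fidelity": 1}, {"lr": 0.1, "fidelity": 1}])
```

In this sketch the first suggestion is skipped because the parent experiment already holds a trial with the same params, and only the second one is registered. The `try`/`except` remains because the in-producer set only protects a single worker.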

@bouthilx bouthilx requested a review from Delaunay March 3, 2020 03:26
@bouthilx bouthilx force-pushed the fix/use_evc_in_producer branch from 15030b0 to f58b5fc Compare March 3, 2020 04:31
@codecov-io

Codecov Report

Merging #347 into develop will decrease coverage by 0.12%.
The diff coverage is 12.76%.

@@             Coverage Diff             @@
##           develop     #347      +/-   ##
===========================================
- Coverage    43.61%   43.49%   -0.13%     
===========================================
  Files           64       64              
  Lines        11351    11398      +47     
  Branches       271      271              
===========================================
+ Hits          4951     4957       +6     
- Misses        6375     6416      +41     
  Partials        25       25

| Impacted Files | Coverage Δ | |
|---|---|---|
| tests/unittests/core/test_trial.py | 18.68% <12.5%> (-0.6%) | ⬇️ |
| tests/unittests/core/worker/test_producer.py | 10.92% <12.9%> (ø) | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1c573d5...f58b5fc.

Delaunay (Collaborator) commented Mar 3, 2020

IMO EVC & lying trials are adding too many subtleties to HPO & Orion in general.
I think EVC should be factored out & lying trials replaced by something less hacky.

bouthilx (Member, Author) commented Mar 3, 2020

At the moment, with orion.algo.skopt as the only sequential algorithm, the parallel strategy indeed isn't that useful, but when we add more I think it may become worth it. I'm not sure though, and I do find it annoying.

As for the EVC, it hasn't proven its value yet and is very annoying to maintain, but we are working on transfer learning techniques that may make it very useful. I'm not sure it's worth factoring out, because we would need to maintain both and there would be a lot of duplication in the pipelines. I think the framework API that is coming soon with the Python API will address part of these issues, as it will be independent of the EVC and allow using parts of Oríon separately.

@bouthilx bouthilx merged commit c055304 into Epistimio:develop Mar 3, 2020
@bouthilx bouthilx added the v0.1.8 label Mar 8, 2020
This was referenced Jun 23, 2020