Use EVC tree trials in producer #347
Why: It seems the EVC tree was not used inside the producer, making the EVC somewhat useless beyond traceability between experiments.
Why: The check against duplicates relies on the trial id, which is a hash based on the experiment id and the params. The problem is that trials of different experiments in the EVC have different experiment ids, and thus different hashes for the same parameters. To avoid duplicates between two experiments in the EVC, we need to rely on a check before the registration of the trial. This can only work if there are no parallel workers on the two different experiments at the same time. This should be the case since experiments are typically run sequentially, and parallelism occurs within a single experiment.
How: Keep track of a hash of each trial based on params only (including fidelity) inside the producer, and verify whether a new trial already exists before registering it. DuplicateKeyError may still be raised during registration if two workers try to register the same trial simultaneously.
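To make the idea concrete, here is a minimal sketch of the params-only check. It is not the code from this PR: `trial_id`, `params_hash` and `ProducerSketch` are hypothetical names, and the use of MD5 over a JSON dump is an assumption made only for illustration.

```python
import hashlib
import json


def trial_id(experiment_id, params):
    """Hypothetical id scheme: hash of the experiment id plus the params."""
    payload = json.dumps({"experiment": experiment_id, "params": params}, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def params_hash(params):
    """Hash of the params only (fidelity included), stable across experiments."""
    payload = json.dumps(params, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


class ProducerSketch:
    """Hypothetical producer that remembers params hashes from the whole EVC tree."""

    def __init__(self, evc_trials_params):
        # Seed with the params of every trial already present in the EVC tree.
        self.seen = {params_hash(params) for params in evc_trials_params}
        self.registered = []

    def register(self, params):
        """Register params unless an identical trial exists; return True if registered."""
        key = params_hash(params)
        if key in self.seen:
            return False
        self.seen.add(key)
        self.registered.append(params)
        return True


parent_params = {"lr": 0.1, "epochs": 10}  # "epochs" plays the role of the fidelity
# The same params get different trial ids in two experiments of the EVC tree...
ids_differ = trial_id("parent", parent_params) != trial_id("child", parent_params)
# ...so the child producer checks the params-only hash before registering.
producer = ProducerSketch([parent_params])
duplicate_accepted = producer.register({"epochs": 10, "lr": 0.1})
new_accepted = producer.register({"lr": 0.01, "epochs": 10})
```

Sorting the keys before hashing makes the check independent of the order in which the params are listed. The check is local to one producer, so it only holds under the assumption stated above that no workers run on the two experiments at the same time.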
15030b0 to f58b5fc
Codecov Report
@@             Coverage Diff             @@
##           develop     #347      +/-   ##
===========================================
- Coverage    43.61%   43.49%   -0.13%
===========================================
  Files           64       64
  Lines        11351    11398      +47
  Branches       271      271
===========================================
+ Hits          4951     4957       +6
- Misses        6375     6416      +41
  Partials        25       25
IMO EVC & lying trials are adding too many subtleties to HPO & Orion in general.
At the moment, with only orion.algo.skopt, which is sequential, the parallel strategy isn't that useful indeed, but when we add more I think it may become worth it. Not sure though, I do find it annoying. As for the EVC, it hasn't proven its value yet and is very annoying to maintain, but we are working on transfer learning techniques that may make it very useful. I'm not sure it's worth factoring out, because we would need to maintain both and there would be many duplicates in the pipelines. I think the framework API that is coming soon with the Python API will address part of these issues, as it will be independent of the EVC and allow using parts of Oríon separately.
Use EVC tree trials in producer
Why:
It seems the EVC tree was not used inside the producer, making the EVC
somewhat useless beyond traceability between experiments.
Add a check to avoid duplicates across EVC
Why:
The check against duplicates relies on the trial id, which is a hash
based on the experiment id and the params. The problem is that trials of
different experiments in the EVC have different experiment ids, and thus
different hashes for the same parameters. To avoid duplicates between
two experiments in the EVC, we need to rely on a check before the
registration of the trial. This can only work if there are no parallel
workers on the two different experiments at the same time. This should
be the case since experiments are typically run sequentially, and
parallelism occurs within a single experiment.
How:
Keep track of a hash of each trial based on params only (including
fidelity) inside the producer, and verify whether a new trial already
exists before registering it. DuplicateKeyError may still be raised
during registration if two workers try to register the same trial
simultaneously.
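
The remaining race can be pictured with a small sketch. This is an illustration under assumptions, not the storage code of the project: the `DuplicateKeyError` class below is a local stand-in, and `InMemoryStorage` and `try_register` are hypothetical names. A worker that loses the race catches the error and moves on instead of crashing.

```python
class DuplicateKeyError(Exception):
    """Local stand-in for the error raised when a unique key already exists."""


class InMemoryStorage:
    """Hypothetical storage that enforces uniqueness on the trial id."""

    def __init__(self):
        self._trials = {}

    def register_trial(self, trial_id, params):
        if trial_id in self._trials:
            raise DuplicateKeyError(trial_id)
        self._trials[trial_id] = params


def try_register(storage, trial_id, params):
    """Return True if this worker registered the trial, False if another one did first."""
    try:
        storage.register_trial(trial_id, params)
    except DuplicateKeyError:
        return False
    return True


storage = InMemoryStorage()
first_worker = try_register(storage, "abc123", {"lr": 0.1})
second_worker = try_register(storage, "abc123", {"lr": 0.1})
```

Here the second call stands for a second worker that passed the pre-registration check at the same moment as the first one and only finds out about the duplicate at registration time.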