Hyperparameter mutation probabilities and gradual changes #103
Conversation
One idea could be to merge the Trial class and the NodeLabel class. That may simplify the code, and it would be consistent with the plan of the GraphIndividual graph holding objects that each have their own mutation/crossover methods.
node.hyperparameters = self.select_config_dict(node)[node.method_class](config.hyperparametersuggestor)

if not completed_one:
Would the else part of this if ever get hit, given that completed_one is always set to False at the start? Maybe just use the hyper_node_probability?
Good catch, just fixed it.
The completed_one flag is there to guarantee that at least one node has its hyperparameters mutated.
The else would still not get hit because of the placement of the return True statement, right? If the plan is just to mutate one node then this works fine, but if you want each node to have a probability of being mutated, this won't accomplish that.
_mutate_hyperparameters should return True only if hyperparameters were actually changed. If the first node happens to be one with a fixed set of hyperparameters, it would return True without doing anything, leading to a duplicate individual. This is eventually caught later in the population class, which loops through mutations until the individual is unique, but it might be a good idea to catch it here too. If a user wants to allow repeat individuals, those should probably come from evolution finding the same solution a second time rather than from a mutation function that doesn't do anything.
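A minimal sketch of the behavior discussed above (hypothetical names and data layout, not the actual TPOT implementation): mutate each node's hyperparameters with probability hyper_node_probability, fall back to forcing one mutation attempt so at least one node is touched, and return True only when something actually changed.

```python
import random

def mutate_one(node, rng):
    """Resample one hyperparameter; return True only if its value changed."""
    old = dict(node["params"])
    key = rng.choice(list(node["params"]))
    node["params"][key] = rng.choice(node["space"][key])
    return node["params"] != old

def mutate_hyperparameters(nodes, hyper_node_probability, rng=random):
    """Mutate each node with probability hyper_node_probability.
    Guarantees at least one mutation attempt, and reports whether
    any hyperparameter actually changed."""
    candidates = [n for n in nodes if n.get("tunable", True)]
    if not candidates:
        # Nothing can change; the caller should not treat this as a mutation.
        return False
    changed = False
    for node in candidates:
        if rng.random() < hyper_node_probability:
            changed |= mutate_one(node, rng)
    if not changed:
        # Fallback: force one attempt on a random node. This can still
        # resample the same value, in which case we honestly return False
        # and let the caller retry, avoiding silent duplicate individuals.
        changed = mutate_one(rng.choice(candidates), rng)
    return changed
```

Returning the real "did anything change" status pushes the duplicate-individual check down to where the information exists, instead of relying solely on the uniqueness loop in the population class.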
Could we not just add a
That would work.
Some of the checks have failed. After looking through the tox logs, it looks like the error happened on
The following change to that line should fix those errors:
There is another failed check happening at
My intuition is telling me that we are passing in an old_params dictionary that is empty and the
What does this PR do?
Added three parameters to GraphIndividual (and to the Estimator) to better control the probabilities of hyperparameter mutations.
- hyperparameter_probability: float from 0 to 1. The fraction of hyperparameters that get mutated per node (at least one hyperparameter will be updated).
- hyper_node_probability: float from 0 to 1. The fraction of nodes that get their hyperparameters updated (at least one node will be updated).
- hyperparameter_alpha: float from 0 to 1. Used to calculate a weighted average between the new hyperparameter value and the old one: new x alpha + old x (1 - alpha). A value of 1 means the new value is selected.

The config.hyperparameter file used to have separate functions; these have been grouped into a Trial class, which makes the code easier to read. It also allows the features above to be implemented without changing the Optuna-compatible API. Furthermore, the individual nodes now store the Optuna-suggested hyperparameters in addition to the final hyperparameters returned by the param function (these are not necessarily identical).
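A minimal sketch of the hyperparameter_alpha blending described above (hypothetical helper name, not the actual Trial implementation), assuming numeric hyperparameters:

```python
def blend_hyperparameter(new_value, old_value, alpha):
    """Weighted average of the new suggestion and the old value:
    new * alpha + old * (1 - alpha). alpha = 1.0 selects the new
    value outright; alpha = 0.0 keeps the old value."""
    blended = new_value * alpha + old_value * (1.0 - alpha)
    # Preserve integer-typed hyperparameters (e.g. a tree count)
    # by rounding back to int when both endpoints are ints.
    if isinstance(old_value, int) and isinstance(new_value, int):
        return round(blended)
    return blended
```

With alpha below 1, repeated mutations take smaller steps toward newly suggested values, which is the "gradual changes" behavior this PR aims to enable.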
Any background context you want to provide?
This will make it easier to specify probabilities for hyperparameter changes. The inclusion of the alpha parameter allows for gradual changes in a hyperparameter's value, which may make good values easier to learn; this still needs to be investigated.
What are the relevant issues?
This may be helpful for #84