[ADD] Robustly refit models in final ensemble in parallel #471
Conversation
Codecov Report
@@ Coverage Diff @@
## development #471 +/- ##
================================================
+ Coverage 64.65% 85.23% +20.58%
================================================
Files 231 232 +1
Lines 16304 16456 +152
Branches 3009 3048 +39
================================================
+ Hits 10542 14027 +3485
+ Misses 4714 1578 -3136
+ Partials 1048 851 -197
Hi, thanks for the PR.
I think the changes make the codebase look better.
I added some minor comments.
autoPyTorch/api/base_task.py
Outdated
if old_identifier_index is not None:
    replace_old_identifiers_to_refit_identifiers[list(self.models_.keys())[old_identifier_index]] = refit_identifier
else:
    self._logger.warning(f"Refit for {config} failed. Updating ensemble weights accordingly.")
Do we still update the ensemble weights?
Thanks, I have fixed it.
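For context, here is a minimal, hypothetical sketch of the behaviour discussed above: ensemble weights are carried over to the refitted identifiers, and members whose refit failed keep their original identifier and weight. The names (`remap_ensemble_weights`, `replacements`) are illustrative and not autoPyTorch's actual API.

```python
# Hypothetical sketch, not autoPyTorch's actual implementation.
from typing import Dict, Hashable


def remap_ensemble_weights(
    ensemble_weights: Dict[Hashable, float],
    replacements: Dict[Hashable, Hashable],
) -> Dict[Hashable, float]:
    """Carry each weight over to the refitted model's identifier.

    Identifiers without a replacement (e.g. because their refit failed)
    keep their original key and weight, so the ensemble stays consistent.
    """
    return {
        replacements.get(identifier, identifier): weight
        for identifier, weight in ensemble_weights.items()
    }


# Example: model "b_old" failed to refit, so it keeps its old identifier.
weights = {"a_old": 0.6, "b_old": 0.4}
print(remap_ensemble_weights(weights, {"a_old": "a_refit"}))
# {'a_refit': 0.6, 'b_old': 0.4}
```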
Thanks for the PR. Looks good to me!
Hi, thanks for the work!
I checked your changes and approved them:)
Thanks for the changes. I'm just leaving some minor comments.
autoPyTorch/api/base_task.py
Outdated
metric=self._metric,
dask_client=self._dask_client,
backend=self._backend,
memory_limit=self._memory_limit,
Shouldn't this use the memory limit populated above at lines 807-809:
- memory_limit=self._memory_limit,
+ memory_limit=memory_limit,
Thanks for pointing it out. I have fixed it now.
temporary_directory='./tmp/autoPyTorch_example_tmp_01',
output_directory='./tmp/autoPyTorch_example_out_01',
delete_tmp_folder_after_terminate=False,
delete_output_folder_after_terminate=False,
If the uncommenting was on purpose, I'd suggest we remove the lines above.
No, it was an artefact of debugging; I have fixed it now. Thanks.
Thanks for the changes. Looks good now.
self.config["early_stopping_rounds"] = early_stopping | ||
|
||
if self.has_val_set: | ||
early_stopping = 150 if X_train.shape[0] > 10000 else max(round(150 * 10000 / X_train.shape[0]), 10) |
We don't do early stopping if self.has_val_set is set to False?
Yeah, we can't, as we rely on external libraries to implement this for us.
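To make the heuristic above concrete, here is a small self-contained sketch; the helper name and the surrounding variables are illustrative, not the actual autoPyTorch learner code.

```python
# Illustrative sketch of the early-stopping heuristic shown above;
# names here are hypothetical, not the actual learner implementation.

def early_stopping_rounds(n_train: int) -> int:
    """Patience heuristic: 150 rounds for training sets above 10k samples;
    smaller sets get proportionally more patience (never fewer than 10 rounds)."""
    return 150 if n_train > 10000 else max(round(150 * 10000 / n_train), 10)


# Early stopping is only configured when a validation set exists, because the
# underlying libraries need one to evaluate the stopping criterion.
has_val_set = True
config = {}
if has_val_set:
    config["early_stopping_rounds"] = early_stopping_rounds(5000)
print(config)  # {'early_stopping_rounds': 300}
```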
@@ -21,7 +21,7 @@
 # noinspection PyInterpreter
 setuptools.setup(
     name="autoPyTorch",
-    version="0.2",
+    version="0.2.1",
I believe we should also update the version in __version__.py.
Yeah that's exactly my commit as well :P. I made the change but didn't add it in the previous commit.
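As a side note, a common way to avoid the two version strings drifting apart is to keep the number only in `__version__.py` and read it from `setup.py`. Below is a minimal sketch of that pattern; it is an assumption about how one could wire it, not necessarily how autoPyTorch's `setup.py` is actually written.

```python
# setup.py (sketch) -- single source of truth for the version string.
import os

import setuptools

version: dict = {}
# autoPyTorch/__version__.py is assumed to contain a line like:
#     __version__ = "0.2.1"
with open(os.path.join("autoPyTorch", "__version__.py")) as fh:
    exec(fh.read(), version)

setuptools.setup(
    name="autoPyTorch",
    version=version["__version__"],
)
```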
Looks good. I'll proceed with merging this PR.
* [FIX] Documentation and docker workflow file (#449)
* fixes to documentation and docker
* fix to docker
* Apply suggestions from code review
* add change log for release (#450)
* [FIX] release docs (#452)
* Release 0.2
* Release 0.2.0
* fix docs new line
* [FIX] ADD forecasting init design to pip data files (#459)
* add forecasting_init.json to data files under setup
* avoid undefined reference in scale_value
* checks for time series dataset split (#464)
* checks for time series dataset split
* maint
* Update autoPyTorch/datasets/time_series_dataset.py
  Co-authored-by: Ravin Kohli <[email protected]>
  Co-authored-by: Ravin Kohli <[email protected]>
* [FIX] Numerical stability scaling for timeseries forecasting tasks (#467)
* resolve rebase conflict
* add checks for scaling factors
* flake8 fix
* resolve conflict
* [FIX] pipeline options in `fit_pipeline` (#466)
* fix update of pipeline config options in fit pipeline
* fix flake and test
* suggestions from review
* [FIX] results management and visualisation with missing test data (#465)
* add flexibility to avoid checking for test scores
* fix flake and test
* fix bug in tests
* suggestions from review
* [ADD] Robustly refit models in final ensemble in parallel (#471)
* add parallel model runner and update running traditional classifiers
* update pipeline config to pipeline options
* working refit function
* fix mypy and flake
* suggestions from review
* fix mypy and flake
* suggestions from review
* finish documentation
* fix tests
* add test for parallel model runner
* fix flake
* fix tests
* fix traditional prediction for refit
* suggestions from review
* add warning for failed processing of results
* remove unnecessary change
* update autopytorch version number
* update autopytorch version number and the example file
* [DOCS] Release notes v0.2.1 (#476)
* Release 0.2.1
* add release docs
* Update docs/releases.rst
  Co-authored-by: Difan Deng <[email protected]>
Similar to `fit_pipeline`, the `refit` function now runs the models found in the final ensemble in parallel using dask. It is also robust to failures while refitting, in which case it reuses the original model instead.

Types of changes
Note that a Pull Request should only contain one of refactoring, new features or documentation changes.
Please separate these changes and send us individual PRs for each.
For more information on how to create a good pull request, please refer to The anatomy of a perfect pull request.
Checklist:
Description
To enable catching errors and adding constraints, I have used the `ExecuteTAEFuncWithQueue` class. As the code for training models in parallel is also used for running the traditional models, I have created a `run_models_on_dataset` function which encapsulates this functionality.

Motivation and Context
Currently, refit runs all the models sequentially and fails if any one of the models to be refitted fails. Moreover, there is no way to limit the time and memory used for the refit. With this PR, I have added the regular `TAE`, which is used for search and other model fittings, allowing us to exit gracefully when a refit fails as well as to add the relevant constraints. This PR fixes #469.
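To illustrate the idea (not the actual `run_models_on_dataset` or TAE implementation), here is a hedged sketch of refitting ensemble members in parallel with dask and falling back to the already-fitted model when a refit fails; all names and the toy `fit_config` function are hypothetical.

```python
# Hypothetical sketch of parallel, failure-tolerant refitting with dask;
# this is not autoPyTorch's run_models_on_dataset implementation.
from dask.distributed import Client


def fit_config(config: dict) -> str:
    """Stand-in for fitting one pipeline configuration on the dataset."""
    if config.get("broken"):
        raise RuntimeError("simulated fit failure")
    return f"model refitted with {config}"


if __name__ == "__main__":
    original_models = {"a": "previously fitted model a", "b": "previously fitted model b"}
    configs = {"a": {"lr": 0.01}, "b": {"broken": True}}

    client = Client(processes=False)  # in-process scheduler is enough for the example
    futures = {name: client.submit(fit_config, cfg) for name, cfg in configs.items()}

    refitted = {}
    for name, future in futures.items():
        try:
            # A per-model wall-clock limit plays the role of the TAE's constraints here.
            refitted[name] = future.result(timeout=60)
        except Exception:
            # Robust refit: keep the original model for this ensemble member.
            refitted[name] = original_models[name]

    print(refitted)
    client.close()
```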
How has this been tested?
I have added a test for `run_models_on_dataset` which ensures that at least one of the 5 random configs is successful. I have also extended the test for tabular classification to verify that refit works as expected, i.e., the ensemble is updated.