checks for time series dataset split #464

dengdifan · 2022-08-08T10:30:08Z

Types of changes

Breaking change (fix or feature that would cause existing functionality to not work as expected)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)

Note that a Pull Request should only contain one of refactoring, new features or documentation changes.
Please separate these changes and send us individual PRs for each.
For more information on how to create a good pull request, please refer to The anatomy of a perfect pull request.

Checklist:

My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.

Have you checked to ensure there aren't other open Pull Requests for the same update/change?
Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your core changes, as applicable?
Have you successfully run tests with your changes locally?

Description

In extreme cases, a split can leave the training set empty (see #461). This PR adds a check that removes such invalid splits and raises an error if n_prediction_steps does not fit the current dataset.
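As a rough illustration of the kind of check described here, the following sketch drops hold-out splits whose training part would be empty and raises an error when the horizon does not fit the series at all. It is not the code from this PR: the function name `valid_holdout_splits`, its parameters `series_length` and `num_splits`, and the way splits are laid out from the end of the series are all assumptions made for the example.

```python
from typing import List, Tuple


def valid_holdout_splits(
    series_length: int,
    n_prediction_steps: int,
    num_splits: int,
) -> List[Tuple[range, range]]:
    """Hypothetical helper, not autoPyTorch's implementation.

    Builds up to ``num_splits`` (train, validation) index ranges by walking
    backwards from the end of one series, keeping only splits that still
    have at least one training point.
    """
    if n_prediction_steps < 1 or num_splits < 1:
        raise ValueError("n_prediction_steps and num_splits must be positive")

    # The horizon must leave at least one observation for training.
    if series_length <= n_prediction_steps:
        raise ValueError(
            f"n_prediction_steps={n_prediction_steps} does not fit a series "
            f"of length {series_length}"
        )

    splits = []
    for k in range(num_splits):
        val_end = series_length - k * n_prediction_steps
        val_start = val_end - n_prediction_steps
        if val_start < 1:
            # Invalid split: the training part would be empty, so skip it.
            continue
        splits.append((range(0, val_start), range(val_start, val_end)))
    return splits
```

With a series of length 12, a horizon of 5 and three requested splits, only the first two splits survive in this sketch, because the third would leave nothing to train on. A series of length 5 with the same horizon raises a `ValueError` up front.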

Motivation and Context

How has this been tested?

codecov · 2022-08-08T11:13:43Z

Codecov Report

Merging #464 (62757b1) into development (c7220f7) will increase coverage by 20.83%.
The diff coverage is 100.00%.

@@               Coverage Diff                @@
##           development     #464       +/-   ##
================================================
+ Coverage        64.65%   85.49%   +20.83%     
================================================
  Files              231      231               
  Lines            16304    16311        +7     
  Branches          3009     3012        +3     
================================================
+ Hits             10542    13945     +3403     
+ Misses            4714     1528     -3186     
+ Partials          1048      838      -210

Impacted Files	Coverage Δ
autoPyTorch/datasets/time_series_dataset.py	`90.63% <100.00%> (+28.73%)`	⬆️
...bone/forecasting_encoder/seq_encoder/TCNEncoder.py	`96.46% <0.00%> (+1.76%)`	⬆️
autoPyTorch/ensemble/ensemble_selection.py	`96.87% <0.00%> (+2.08%)`	⬆️
...nts/setup/network_backbone/ShapedResNetBackbone.py	`100.00% <0.00%> (+2.08%)`	⬆️
autoPyTorch/evaluation/utils.py	`73.61% <0.00%> (+2.77%)`	⬆️
...peline/components/training/trainer/MixUpTrainer.py	`97.14% <0.00%> (+2.85%)`	⬆️
...nts/setup/early_preprocessor/EarlyPreprocessing.py	`85.71% <0.00%> (+2.85%)`	⬆️
autoPyTorch/api/base_task.py	`83.81% <0.00%> (+2.87%)`	⬆️
...ar_preprocessing/feature_preprocessing/Nystroem.py	`91.17% <0.00%> (+2.94%)`	⬆️
...casting_backbone/forecasting_decoder/RNNDecoder.py	`91.66% <0.00%> (+3.33%)`	⬆️
... and 142 more


autoPyTorch/datasets/time_series_dataset.py

ravinkohli

Thanks for the PR. I have a very minor suggestion, otherwise, this PR can be merged.

Co-authored-by: Ravin Kohli <[email protected]>

* [FIX] Documentation and docker workflow file (#449)
* fixes to documentation and docker
* fix to docker
* Apply suggestions from code review
* add change log for release (#450)
* [FIX] release docs (#452)
* Release 0.2
* Release 0.2.0
* fix docs new line
* [FIX] ADD forecasting init design to pip data files (#459)
* add forecasting_init.json to data files under setup
* avoid undefined reference in scale_value
* checks for time series dataset split (#464)
* checks for time series dataset split
* maint
* Update autoPyTorch/datasets/time_series_dataset.py
Co-authored-by: Ravin Kohli <[email protected]>
Co-authored-by: Ravin Kohli <[email protected]>
* [FIX] Numerical stability scaling for timeseries forecasting tasks (#467)
* resolve rebase conflict
* add checks for scaling factors
* flake8 fix
* resolve conflict
* [FIX] pipeline options in `fit_pipeline` (#466)
* fix update of pipeline config options in fit pipeline
* fix flake and test
* suggestions from review
* [FIX] results management and visualisation with missing test data (#465)
* add flexibility to avoid checking for test scores
* fix flake and test
* fix bug in tests
* suggestions from review
* [ADD] Robustly refit models in final ensemble in parallel (#471)
* add parallel model runner and update running traditional classifiers
* update pipeline config to pipeline options
* working refit function
* fix mypy and flake
* suggestions from review
* fix mypy and flake
* suggestions from review
* finish documentation
* fix tests
* add test for parallel model runner
* fix flake
* fix tests
* fix traditional prediction for refit
* suggestions from review
* add warning for failed processing of results
* remove unnecessary change
* update autopytorch version number
* update autopytorch version number and the example file
* [DOCS] Release notes v0.2.1 (#476)
* Release 0.2.1
* add release docs
* Update docs/releases.rst
Co-authored-by: Difan Deng <[email protected]>

checks for time series dataset split

93a923e

dengdifan requested a review from ravinkohli August 8, 2022 10:30

maint

ee2ff5c

ravinkohli reviewed Aug 8, 2022

autoPyTorch/datasets/time_series_dataset.py (outdated, resolved)

ravinkohli requested changes Aug 8, 2022

Update autoPyTorch/datasets/time_series_dataset.py

62757b1

Co-authored-by: Ravin Kohli <[email protected]>

ravinkohli merged commit faa1efd into automl:development Aug 9, 2022

ravinkohli linked an issue Aug 9, 2022 that may be closed by this pull request

A forecast horizon of 5 leads to "ValueError: 'a' cannot be empty unless no samples are taken" in time-series forecast example #461

Closed

1 task

github-actions bot pushed a commit that referenced this pull request Aug 9, 2022

Difan Deng: checks for time series dataset split (#464)

a5861b2

ravinkohli mentioned this pull request Aug 23, 2022

[RELEASE] v0.2.1 #475

Merged

10 tasks
