Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Rotbaum to handle short series #3065

Closed
wants to merge 9 commits into from
Closed

Conversation

leica2023
Copy link
Contributor

Issue #, if available:

Description of changes:
for items with less than forecast_horizon historical steps, jobs failed due to error 'Number of features of the model must match the input. Model n_features_ is 88 and input n_features is 60'. This PR fixes this error.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Please tag this pr with at least one of these labels to make our release process faster: BREAKING, new feature, bug fix, other change, dev setup

@lostella
Copy link
Contributor

lostella commented Nov 28, 2023

Thanks @leica2023! I think it would make sense to add a small test that reproduces the issue, and that would fail before this fix.

More at a high level: in this PR, the make_features method is being fixed to add padding whenever target is too short. This method appears to only be invoked a prediction time: is this going to work as intended? Will the model be trained to understand how to treat those padded series?

@lostella
Copy link
Contributor

Also fix the styling issue by applying black

@@ -454,7 +454,7 @@ def make_features(self, time_series: Dict, starting_index: int) -> List:
prefix = [None] * abs(starting_index)
else:
prefix = []
time_series_window = time_series["target"][starting_index:end_index]
time_series_window = time_series["target"] if prefix else time_series["target"][starting_index:end_index]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would include this in the previous if else above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, makes sense.

@leica2023
Copy link
Contributor Author

Thanks @leica2023! I think it would make sense to add a small test that reproduces the issue, and that would fail before this fix.

More at a high level: in this PR, the make_features method is being fixed to add padding whenever target is too short. This method appears to only be invoked a prediction time: is this going to work as intended? Will the model be trained to understand how to treat those padded series?

Correct. The issue happens at prediction time for instances that have history length shorter than context_length. This method is also used in training data preparation, but this part ensures ts is long enough to contribute to training samples.

@lostella
Copy link
Contributor

@leica2023 need to also format the test scripts:

black src test

@@ -45,7 +45,7 @@
"import os\n",
"import pprint\n",
"import random\n",
"from scipy import stats \n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leica2023 could you revert the changes to notebooks? No need to run black on them

freq=freq,
)

predictor = TreePredictor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TreePredictor also needs importing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this and reverted ipynb styling.

@@ -59,3 +61,69 @@ def test_rotbaum_smoke(datasets):
predictor = estimator.train(dataset_train)
forecasts = list(predictor.predict(dataset_test))
assert len(forecasts) == len(dataset_test)


def test_short_history_item_pred():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leica2023 would this test fail before the fix contained in the PR? (That is, would it fail if run on the current dev branch)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct.

@lostella lostella changed the title fixed bug for items with short than H history Fix Rotbaum to handle short series Nov 30, 2023
@lostella lostella added the bug fix (one of pr required labels) label Nov 30, 2023
@lostella lostella added pending v0.13.x backport This contains a fix to be backported to the v0.13.x branch pending v0.14.x backport This contains a fix to be backported to the v0.14.x branch labels Dec 4, 2023
@lostella lostella removed pending v0.13.x backport This contains a fix to be backported to the v0.13.x branch pending v0.14.x backport This contains a fix to be backported to the v0.14.x branch labels Dec 5, 2023
@lostella lostella closed this Dec 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug fix (one of pr required labels)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants