model_builder adaptations for full sklearn compat #189
Also, to simplify the code, model_config and sampler_config can now be passed only at class initialization. Other approaches, where they could also be passed to build_model, required constant checks and were polluting the code without clear benefits.
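A minimal sketch of the init-only configuration pattern described above. The class name, the default values, and the dictionary returned by build_model are hypothetical and chosen for illustration; this is not the actual ModelBuilder code.

```python
from typing import Any, Dict, Optional


class SketchModelBuilder:
    """Hypothetical stand-in for a ModelBuilder-style base class."""

    # Assumed defaults, for illustration only.
    default_model_config: Dict[str, Any] = {"intercept_sigma": 10.0}
    default_sampler_config: Dict[str, Any] = {"draws": 1000, "chains": 4}

    def __init__(
        self,
        model_config: Optional[Dict[str, Any]] = None,
        sampler_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Configs are resolved exactly once, here; later methods only read them.
        self.model_config = {**self.default_model_config, **(model_config or {})}
        self.sampler_config = {**self.default_sampler_config, **(sampler_config or {})}

    def build_model(self, X, y) -> Dict[str, Any]:
        # No config parameters, so no "was a config passed here?" checks.
        return {"n_obs": len(X), "config": self.model_config}


builder = SketchModelBuilder(sampler_config={"draws": 200})
model = builder.build_model([[1.0], [2.0]], [0.5, 1.5])
```

Because the configs are merged with the defaults in one place, every later method can rely on self.model_config and self.sampler_config being complete.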
adapting linearmodel and its tests to use sklearn-only approach
pymc_experimental/model_builder.py
- X: Union[np.ndarray, pd.DataFrame, pd.Series],
- y: Union[np.ndarray, pd.Series],
+ X: pd.DataFrame,
+ y: pd.Series,
Can we make y optional?
Sure. It will introduce some complications in child classes, since we would then technically allow building with y being None, while DelayedSaturatedMMM is built assuming y is present. In general it will probably add quite a few ifs to the code, but it shouldn't be a big deal.
The issue is that if we don't do that, it will limit the models we can fit with this, as not all of them actually have both an X and a y. For example, the change-point model only has a single time series.
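One way the optional-y idea from this thread could look, as a rough sketch. BaseSketch, RequiresTargetSketch, the _check hook, and the attribute names are all hypothetical; the real child classes (such as DelayedSaturatedMMM) are not reproduced here.

```python
from typing import Optional

import pandas as pd


class BaseSketch:
    """Hypothetical base class whose fit accepts an optional target."""

    def fit(self, X: pd.DataFrame, y: Optional[pd.Series] = None) -> "BaseSketch":
        self._check(X, y)
        self.n_obs_ = len(X)
        self.has_target_ = y is not None
        return self

    def _check(self, X: pd.DataFrame, y: Optional[pd.Series]) -> None:
        # The base class only insists on matching lengths when y is given.
        if y is not None and len(X) != len(y):
            raise ValueError("X and y must have the same number of rows")


class RequiresTargetSketch(BaseSketch):
    """Hypothetical child model that cannot be built without y."""

    def _check(self, X: pd.DataFrame, y: Optional[pd.Series]) -> None:
        # The extra "if" mentioned in the thread lives in the child class.
        if y is None:
            raise ValueError("this model requires y")
        super()._check(X, y)


# A model with a single time series can be fitted without a target.
series_only = pd.DataFrame({"value": [1.0, 1.2, 3.5, 3.7]})
change_point_like = BaseSketch().fit(series_only)
```

Keeping the None check in an overridable hook confines the added conditionals to the models that actually need y.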
Adaptations for full sklearn compatibility:
All functions expect data to be passed in the X, y format known from sklearn.
output_var is now a property, so it can be used in the loading functions. (save still stores the entire original dataset, X and y concatenated, so this was needed to always be able to tell which column is the target.)
fit now requires X to be a pd.DataFrame and y to be a pd.Series.
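The adaptations listed above can be sketched roughly as follows. The class name, the fit_data attribute, the split_saved helper, and the target name "y" are assumptions made for illustration; they are not taken from the PR's code.

```python
import pandas as pd


class SklearnStyleSketch:
    """Hypothetical model showing the X, y conventions described above."""

    @property
    def output_var(self) -> str:
        # Assumed target column name; a real model would define its own.
        return "y"

    def fit(self, X: pd.DataFrame, y: pd.Series) -> "SklearnStyleSketch":
        if not isinstance(X, pd.DataFrame):
            raise TypeError("X must be a pd.DataFrame")
        if not isinstance(y, pd.Series):
            raise TypeError("y must be a pd.Series")
        # Store X and y concatenated, with the target under output_var.
        self.fit_data = pd.concat([X, y.rename(self.output_var)], axis=1)
        return self

    def split_saved(self):
        # On load, output_var tells us which stored column is the target.
        X = self.fit_data.drop(columns=[self.output_var])
        y = self.fit_data[self.output_var]
        return X, y


X = pd.DataFrame({"a": [1.0, 2.0, 3.0]})
y = pd.Series([0.1, 0.2, 0.3])
sketch = SklearnStyleSketch().fit(X, y)
X_back, y_back = sketch.split_saved()
```

Since output_var is a property, the loading path can recover the X, y split from the single saved table without any extra metadata.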