Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

model_builder adaptations for full sklearn compat #189

Merged
merged 4 commits into from
Jun 8, 2023

Conversation

michaelraczycki
Copy link
Collaborator

Adaptations for full sklearn compatibility:
All functions expect data to be passed around in X,y format known from sklearn
output_var now is a property, which allows to use it in loading functions (still save saves entire original dataset - concatenated X and y, so this was needed to be able to always tell which column is a target column)
fit now requires X to be pd.DataFrame and y to be pd.Series

@michaelraczycki
Copy link
Collaborator Author

also in order to simplify the code, now the model_config and sampler_config can be passed only at the class initialization, other approaches where they could be passed to build model required constant checks and was polluting the code, without clear benefits.

X: Union[np.ndarray, pd.DataFrame, pd.Series],
y: Union[np.ndarray, pd.Series],
X: pd.DataFrame,
y: pd.Series,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make y be optional.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, it will introduce some complications in child classes, as then we're technically allowing to build with y being None, and DelayedSaturatedMMM is build assuming that it's there. In general it will add probably quite few if's in the code but it shouldn't be a big deal

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that if we don't do that it will limit the models we can fit this on, as not all actually have an X and a y, for example, the change-point model only has a single time-series.

@michaelraczycki michaelraczycki requested a review from twiecki June 8, 2023 08:58
@michaelraczycki michaelraczycki merged commit a7c4d79 into pymc-devs:main Jun 8, 2023
@michaelraczycki michaelraczycki deleted the sklearn_adaptations branch July 5, 2023 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants