model_builder adaptations for full sklearn compat #189

michaelraczycki · 2023-06-06T15:09:44Z

Adaptations for full sklearn compatibility:
All functions now expect data to be passed in the X, y format familiar from sklearn.
output_var is now a property, so it can be used in the loading functions. (save still stores the entire original dataset, with X and y concatenated, so this was needed to always be able to tell which column is the target column.)
fit now requires X to be a pd.DataFrame and y to be a pd.Series.
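
For illustration, a minimal sketch of the X, y contract described above. `SketchModel`, its default target name and `split_saved_data` are hypothetical and are not the actual `ModelBuilder` code; the sketch only shows why a dataset stored as one concatenated frame needs `output_var` to recover the target column.

```python
import pandas as pd


class SketchModel:
    """Hypothetical stand-in for a model class; not the real ModelBuilder."""

    def __init__(self):
        self._output_var = "y"  # assumed target column name
        self.data = None

    @property
    def output_var(self) -> str:
        # Read-only name of the target column in the stored dataset.
        return self._output_var

    def fit(self, X: pd.DataFrame, y: pd.Series) -> "SketchModel":
        # Enforce the stricter input types described in this PR.
        if not isinstance(X, pd.DataFrame):
            raise TypeError("X must be a pd.DataFrame")
        if not isinstance(y, pd.Series):
            raise TypeError("y must be a pd.Series")
        # Keep the original dataset as a single frame: X and y concatenated.
        self.data = pd.concat([X, y.rename(self.output_var)], axis=1)
        return self

    def split_saved_data(self):
        # output_var tells us which column of the stored frame is the target.
        X = self.data.drop(columns=[self.output_var])
        y = self.data[self.output_var]
        return X, y
```

With this shape, anything that reloads the stored frame can split it back into X and y without guessing which column was the target.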

michaelraczycki · 2023-06-06T15:12:43Z

Also, to simplify the code, model_config and sampler_config can now be passed only at class initialization. The other approaches, where they could also be passed to build_model, required constant checks and polluted the code without clear benefits.
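
A rough sketch of the init-only configuration pattern, to make the trade-off concrete. `ConfiguredModel`, the default values and the fallback-to-defaults behavior are assumptions for illustration, not the code in this PR.

```python
from typing import Any, Dict, Optional


class ConfiguredModel:
    """Hypothetical class showing configs accepted only in __init__."""

    # Assumed defaults, used when the caller passes nothing.
    default_model_config: Dict[str, Any] = {"prior_sigma": 1.0}
    default_sampler_config: Dict[str, Any] = {"draws": 1000, "chains": 4}

    def __init__(
        self,
        model_config: Optional[Dict[str, Any]] = None,
        sampler_config: Optional[Dict[str, Any]] = None,
    ):
        # Resolve both configs once, here, so no later method has to check
        # whether a config was passed to it.
        self.model_config = dict(
            self.default_model_config if model_config is None else model_config
        )
        self.sampler_config = dict(
            self.default_sampler_config if sampler_config is None else sampler_config
        )

    def build_model(self, X, y):
        # No config parameters: build_model only reads self.model_config.
        return {"n_rows": len(X), "prior_sigma": self.model_config["prior_sigma"]}
```

Because the configs are fixed at construction, methods such as `build_model` need no "was a config passed here?" branches.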

pymc_experimental/preprocessing/standard_scaler.py

adapting linearmodel and it's tests to use sklearn-only approach

twiecki · 2023-06-07T14:04:57Z

pymc_experimental/model_builder.py

-        X: Union[np.ndarray, pd.DataFrame, pd.Series],
-        y: Union[np.ndarray, pd.Series],
+        X: pd.DataFrame,
+        y: pd.Series,


Can we make y optional?

Sure. It will introduce some complications in child classes, since we would then technically allow building with y being None, and DelayedSaturatedMMM is built assuming that it is there. In general it will probably add quite a few ifs to the code, but it shouldn't be a big deal.

The issue is that if we don't do that, it will limit the models we can fit with this, as not all of them actually have an X and a y. For example, the change-point model only has a single time series.
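
One way the optional y could look, as a hedged sketch only. `combine_data` and `require_target` are hypothetical helpers, not functions from this PR; they show a single branch for models without a target and an explicit check for child classes that still need one.

```python
from typing import Optional

import pandas as pd


def combine_data(
    X: pd.DataFrame, y: Optional[pd.Series] = None, output_var: str = "y"
) -> pd.DataFrame:
    """Return the dataset to store: X alone, or X with y appended as output_var."""
    if y is None:
        # Models with a single series (e.g. a change-point model) have no target.
        return X.copy()
    return pd.concat([X, y.rename(output_var)], axis=1)


def require_target(y: Optional[pd.Series]) -> pd.Series:
    """Check for child classes that are built assuming y is present."""
    if y is None:
        raise ValueError("this model requires y")
    return y
```

A child class that depends on a target could call `require_target(y)` at the start of its build step, which keeps the extra ifs in one place.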

michaelraczycki requested review from twiecki and ricardoV94 June 6, 2023 15:09

twiecki reviewed Jun 7, 2023

pymc_experimental/preprocessing/standard_scaler.py

michaelraczycki added 3 commits June 7, 2023 15:55

model_builder adaptations for full sklearn compatibility

028027a

adapting linearmodel and it's tests to use sklearn-only approach

replacing standard_scaler_df with sklearn_config

c44772d

adapting linearmodel and it's tests to use sklearn-only approach

2cfa0cb

michaelraczycki force-pushed the sklearn_adaptations branch from 08a75fc to 2cfa0cb Compare June 7, 2023 13:57

michaelraczycki requested a review from twiecki June 7, 2023 13:57

twiecki reviewed Jun 7, 2023

making y optional, adding fitted model fixture to make tests more eff…

900a764

…icient

michaelraczycki requested a review from twiecki June 8, 2023 08:58

michaelraczycki merged commit a7c4d79 into pymc-devs:main Jun 8, 2023

michaelraczycki deleted the sklearn_adaptations branch July 5, 2023 09:51
