AUC below 0.5 #60

MLecardonnel · 2024-06-07T14:38:14Z

When using eurybia on small dataframes, the computed AUC can fall below 0.5, even when the baseline and current dataframes are identical. The cause is the train/test split applied to the concatenated data.
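To illustrate how a split on concatenated data can push the AUC below 0.5, here is a minimal sketch. It does not use eurybia's internals: the split is hand-picked to be unlucky, and the "classifier" is a simple per-value frequency estimate standing in for the real model. Both are assumptions made only to expose the mechanism.

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

# Concatenated data: the same 7 rows, labelled once as baseline (0) and once as current (1).
b_values = [1, 1, 1, 2, 2, 2, 2]
data = pd.DataFrame({"B": b_values * 2, "target": [0] * 7 + [1] * 7})

# Hand-picked unlucky split (an assumption, not what eurybia draws):
# the test set holds two "current" rows with B=1 and two "baseline" rows with B=2.
test_idx = [7, 8, 3, 4]
test = data.loc[test_idx]
train = data.drop(index=test_idx)

# Stand-in classifier: score = share of "current" rows per B value in the train set.
rate = train.groupby("B")["target"].mean()
scores = test["B"].map(rate)

auc = roc_auc_score(test["target"], scores)
print(auc)  # 0.0
```

Because the rows moved to the test set are missing from the train set, the train set is skewed in the opposite direction for each value of `B`, so the scores end up anti-correlated with the test labels.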
One solution could be to apply the train/test split separately to the baseline and current dataframes, using the same seed, before concatenating them.
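A rough sketch of that idea, using scikit-learn's `train_test_split`. The helper name `split_then_concat`, the `target` column name, and the default `test_size` are hypothetical and not part of eurybia.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_then_concat(df_baseline, df_current, test_size=0.25, seed=0):
    """Split each dataframe separately with the same seed, then concatenate.

    Hypothetical helper for illustration; it is not part of eurybia.
    """
    # Label rows by origin: 0 = baseline, 1 = current.
    baseline = df_baseline.assign(target=0)
    current = df_current.assign(target=1)

    # Same test_size and random_state for both splits.
    base_train, base_test = train_test_split(
        baseline, test_size=test_size, random_state=seed
    )
    cur_train, cur_test = train_test_split(
        current, test_size=test_size, random_state=seed
    )

    train = pd.concat([base_train, cur_train], ignore_index=True)
    test = pd.concat([base_test, cur_test], ignore_index=True)
    return train, test


df = pd.DataFrame(
    [[0, 1], [0, 1], [0, 1], [0, 2], [0, 2], [0, 2], [0, 2]], columns=["A", "B"]
)
train, test = split_then_concat(df, df)
```

When both dataframes are identical, the same rows land in the test set for each class, so any model that scores identical feature rows identically gets an AUC of exactly 0.5 on that test set.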
Another quick solution could be to duplicate the rows so that there is enough data for a balanced train/test split.
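The duplication workaround could look something like this. The helper `duplicate_rows` and the `min_rows=100` threshold are arbitrary assumptions for illustration, and I have not checked how much duplication eurybia would actually need.

```python
import pandas as pd


def duplicate_rows(df, min_rows=100):
    """Repeat the whole dataframe until it has at least min_rows rows.

    Hypothetical helper; min_rows is an arbitrary threshold, not a eurybia setting.
    """
    repeats = -(-min_rows // len(df))  # ceiling division
    return pd.concat([df] * repeats, ignore_index=True)


df = pd.DataFrame(
    [[0, 1], [0, 1], [0, 1], [0, 2], [0, 2], [0, 2], [0, 2]], columns=["A", "B"]
)
df_big = duplicate_rows(df)
print(len(df_big))  # 105
```

Duplicating rows adds no new information. It only makes the class and feature proportions in the random split less noisy.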

To reproduce:

from eurybia import SmartDrift
import pandas as pd

df = pd.DataFrame([[0,1],[0,1],[0,1],[0,2],[0,2],[0,2],[0,2]], columns=["A","B"])

sd = SmartDrift(
    df_current=df,
    df_baseline=df,
)

sd.compile()

sd.generate_report(
    output_file="auc_test.html",
    title_story="AUC Test",
)
