AUC below 0.5 #60

Open
MLecardonnel opened this issue Jun 7, 2024 · 0 comments

MLecardonnel (Collaborator) commented Jun 7, 2024

When using eurybia on small dataframes, the computed AUC can fall below 0.5, even when the same dataframe is passed as both baseline and current. This is caused by the train/test split being applied to the concatenated data.
One solution could be to apply the train/test split with the same seed to the baseline and current dataframes separately, before concatenating them.
Another quick solution could be to duplicate the data so that there are enough rows for a balanced train/test split.
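
To make the first suggestion concrete, here is a minimal, dependency-free sketch of "split each dataset with the same seed, then concatenate". It is not eurybia's implementation: the helper names `seeded_split` and `split_then_concat`, the `test_size=0.25` default, and the use of plain tuples instead of dataframes are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def seeded_split(rows, test_size=0.25, seed=0):
    """Shuffle row positions with a fixed seed and split into (train, test).

    Hypothetical helper: the same seed and the same number of rows always
    select the same positions for the test set.
    """
    positions = list(range(len(rows)))
    random.Random(seed).shuffle(positions)
    n_test = max(1, round(len(rows) * test_size))
    test_pos = set(positions[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_pos]
    test = [row for i, row in enumerate(rows) if i in test_pos]
    return train, test


def split_then_concat(baseline, current, test_size=0.25, seed=0):
    """Split each dataset separately with the same seed, then label and concatenate.

    Label 0 marks baseline rows, label 1 marks current rows.
    """
    base_train, base_test = seeded_split(baseline, test_size, seed)
    cur_train, cur_test = seeded_split(current, test_size, seed)
    train = [(row, 0) for row in base_train] + [(row, 1) for row in cur_train]
    test = [(row, 0) for row in base_test] + [(row, 1) for row in cur_test]
    return train, test


# Same rows as the dataframe in the reproduction case, as (A, B) tuples.
rows = [(0, 1), (0, 1), (0, 1), (0, 2), (0, 2), (0, 2), (0, 2)]
train, test = split_then_concat(rows, rows)

# With identical inputs, every distinct row should appear equally often
# under both labels in the train set and in the test set.
train_counts = Counter(train)
test_counts = Counter(test)
train_balanced = all(train_counts[(r, 0)] == train_counts[(r, 1)] for r, _ in train)
test_balanced = all(test_counts[(r, 0)] == test_counts[(r, 1)] for r, _ in test)
print(train_balanced, test_balanced)
```

When baseline and current are identical, both sides of the split then contain each row once per label, so a classifier has no signal to separate them in either direction. If the two datasets differ in length, the same seed no longer selects the same positions, but each dataset is still split in the same proportion.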

To reproduce:

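from eurybia import SmartDrift
import pandas as pd

# Small dataframe: 7 rows, constant column A, two distinct values in column B.
df = pd.DataFrame([[0,1],[0,1],[0,1],[0,2],[0,2],[0,2],[0,2]], columns=["A","B"])

# Pass the same dataframe as both current and baseline, so no drift is expected.
sd = SmartDrift(
    df_current=df,
    df_baseline=df,
)

sd.compile()

# Generate the report to inspect the result.
sd.generate_report(
    output_file="auc_test.html",
    title_story="AUC Test",
)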