use numpy.errstate in transform_event_rate_to_woe #353

@josp70

The log computation in transform_event_rate_to_woe, shown below, can emit RuntimeWarning: invalid value encountered in log

np.log((1. / event_rate - 1) * n_event / n_nonevent)

This normally happens when probabilities are outside the range (0, 1); for example, an event rate above 1 makes the argument of the logarithm negative. As this is acceptable, we propose wrapping the computation in NumPy's context manager for floating-point error handling:

def transform_event_rate_to_woe(event_rate, n_nonevent, n_event):
    """Transform event rate to WoE.

    Parameters
    ----------
    event_rate : array-like or float
        Event rate.

    n_nonevent : int
        Total number of non-events.

    n_event : int
        Total number of events.

    Returns
    -------
    woe : numpy.ndarray or float
        Weight of evidence.
    """
    with np.errstate(invalid='ignore'):
        return np.log((1. / event_rate - 1) * n_event / n_nonevent)
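The effect of the proposed change can be checked without optbinning. The following is a minimal sketch, not the library's code: the helper names woe_plain and woe_quiet, the event rates, and the counts are hypothetical and chosen only for illustration. It counts the RuntimeWarnings raised with and without np.errstate and confirms the returned values are the same.

```python
import warnings

import numpy as np


def woe_plain(event_rate, n_nonevent, n_event):
    # Hypothetical helper: the original expression, no error-state handling.
    return np.log((1. / event_rate - 1) * n_event / n_nonevent)


def woe_quiet(event_rate, n_nonevent, n_event):
    # Hypothetical helper: same expression, wrapped as proposed in this issue.
    with np.errstate(invalid='ignore'):
        return np.log((1. / event_rate - 1) * n_event / n_nonevent)


def count_runtime_warnings(func, *args):
    # Run func and return (result, number of RuntimeWarnings it emitted).
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(*args)
    n = sum(issubclass(w.category, RuntimeWarning) for w in caught)
    return result, n


# Illustrative inputs: the last rate lies outside (0, 1), so the
# argument of the logarithm is negative and the result is NaN.
rates = np.array([0.25, 0.5, 1.2])

plain, n_plain = count_runtime_warnings(woe_plain, rates, 100, 50)
quiet, n_quiet = count_runtime_warnings(woe_quiet, rates, 100, 50)

print("warnings without errstate:", n_plain)
print("warnings with errstate:   ", n_quiet)
print("same values:", np.allclose(plain, quiet, equal_nan=True))
```

Note that invalid='ignore' only silences the "invalid value" category. The out-of-range entry still comes back as NaN, and other floating-point conditions, such as a division by zero when the event rate is exactly 0, keep their default handling.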

The code below reproduces the warning using the titanic.csv dataset:

import pandas as pd
from optbinning import BinningProcess

data = pd.read_csv("titanic.csv")

error_in = ['Cabin']
X = data[error_in]
y = data["Survived"]

binner = BinningProcess(variable_names=error_in, verbose=True)

X_tr = binner.fit_transform(X, y, check_input=True)

print(X_tr)

print(binner._binned_variables["Cabin"].binning_table.build())
