
dowhy replication issue and output discrepancy with statsmodels #1227

Closed
dododobetter opened this issue Jul 16, 2024 · 2 comments
Labels
question (Further information is requested), stale

Comments

@dododobetter

Ask your question

Hi all, I'm new to the GitHub community and Python. I have two questions regarding the dowhy package:

  1. Output Replicability Issue:
    The estimate from dowhy appears to vary slightly each time I run it. I've tried setting random seeds without success. How can I ensure that the output is consistent and replicable?

  2. Discrepancy with Statsmodels:
    I've noticed significant differences between the treatment effect estimates (ATE, ATT, ATC) obtained from dowhy and those generated by statsmodels. Both methods use the same identification approach (propensity score matching & probit model). Can anyone provide guidance on resolving this discrepancy?
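For context on question 1: the sketch below (plain numpy/scikit-learn, not DoWhy internals, with illustrative synthetic data) shows that 1-nearest-neighbor propensity-score matching is itself deterministic given fixed inputs, which suggests any run-to-run variation comes from something upstream of the matching step (e.g., data reshuffling or simulation-based refuters) rather than the matcher:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def psm_att(X, t, y):
    """ATT via 1-nearest-neighbor matching on an estimated propensity score."""
    ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    # For each treated unit, find the control unit with the closest score
    nn = NearestNeighbors(n_neighbors=1).fit(ps[t == 0].reshape(-1, 1))
    match = nn.kneighbors(ps[t == 1].reshape(-1, 1))[1].ravel()
    return float(np.mean(y[t == 1] - y[t == 0][match]))

rng = np.random.default_rng(0)                        # one seeded generator for all draws
X = rng.normal(size=(500, 3))
t = (X[:, 0] + rng.normal(size=500) > 0).astype(int)  # treatment depends on X[:, 0]
y = 2.0 * t + X[:, 0] + rng.normal(size=500)          # true effect 2.0; X[:, 0] confounds

a1 = psm_att(X, t, y)
a2 = psm_att(X, t, y)   # identical inputs, no hidden random state
print(a1 == a2)         # True: the matching step itself is deterministic
```

If two calls on the same frozen data disagree, the randomness is entering before or after the matching, not inside it.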

Expected behavior

  1. DoWhy output should be replicable (i.e., exactly the same value on every run)
  2. The output should closely match statsmodels/Stata/R results.

Version information:

  • DoWhy version: 0.11.1

Additional context

My code is below for reference:

# Imports (not shown in the original snippet). The `res_st` module path is an
# assumption: cataneo2.csv ships with statsmodels' treatment-effects test results.
import os

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from statsmodels.discrete.discrete_model import Probit
from statsmodels.regression.linear_model import OLS
from statsmodels.treatment.treatment_effects import TreatmentEffect
from statsmodels.treatment.tests.results import results_teffects as res_st
from dowhy import CausalModel

# Dataset loading

cur_dir = os.path.abspath(os.path.dirname(res_st.__file__))
file_name = 'cataneo2.csv'
file_path = os.path.join(cur_dir, file_name)
dta_cat = pd.read_csv(file_path)
methods = ['ra', 'ipw', 'aipw', 'aipw_wls', 'ipw_ra']
methods_st = [
    ("ra", res_st.results_ra),
    ("ipw", res_st.results_ipw),
    ("aipw", res_st.results_aipw),
    ("aipw_wls", res_st.results_aipw_wls),
    ("ipw_ra", res_st.results_ipwra),
]
pd.set_option('display.width', 500)
dta_cat.head()

# Statsmodels approach

# Treatment selection model: probit model
formula = 'mbsmoke_ ~ mmarried_ + mage + mage2 + fbaby_ + medu'
res_probit = Probit.from_formula(formula, dta_cat).fit()  # Estimate the probability of smoking

# Outcome model: OLS model
formula_outcome = 'bweight ~ prenatal1_ + mmarried_ + mage + fbaby_'
mod = OLS.from_formula(formula_outcome, dta_cat)

# Treatment indicator variable
tind = np.asarray(dta_cat['mbsmoke_'])  # Converts the treatment indicator variable (mbsmoke_) from the DataFrame to a NumPy array.
teff = TreatmentEffect(mod, tind, results_select=res_probit)

res = teff.ipw()  # Compute POM and ATE using inverse probability weighting
print("Results from Statsmodels (ATE):", res)

print("Results from Statsmodels (ATT):", teff.ipw(effect_group=1))  # Average Treatment effect on the Treated
print("Results from Statsmodels (ATC):", teff.ipw(effect_group=0))  # Average Treatment effect on the Controls

# DoWhy approach

np.random.seed(42)

model = CausalModel(
    data=dta_cat,
    treatment='mbsmoke_',
    outcome='bweight',
    common_causes=['mmarried_', 'mage', 'mage2', 'fbaby_', 'medu', 'prenatal1_']
)

identified_estimand = model.identify_effect()
print("Identified Estimand from DoWhy:", identified_estimand)

ATE = model.estimate_effect(
    identified_estimand,
    method_name='backdoor.propensity_score_matching',
    method_params={
        'propensity_score_model': LogisticRegression(),  # note: logistic regression, not the probit used in the statsmodels run
        'matching_algorithm': 'nearest_neighbor',  # intended 1-to-1 nearest-neighbor matching; DoWhy's
        'n_neighbors': 1  # matching estimator may not recognize these two keys and could silently ignore them
    }
)
print("ATE from DoWhy:", ATE.value)

ATT = model.estimate_effect(
    identified_estimand,
    method_name='backdoor.propensity_score_matching',
    method_params={
        'propensity_score_model': LogisticRegression(),
        'matching_algorithm': 'nearest_neighbor',
        'n_neighbors': 1
    },
    target_units='att'  # Focus on treated units
)
print("ATT from DoWhy:", ATT.value)

ATC = model.estimate_effect(
    identified_estimand,
    method_name='backdoor.propensity_score_matching',
    method_params={
        'propensity_score_model': LogisticRegression(),
        'matching_algorithm': 'nearest_neighbor',
        'n_neighbors': 1
    },
    target_units='atc'  # Focus on untreated units
)
print("ATC from DoWhy:", ATC.value)

refutation = model.refute_estimate(
    identified_estimand,
    ATE,
    method_name='placebo_treatment_refuter'
)
print("Refutation result from DoWhy:", refutation)
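For question 2, one way to arbitrate between the two libraries is to compute an inverse-probability-weighted ATE by hand on data with a known true effect, then point both libraries at the same data. A minimal Horvitz-Thompson IPW sketch (synthetic data; all names here are illustrative, not from the issue):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data with a known treatment effect of 1.5
rng = np.random.default_rng(42)
X = rng.normal(size=(2000, 2))
t = (0.7 * X[:, 0] + rng.normal(size=2000) > 0).astype(int)  # confounded assignment
y = 1.5 * t + X[:, 0] + rng.normal(size=2000)

# Inverse-probability-weighted ATE (Horvitz-Thompson form)
ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
ate_ipw = float(np.mean(t * y / ps - (1 - t) * y / (1 - ps)))

naive = float(y[t == 1].mean() - y[t == 0].mean())  # biased upward by the confounder
print(f"naive: {naive:.2f}  ipw: {ate_ipw:.2f}")
```

Note also that matching (what DoWhy's `backdoor.propensity_score_matching` does) and weighting (what `teff.ipw()` does) are different estimators with different finite-sample behavior, so some gap between their point estimates is expected even with identical propensity scores.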
@dododobetter added the question label on Jul 16, 2024

This issue is stale because it has been open for 14 days with no activity.

@github-actions bot added the stale label on Jul 31, 2024

github-actions bot commented Aug 7, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale.

@github-actions bot closed this as not planned (stale) on Aug 7, 2024