This repository has been archived by the owner on Jun 22, 2022. It is now read-only.

Stacking by feature diversity and model diversity

Jakub edited this page Aug 30, 2018 · 13 revisions

Four_leaf_clover 🍀

🍀 code

We added model and feature diversity and used stacking to combine the results.

Validation

Preprocessing

Application data

Bureau data

Credit Card data

Feature Extraction

We refactored the feature engineering so that it extracts all the features from train/valid/test in a single pass; the features are then split back into the separate sets by index (idx).
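The extract-once-then-split idea can be sketched roughly as follows. This is a minimal illustration with pandas, not the project's actual code: the function names `extract_features` and `extract_and_split`, the toy ratio feature, and the column names `SK_ID_CURR`, `AMT_CREDIT`, and `AMT_INCOME_TOTAL` are all assumptions made for the example.

```python
import pandas as pd


def extract_features(table: pd.DataFrame) -> pd.DataFrame:
    """Toy feature step (hypothetical): one ratio feature per row."""
    features = pd.DataFrame(index=table.index)
    features["credit_income_ratio"] = table["AMT_CREDIT"] / table["AMT_INCOME_TOTAL"]
    return features


def extract_and_split(train, valid, test, id_column="SK_ID_CURR"):
    """Run feature extraction once on all rows, then split the result by id."""
    combined = pd.concat([train, valid, test], ignore_index=True).set_index(id_column)
    features = extract_features(combined)
    # Look up each subset's rows by its ids (the "idx" used for the split).
    return tuple(features.loc[part[id_column].tolist()] for part in (train, valid, test))


# Tiny made-up tables standing in for the real train/valid/test data.
train = pd.DataFrame({"SK_ID_CURR": [1, 2], "AMT_CREDIT": [100.0, 300.0], "AMT_INCOME_TOTAL": [50.0, 100.0]})
valid = pd.DataFrame({"SK_ID_CURR": [3], "AMT_CREDIT": [200.0], "AMT_INCOME_TOTAL": [100.0]})
test = pd.DataFrame({"SK_ID_CURR": [4], "AMT_CREDIT": [90.0], "AMT_INCOME_TOTAL": [30.0]})

train_features, valid_features, test_features = extract_and_split(train, valid, test)
```

Computing features on the combined table keeps the extraction logic in one place, but any feature that uses the target must still be fitted on training rows only to avoid leakage.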

Application data -> eda-application.ipynb

Installment Payments data -> eda-installments.ipynb

POS Cash Balance application data -> eda-pos_cash_balance.ipynb

Model

Then we used stacking on all the out-of-fold predictions we had.
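For readers unfamiliar with the technique, here is a rough sketch of stacking on out-of-fold predictions using scikit-learn. It is an illustration under assumptions, not the pipeline used in this solution: the synthetic data, the two first-level models, the fold count, and the logistic-regression second-level model are all chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in data; the real features are not reproduced here.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# Two different first-level model types (an illustrative take on model diversity).
base_models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=0),
]

# Fixed random_state so every base model is evaluated on the same folds.
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predictions: each row is scored by a model that did not train on it.
oof_predictions = np.column_stack([
    cross_val_predict(model, X, y, cv=folds, method="predict_proba")[:, 1]
    for model in base_models
])

# Second-level model trained on the out-of-fold predictions as its features.
stacker = LogisticRegression()
stacker.fit(oof_predictions, y)
stacked_scores = stacker.predict_proba(oof_predictions)[:, 1]
```

Here `stacked_scores` are in-sample scores for the second-level model; an honest estimate of stacking performance would need another round of cross-validation or a held-out set.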

Pipeline diagram

Since the diagram below is quite wide (it uses multiple input files), here is a link to the larger version.

[HC-solution-6 (larger version)](https://gist.githubusercontent.com/jakubczakon/cac72983726a970690ba7c33708e100b/raw/38119684e5ea88178df084b5668aa1ccd8226408/home_credit_solution_6_pipeline.png)