The classic binary classification problem, Kaggle's 'hello world' (https://www.kaggle.com/c/titanic).
You can find my step-by-step notes here. The project is split into two standalone Jupyter notebooks.
Description | Link | Notable techniques |
---|---|---|
Examine the data. Build a robust and reproducible preprocessing pipeline with feature transformation and selection. Visualize the data (see the preprocessing sketch below the table). | https://github.com/olszewskip/Titanic/blob/master/preprocess_visualize.ipynb | custom Transformer classes, ColumnTransformer, FeatureUnion, Pipeline; feature-importance tests; PCA, t-SNE |
Fit and tune classifiers using grid search with cross-validation. Report accuracy, the confusion matrix, and the ROC curve. Select the best model based on an independent validation score (see the model-selection sketch below the table). | https://github.com/olszewskip/Titanic/blob/master/classify.ipynb | GridSearchCV, LogisticRegression, NaiveBayes, GradientBoosting, AdaBoost, VotingClassifier |
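
For orientation, here is a minimal sketch (not the notebook's actual code) of the preprocessing approach listed in the first row: a custom transformer class plugged into a `ColumnTransformer` alongside standard scikit-learn pipelines. The column names follow the Titanic dataset, and the `TitleExtractor` helper is a hypothetical illustration.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


class TitleExtractor(BaseEstimator, TransformerMixin):
    """Hypothetical custom transformer: pulls the honorific (Mr, Mrs, ...) out of 'Name'."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        titles = X["Name"].str.extract(r",\s*([^\.]+)\.", expand=False)
        return titles.to_frame("Title")


numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
title = Pipeline([
    ("extract", TitleExtractor()),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare", "SibSp", "Parch"]),
    ("cat", categorical, ["Sex", "Embarked", "Pclass"]),
    ("title", title, ["Name"]),
])

# Usage: fit on the raw training frame, then transform the test frame identically,
# which is what makes the preprocessing reproducible.
# X_train_prepared = preprocess.fit_transform(train_df)
# X_test_prepared = preprocess.transform(test_df)
```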
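
And a minimal sketch, with hypothetical parameter grids, of the model-selection step from the second row: each classifier is tuned with `GridSearchCV` (5-fold cross-validation), the tuned estimators are combined into a soft `VotingClassifier`, and the result is scored on an independent validation split. It assumes `X`, `y` are already-preprocessed training data.

```python
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split


def tune_and_vote(X, y, random_state=42):
    # Independent validation split, used only for the final comparison.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state
    )

    # Hypothetical grids; the notebook tunes its own set of models and parameters.
    searches = {
        "logreg": GridSearchCV(
            LogisticRegression(max_iter=1000),
            param_grid={"C": [0.1, 1.0, 10.0]},
            cv=5, scoring="accuracy",
        ),
        "gboost": GridSearchCV(
            GradientBoostingClassifier(random_state=random_state),
            param_grid={"n_estimators": [100, 300], "max_depth": [2, 3]},
            cv=5, scoring="accuracy",
        ),
    }
    for name, search in searches.items():
        search.fit(X_train, y_train)
        print(f"{name}: CV accuracy {search.best_score_:.3f}, params {search.best_params_}")

    # Soft voting over the tuned estimators, scored on the held-out split.
    voting = VotingClassifier(
        estimators=[(name, s.best_estimator_) for name, s in searches.items()],
        voting="soft",
    )
    voting.fit(X_train, y_train)
    print(f"voting: validation accuracy {voting.score(X_val, y_val):.3f}")
    return voting
```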
Plus a snippet that prepares a Kaggle-submittable CSV: https://github.com/olszewskip/Titanic/blob/master/prepare_submission.ipynb
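
For reference, a minimal sketch of what such a snippet does: Kaggle's Titanic competition expects a two-column CSV with `PassengerId` and the predicted `Survived`. The function name, file paths, and the fitted `model` argument here are assumptions, not the notebook's exact code.

```python
import pandas as pd


def prepare_submission(model, test_csv="test.csv", out_csv="submission.csv"):
    """Write predictions in the two-column format Kaggle expects."""
    test_df = pd.read_csv(test_csv)
    submission = pd.DataFrame({
        "PassengerId": test_df["PassengerId"],
        # Assumes `model` is a fitted pipeline that accepts the raw test frame.
        "Survived": model.predict(test_df).astype(int),
    })
    submission.to_csv(out_csv, index=False)
    return submission
```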