yet_another_outline.md

Outline

Intro

LM

  • Core ideas
  • Prediction
  • Basic Interpretation
  • Adding Complexity
  • Assumptions of a linear (and other) model(s)

Question: how to fit classification here without spilling into other topics?

Model Criticism

  • General fit
  • Model Comparison
  • Model Selection
  • Model debugging

Misc:

  • Model transparency (e.g. model cards)
  • Model fairness

Model Estimation

  • OLS, MLE
  • Classification
  • Penalized
  • Optimization, SGD
  • Bayesian

LM Extensions

  • GLM, GAM, Mixed Models
  • Optim/Linear Programming/Hungarian Algorithm?
  • Latent Linear Models
  • Mixture models/Clustering

Machine learning

  • CV, metrics
  • Lasso, Ridge, Elastic Net
  • Trees
    • RF
    • GBM
  • DL
    • NN
    • Autoencoders
  • Reinforcement learning
  • Ensemble models

Uncertainty

  • Bayesian inference
  • Bootstrap
  • Conformal Predictions

Data Stuff

  • Feature and Target Transformations
  • Missing data
    • Recommend assessing how closely predicted (imputed) values resemble the observed data
  • Data quality and reliability/measurement
  • Sparsity
  • Outliers
  • Imbalanced data
  • 'Big' Data, Scalability
  • Data types
    • Categorical
    • Ordinal
    • Continuous
    • Time series
    • Text
    • Images
    • Audio
    • Video
    • Geospatial
    • etc.
  • Feature Engineering/Pre-processing/Categorical Embeddings/Dimensionality Reduction/Feature Selection/Feature Extraction
  • Misc feature types: ordinal, zero-inflated, etc.
  • Transformations: std, log, max
  • Data leakage
  • Data drift
  • Data bias (lack of representativeness) vs. statistical bias
  • Misc:
    • Data privacy, security, ethics
    • Data provenance, governance

Other Stuff

  • Causality
    • Causal inference
    • Techniques: experimental design, matching, meta-learners, uplift modeling, etc.