This is my personal website that contains notes on mathematics, science and engineering for data scientists.
Install Ruby.
Install Ruby gems:
bundle install
Run website locally:
bundle exec jekyll serve
- Logic
- Logical statements
- Necessity, sufficiency and equivalence
- Negation, conjunction and disjunction
- Sets
- Membership, equality and subsets
- Predicates, quantifiers and specification
- Set operations
- Cardinality, finiteness and countability
- Functions
- Functions and composition
- Injection, surjection and bijection
- Images and inverses
- Limits
- Absolute value, distance and closeness
- Sequences
- Boundedness and monotonicity
- Convergence and limits
- Derivatives
- Continuity and differentiability
- Univariate differential calculus
- Multivariate differential calculus
- Integrals
- Partitions and integration
- Fundamental theorems of calculus
- Univariate integral calculus
- Multivariate integral calculus
- Topological spaces
- Topologies and neighbourhoods
- Relative topologies and subspaces
- Closure, continuity and compactness
- Metric spaces
- Metrics
- Balls and points
- Closure, open and closed sets
- Boundedness and continuity
- Connectedness and equivalence
- Vectors
- Vector spaces
- Linear independence
- Span, basis and dimension
- Inner product spaces
- Positive definiteness
- Length and angles
- Orthogonality
- Matrices
- Matrix algebra
- Determinants
- Invertibility
- Special matrices
- Eigenvalues and eigenvectors
- Eigen-decomposition
- Characteristic polynomial
- Diagonalisation
- Matrix decompositions
- Cholesky decomposition
- QR decomposition
- Singular value decomposition
- Matrix applications
- Systems of linear equations
- Projection
- Linear transformations
- Probability spaces
- Sample spaces and events
- Sigma algebras
- Probability measures
- Probability axioms
- Probability forms
- Marginal probability
- Joint probability
- Conditional probability
- Random variables
- Pushforward probability measure
- Support
- Discrete random variables
- Continuous random variables
- Independent random variables
- Transformations of random variables
- Moments
- Expectation
- Variance
- Higher-order moments
- Covariance and correlation
- Probability generating functions
- Moment generating functions
- Characteristic functions
- Distributions
- Discrete uniform
- Bernoulli
- Binomial
- Geometric
- Poisson
- Continuous uniform
- Exponential
- Gamma
- Beta
- Chi-squared
- Normal
- Dirichlet
- Multivariate distributions
- Concentration bounds
- Univariate bounds
- Bounds of expectations
- Bounds of sums
- Bounds of functions
- Probabilistic convergence
- Pointwise and uniform convergence
- Convergence in distribution
- Convergence in probability
- Almost-sure convergence
- Delta method
- Limit theorems
- Laws of large numbers
- Central limit theorem
- Order statistics
- Stochastic processes
- Covariance and correlation
- Stationarity
- Gaussian processes
- Wiener process
- Renewal processes
- Markov processes
- Samples
- Summary statistics
- Sufficiency
- Estimators
- Bias
- Consistency
- Efficiency
- Point estimation
- Likelihood function
- Method of least squares
- Method of moments
- Maximum likelihood estimation
- Interval estimation
- Interval estimators
- Coverage probability
- Confidence intervals
- Hypothesis testing
- Null and alternative hypotheses
- Test statistic and critical value
- Significance level and p-values
- Power and sample size
- Types of hypothesis tests
- Simple and composite hypotheses
- Multiple testing
- Convexity
- Convex sets and functions
- Determination of convexity
- Implications of convexity
- Unconstrained optimisation
- Gradient descent methods
- Newton's method
- Constrained optimisation
- Substitution method
- Lagrange multipliers
- Simplex method
- Interior point methods
- Linear models
- Linear regression
- Logistic regression
- Multinomial regression
- Generalised linear models
- Non-linear models
- Polynomial regression
- Spline regression
- Fourier and wavelet bases
- Generalised additive models
- Kernel models
- Maximal margin classifier
- Support vector classifier
- Support vector machines
- Tree models and ensembling
- Classification and regression trees
- Bagging
- Random forests
- Boosting
- Voting and stacked ensembles
- Resampling methods
- Cross-validation
- Bootstrap methods
- Feature selection
- Exhaustive search
- Subset selection
- Recursive feature elimination
- Regularisation penalties
- Model selection
- Adjusted R-squared
- Mallows's Cp
- AIC and BIC
- Loss functions
- Validation metrics
- Dimension reduction
- Principal components analysis
- Canonical correlation analysis
- Factor analysis
- Independent components analysis
- Manifold learning
- Multi-dimensional scaling
- Isometric feature mapping
- Locally linear embedding
- Stochastic neighbour embedding and t-SNE
- Spectral embeddings
- Density estimation
- Gaussian mixture models
- Expectation-maximisation algorithm
- Histogram estimators
- Kernel density estimators
- Clustering
- K-means clustering
- K-medoids and PAM
- Affinity propagation
- Spectral clustering
- Agglomerative hierarchical clustering
- DBSCAN
- Biclustering
- Cluster evaluation
- Novelty and outlier detection
- One-class support vector machine
- Elliptic envelope
- Isolation forest
- Local outlier factor
- Association rule learning
- Apriori algorithm
- ECLAT algorithm
- FP-growth algorithm
- Covariance estimation
- Empirical covariance
- Shrinkage methods
- Minimum covariance determinant
- Subjective probability
- Subjective uncertainty
- Standard events
- Conditional probability
- Decisions
- Utility
- Estimation and prediction
- Prior and likelihood representation
- Exchangeability
- De Finetti's representation theorem
- Priors
- Asymptotics
- Parametric modelling
- Conjugate models
- Exponential families
- Non-conjugate families
- Posterior summaries
- Computational inference
- Intractable integrals
- Monte Carlo estimation
- Markov chain Monte Carlo (MCMC)
- Hamiltonian Monte Carlo
- Analytic approximations
- Model choice
- Model uncertainty
- Model averaging
- Model selection
- Posterior predictive checking
- Linear models
- Conjugate prior
- Reference prior
- General basis functions
- Generalised linear models
- Non-parametric models
- Random probability measures
- Dirichlet processes
- Pólya trees
- Partition models
- Gaussian processes
- Spline models
- Partition regression models
- Mixture models
- Finite mixture models
- Dirichlet process mixture models
- Mixed-membership models
- Latent factor models
- Graphical models
- Belief networks
- Markov networks
- Factor graphs
- Activation functions
- Linear
- Sigmoid
- ReLU
- ELU
- Softmax
- Loss functions
- Mean squared error
- Cross-entropy loss
- Cosine similarity
- KL divergence
- Optimisers
- Stochastic gradient descent
- Momentum
- Nesterov momentum
- Adagrad
- RMSprop
- Adam
- Initialisers
- Glorot/Xavier initialisation
- Orthogonal initialisation
- Regularisation
- Weight sharing
- Dropout
- Weight regularisation
- Early stopping/patience
- Normalisation
- Batch normalisation
- Layer normalisation
- Feed-forward networks
- Neuron unit
- Multi-layer perceptron
- Convolutional networks
- Convolution operation
- Pooling
- Padding and stride
- Transposed convolutions
- Recurrent networks
- RNN cell
- Stacked and bi-directional RNNs
- Back-propagation through time
- Long short-term memory cell
- Transformer networks
- Multi-head attention
- Positional encoding
- Transformer architecture
- Normalising flows
- Bijectors
- Autoregressive flows
- Masked autoregressive flow (MAF)
- Masked autoencoder for distribution estimation (MADE)
- Inverse autoregressive flow (IAF)
- Non-linear independent components estimation (NICE)
- RealNVP model
- Glow model
- Auto-encoder networks
- Bottleneck architecture
- Evidence lower bound (ELBO)
- Reparameterisation trick
- Variational auto-encoder
- Bayesian networks
- Aleatoric and epistemic uncertainty
- Bayes by backpropagation
- MC dropout
- Uncertainty estimation
- Multi-armed bandits
- Optimal arm identification
- Regret minimisation
- Optimistic approaches
- Thompson sampling
- Contextual bandits
- Bayesian optimisation
- Reinforcement learning
- Markov decision processes
- Dynamic programming
- Q-learning
- Importance sampling
- Linear function approximation
- Deep Q-learning
- Policy gradient methods
- Data types
- Data summaries
- Data transformations
- Data quality
- Data visualisation
- Temporal data
- Spatial data
- Multivariate data
- Networks
- Vertices and edges
- Adjacency and direction
- Neighbourhoods
- Paths and cycles
- Cliques and separation
- Graph data structures
- Graph embeddings
- Search algorithms
- Pathfinding algorithms
- Minimum weight spanning tree
- Community detection
- Graph classification
- Images
- Image histograms
- Affine transformations
- Denoising
- Edge detection
- Feature detection
- Image segmentation
- Pose estimation
- Motion estimation
- Stereo correspondence
- Object recognition
- Language
- Word classification
- Word embeddings
- Text clustering
- Sequence modelling
- Optimisation
- Gradient descent methods
- Distributed, stochastic gradient descent
- Stochastic variational inference
- Markov chain Monte Carlo
- Divide and conquer methods
- Subsampling methods
- Streaming
- Parameter estimation
- Forgetting factors
- Change point detection