GitHub - rpriam/book1: Book 2023 on Pytorch, Numpy, and SkLearn for deep learning of tabular data

"Linear and Deep Models Basics with Pytorch, Numpy, and Scikit-Learn"

Files for the computer book on deep learning with a statistical background

Amazon KDP paperback - 2023 - ISBN-13: 979-8371441577

Main document (PDF ebook) with 247 pages (27-12-2022), available for direct download: book_pytorch_scikit_learn_numpy.pdf

Available Files in this repository

Datasets, the main .py file, and the .ipynb notebooks are in ./notebooks

Main Features

- Theory of linear models and their implementation with pytorch and scikit-learn
- Practice of deep learning with pytorch for feedforward neural networks
- Many examples and exercises to practice and better understand the contents
- Very large datasets (450,000 and 11,000,000 rows) handled on a home computer with a few gigabytes of memory
- Step-by-step theory and code (requires only minimal knowledge of Python and maths)
- Learn the maths basics without compromise before moving on to advanced models
- Generic Python functions to train and alter deep models for tabular data in a blink
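
To give an idea of what a generic training function for tabular data can look like, here is a minimal sketch. It is not the book's code: the function `train`, its parameters, and the synthetic data are assumptions made for this illustration. Moving from a linear model to a network with one hidden layer only changes the `model` argument.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(model, X, y, loss_fn, n_epochs=50, batch_size=32, lr=1e-2):
    """Hypothetical generic minibatch training loop; returns the mean loss per epoch."""
    loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    history = []
    for _ in range(n_epochs):
        total = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            optimizer.step()
            total += loss.item() * len(xb)
        history.append(total / len(X))
    return history


torch.manual_seed(0)

# Synthetic tabular data (an assumption): the label depends on the sign of
# the product of the first two columns, so no linear frontier separates it.
X = torch.randn(512, 4)
y = (X[:, :1] * X[:, 1:2] > 0).float()  # shape (512, 1)

loss_fn = nn.BCEWithLogitsLoss()

# Logistic regression written as a one-layer network.
linear = nn.Sequential(nn.Linear(4, 1))
hist_linear = train(linear, X, y, loss_fn)

# Same call, with one hidden layer added.
deep = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
hist_deep = train(deep, X, y, loss_fn)

print(f"final loss, linear model: {hist_linear[-1]:.3f}")
print(f"final loss, hidden layer: {hist_deep[-1]:.3f}")
```

On this kind of target the linear model is expected to stay close to chance level, while the hidden layer lets the loss decrease further.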

Abstract

This book is an introduction to computational statistics for generalized linear models (glm) and to machine learning with the Python language. The glm is extended with nonlinearities through the hidden layer(s) of a neural network, for linear and nonlinear regression or classification. This allows classical statistics and current deep learning to be presented side by side. The log-likelihoods and the corresponding loss functions are explained. The gradient and the Hessian matrix are discussed and implemented for these linear and nonlinear models. Several methods are implemented from scratch with numpy for prediction (linear, logistic, and Poisson regression) and for dimension reduction (principal component analysis, random projection). The gradient descent, Newton-Raphson, natural gradient, and L-BFGS algorithms are implemented. The datasets considered have from 10 to 10^7 rows and are tabular, with images or texts vectorized. The data are stored in a compressed format (memmap or hdf5) and loaded by chunks for several case studies with pytorch or scikit-learn. Pytorch is presented for training with minibatches, via a generic implementation meant for study through computer programs. Scikit-learn is presented first on small examples, then for processing large datasets via partial fit. Sixty exercises, with selected solutions, are proposed at the end of the chapters to go beyond the contents.

Chapters

1. Introduction
   - Polynomial regression
   - Error on a training sample
   - Error on a test sample
2. Linear models with numpy and scikit-learn (chapter02_book.ipynb)
   - Theory for linear regression
   - Theory for logistic regression
   - Loglikelihood and loss function
   - Analytical expression of the derivatives
   - Implementation with numpy
   - Implementation with scikit-learn
3. First-order training of linear models (chapter03_book.ipynb)
   - Algorithm with one datum and with one minibatch
   - Implementation of the algorithms with numpy
   - Implementation of the algorithms with pytorch
4. Neural networks for (deep) glm (chapter04_book.ipynb)
   - Presentation of the different loss functions from pytorch
   - Generic implementation of the algorithms with pytorch
   - Example of a nonlinear frontier with a small dataset
5. Lasso selection for (deep) glm (chapter05_book.ipynb)
   - Penalization of the regression for a sparse solution
   - Implementation with pytorch for a neural network
   - Selection of the hyperparameters (grid and Bayesian)
6. Hessian and covariance for (deep) glm (chapter06_book.ipynb)
   - Notion of variance of the parameters
   - Implementation with statsmodels for linear models
   - Implementation with pytorch for a neural network
7. Second-order training of (deep) glm (chapter07_book.ipynb)
   - Expression of the 1st-order update for Poisson regression
   - Expression of the 2nd-order update for Poisson regression
   - Implementation of gradient descent for Poisson regression
   - Implementation of Newton-Raphson and natural gradient with numpy
   - Implementation of the L-BFGS algorithm with pytorch for deep regressions
   - Notion of quality of the estimation for comparison
8. Autoencoder compared to ipca and t-sne (chapter08_book.ipynb)
   - Introduction to the algebra for principal component analysis
   - Step-by-step implementation of principal component analysis
   - Implementation with scikit-learn of pca and (non)linear autoencoders
   - Implementation of t-sne with python from two modules
   - Implementation of random projection for large datasets
   - Notion of quality of the visualization for comparison
9. Solutions to selected exercises (chapter09_book.ipynb)
   - Several solutions for large datasets with scikit-learn
   - Several solutions for neural networks with pytorch
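
As a taste of the second-order training listed for Poisson regression, here is a minimal numpy sketch of Newton-Raphson. It is not taken from the notebooks: the function name `poisson_newton`, its arguments, and the simulated data are assumptions. With the log link, the mean is $\mu = \exp(X\beta)$, the gradient of the log-likelihood is $X^\top (y - \mu)$, and the negative Hessian is $X^\top \mathrm{diag}(\mu) X$.

```python
import numpy as np


def poisson_newton(X, y, n_iter=25, tol=1e-10):
    """Hypothetical helper: Newton-Raphson for Poisson regression with log link."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)            # current fitted means
        grad = X.T @ (y - mu)            # gradient of the log-likelihood
        neg_hess = X.T @ (X * mu[:, None])  # X^T diag(mu) X
        step = np.linalg.solve(neg_hess, grad)
        beta = beta + step               # Newton update
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated data (an assumption): intercept plus two Gaussian covariates.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([0.5, 0.3, -0.2])
y = rng.poisson(np.exp(X @ beta_true))

beta_hat = poisson_newton(X, y)
print("estimate:", np.round(beta_hat, 3))
print("truth:   ", beta_true)
```

For this model the log link is canonical, so the Fisher information equals the negative Hessian, and a natural-gradient step with unit step size coincides with the Newton step shown here.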

