Thursday, May 18, 9 a.m.–12:20 p.m.
Torsten Scholak, Diego Maniloff
There has been a surge of interest in probabilistic programming and Bayesian statistics. These techniques are tremendously useful because they help us understand, explain, and predict data by building a model that accounts for the data and is capable of synthesizing it. This is called the generative approach to statistical pattern recognition.
Estimating the parameters of Bayesian models has always been hard, and in many cases practically impossible for anyone but experts. However, recent advances in probabilistic programming have given us tools that can estimate models with many parameters on large amounts of data. In this tutorial, we will discuss one of these tools, Edward. Edward is a black-box tool, a Swiss army knife for Bayesian modeling that does not require knowledge of calculus or numerical integration. This puts the power of Bayesian statistics into the hands of everyone, not only experts in the field. Better still, Edward is implemented in Python, with its rich, beginner-friendly ecosystem, so we can start playing with it immediately.
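To make the generative idea concrete before any tooling is installed, here is a minimal sketch of the coin-flip case using only the standard library. It is not Edward code and not taken from the tutorial notebook: it assumes a Beta prior on the coin's bias, for which the posterior happens to have a closed form, so no inference engine is needed. The function names (`simulate_flips`, `beta_posterior`, `beta_mean`) and the chosen bias of 0.7 are hypothetical and serve only as illustration.

```python
import random


def simulate_flips(p_heads, n, seed=0):
    """Generative step: synthesize n coin flips (1 = heads) with bias p_heads."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_heads else 0 for _ in range(n)]


def beta_posterior(flips, alpha=1.0, beta=1.0):
    """Inference step (assumed Beta prior): a Beta(alpha, beta) prior combined
    with Bernoulli observations gives a Beta(alpha + heads, beta + tails) posterior."""
    heads = sum(flips)
    tails = len(flips) - heads
    return alpha + heads, beta + tails


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Synthesize data from a known bias, then try to recover that bias.
flips = simulate_flips(p_heads=0.7, n=200)
post_alpha, post_beta = beta_posterior(flips)
print("posterior mean of p_heads: %.3f" % beta_mean(post_alpha, post_beta))
```

Most models of practical interest have no such closed-form posterior, which is where a tool like Edward comes in.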
- Introduction
- Coin Flip
- A/B/... Testing
- Bayesian Curve Fitting
- Fitting Categorical Data
- The Bayesian Netflix Problem
This tutorial will be based on Python 2, though thanks to @magsol we know that it works with Python 3, too.
- Leave your current virtualenv if applicable:

      $ deactivate
- Create a new virtualenv called pycon (you can change the name and location of course):

      $ mkdir -p ~/VirtualEnv/pycon
      $ virtualenv -p $(which python2.7) ~/VirtualEnv/pycon
      $ source ~/VirtualEnv/pycon/bin/activate
- Install TensorFlow, Edward, and Seaborn to the virtualenv pycon:

      (pycon) $ pip install --upgrade tensorflow
      (pycon) $ pip install -e "git+https://github.com/blei-lab/edward.git#egg=edward"
      (pycon) $ pip install seaborn

If you have a CUDA-capable NVIDIA graphics chip in your laptop, you may want to follow special installation instructions available on the TensorFlow web page. If you don't, everything will work just fine as well.
- Install Jupyter Notebook into the pycon virtual environment:

      (pycon) $ pip install jupyter
      (pycon) $ ipython kernel install --user --name python2_pycon --display-name "Python 2 (PyCon)"
- Start the Jupyter Notebook:

      (pycon) $ jupyter notebook --no-browser
- Clone this repository:

      (pycon) $ git clone git@github.com:UnataInc/PyCon2017.git
- Open the notebook, tutorial.ipynb, in Jupyter and make sure the kernel is set to "Python 2 (PyCon)".
- (Optional) The notebook can render a video. For this to work properly, you need FFmpeg and one of two video encoding libraries:
  - H.264. Follow the installation instructions or use a packaging tool (e.g., sudo apt-get install ffmpeg on Ubuntu Linux or brew install ffmpeg on macOS).
  - WebM. Follow the installation instructions or use a packaging tool (e.g., brew install ffmpeg --with-libvpx on macOS).
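Once the steps above are done, a quick sanity check can save time before opening the notebook. The following is a minimal sketch, not part of the tutorial repository: it assumes the packages are importable under the names tensorflow, edward, and seaborn, and that the FFmpeg executable is called ffmpeg. The helper names `check_modules` and `check_setup` are hypothetical. It is written to run under both Python 2.7 and Python 3.

```python
import importlib

try:
    from shutil import which  # Python 3
except ImportError:
    from distutils.spawn import find_executable as which  # Python 2.7


def check_modules(names):
    """Return {module name: True/False}, depending on whether the import succeeds."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # A broad catch on purpose: a mismatched install can fail with
            # errors other than ImportError.
            status[name] = False
    return status


def check_setup():
    """Check the Python packages plus the optional ffmpeg executable."""
    status = check_modules(["tensorflow", "edward", "seaborn"])
    status["ffmpeg"] = which("ffmpeg") is not None
    return status


report = check_setup()
for name in sorted(report):
    print("%-12s %s" % (name, "ok" if report[name] else "missing"))
```

Run it with the pycon virtualenv activated. A missing ffmpeg only affects the optional video rendering.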