Add custom loaders for function notation #75

Merged
merged 25 commits into from
Apr 19, 2021
2 changes: 2 additions & 0 deletions conda/tempo-examples.yaml
@@ -20,3 +20,5 @@ dependencies:
- openshift
- pytest==6.2.0
- pytest-asyncio==0.14.0
- pandas==1.0.1
- numpyro==0.6.0
344 changes: 344 additions & 0 deletions docs/examples/custom-model/README.ipynb
@@ -0,0 +1,344 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Serving a Custom Model\n",
"\n",
"This example walks you through how to deploy a custom model with Tempo.\n",
"In particular, we will walk you through how to write custom logic to run inference on a [`numpyro` model](http://num.pyro.ai/en/stable/).\n",
"\n",
"Note that we've picked `numpyro` for this example simply because it's not supported out of the box, but it should be possible to adapt this example easily to any other custom model."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"This notebook needs to be run in the `tempo-examples` conda environment defined below. Create it from the project root folder:\n",
"\n",
"```bash\n",
"conda env create --name tempo-examples --file conda/tempo-examples.yaml\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.core.magic import register_line_cell_magic\n",
"\n",
"@register_line_cell_magic\n",
"def writetemplate(line, cell):\n",
"    with open(line, 'w') as f:\n",
"        f.write(cell.format(**globals()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Training\n",
"\n",
"The first step will be to train our model.\n",
"This will be a very simple Bayesian regression model, based on an example provided in the [`numpyro` docs](https://nbviewer.jupyter.org/github/pyro-ppl/numpyro/blob/master/notebooks/source/bayesian_regression.ipynb).\n",
"\n",
"Since this is a probabilistic model, during training we will compute an approximation to the posterior distribution of our model using MCMC."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Original source code and more details can be found in:\n",
"# https://nbviewer.jupyter.org/github/pyro-ppl/numpyro/blob/master/notebooks/source/bayesian_regression.ipynb\n",
"\n",
"\n",
"import numpyro\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"from numpyro import distributions as dist\n",
"from jax import random\n",
"from numpyro.infer import MCMC, NUTS\n",
"\n",
"DATASET_URL = 'https://raw.githubusercontent.com/rmcelreath/rethinking/master/data/WaffleDivorce.csv'\n",
"dset = pd.read_csv(DATASET_URL, sep=';')\n",
"\n",
"standardize = lambda x: (x - x.mean()) / x.std()\n",
"\n",
"dset['AgeScaled'] = dset.MedianAgeMarriage.pipe(standardize)\n",
"dset['MarriageScaled'] = dset.Marriage.pipe(standardize)\n",
"dset['DivorceScaled'] = dset.Divorce.pipe(standardize)\n",
"\n",
"def model_function(marriage: np.ndarray = None, age: np.ndarray = None, divorce: np.ndarray = None):\n",
"    a = numpyro.sample('a', dist.Normal(0., 0.2))\n",
"    M, A = 0., 0.\n",
"    if marriage is not None:\n",
"        bM = numpyro.sample('bM', dist.Normal(0., 0.5))\n",
"        M = bM * marriage\n",
"    if age is not None:\n",
"        bA = numpyro.sample('bA', dist.Normal(0., 0.5))\n",
"        A = bA * age\n",
"    sigma = numpyro.sample('sigma', dist.Exponential(1.))\n",
"    mu = a + M + A\n",
"    numpyro.sample('obs', dist.Normal(mu, sigma), obs=divorce)\n",
"\n",
"# Start from this source of randomness. We will split keys for subsequent operations.\n",
"rng_key = random.PRNGKey(0)\n",
"rng_key, rng_key_ = random.split(rng_key)\n",
"\n",
"num_warmup, num_samples = 1000, 2000\n",
"\n",
"# Run NUTS.\n",
"kernel = NUTS(model_function)\n",
"mcmc = MCMC(kernel, num_warmup=num_warmup, num_samples=num_samples)\n",
"mcmc.run(rng_key_, marriage=dset.MarriageScaled.values, divorce=dset.DivorceScaled.values)\n",
"mcmc.print_summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Saving trained model\n",
"\n",
"Now that we have _trained_ our model, the next step will be to save it so that it can be loaded afterwards at serving-time.\n",
"Note that, since this is a probabilistic model, we will only need to save the traces that approximate the posterior distribution over latent parameters.\n",
"\n",
"This will get saved in a `numpyro-divorce.json` file."
]
},
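{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"\n",
"samples = mcmc.get_samples()\n",
"serialisable = {}\n",
"for k, v in samples.items():\n",
"    # JAX arrays are not JSON-serialisable, so convert them to plain lists.\n",
"    serialisable[k] = np.asarray(v).tolist()\n",
"\n",
"model_file_name = \"./artifacts/numpyro-divorce.json\"\n",
"# Create the target folder if it does not exist yet.\n",
"os.makedirs(os.path.dirname(model_file_name), exist_ok=True)\n",
"with open(model_file_name, 'w') as model_file:\n",
"    json.dump(serialisable, model_file)"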
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Serving\n",
"\n",
"The next step will be to serve our model through Tempo. \n",
"For that, we will implement a custom model to perform inference using our custom `numpyro` model.\n",
"Once our custom model is defined, we will be able to deploy it on any of the available runtimes using the same environment that we used for training."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Custom inference logic\n",
"\n",
"Our custom model will be responsible for:\n",
"\n",
"- Loading the model from the set of samples we saved previously.\n",
"- Running inference using our model structure, and the posterior approximated from the samples.\n",
"\n",
"With Tempo, this can be achieved as:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import json\n",
"import numpy as np\n",
"\n",
"from numpyro.infer import Predictive\n",
"from numpyro import distributions as dist\n",
"from jax import random\n",
"from tempo import model, ModelFramework\n",
"\n",
"local_folder = os.path.join(os.getcwd(), \"artifacts\")\n",
"\n",
"@model(\n",
"    name='numpyro-divorce',\n",
"    platform=ModelFramework.Custom,\n",
"    local_folder=local_folder,\n",
")\n",
"def numpyro_divorce(marriage: np.ndarray, age: np.ndarray) -> np.ndarray:\n",
"    rng_key = random.PRNGKey(0)\n",
"    predictions = numpyro_divorce.context.predictive_dist(\n",
"        rng_key=rng_key,\n",
"        marriage=marriage,\n",
"        age=age\n",
"    )\n",
"\n",
"    mean = predictions['obs'].mean(axis=0)\n",
"    return np.asarray(mean)\n",
"\n",
"@numpyro_divorce.loadmethod\n",
"def load_numpyro_divorce():\n",
"    model_uri = os.path.join(\n",
"        numpyro_divorce.details.local_folder,\n",
"        \"numpyro-divorce.json\"\n",
"    )\n",
"\n",
"    with open(model_uri) as model_file:\n",
"        raw_samples = json.load(model_file)\n",
"\n",
"    samples = {}\n",
"    for k, v in raw_samples.items():\n",
"        samples[k] = np.array(v)\n",
"\n",
"    numpyro_divorce.context.predictive_dist = Predictive(model_function, samples)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now test our custom logic by running inference locally."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"marriage = np.array([28.0])\n",
"age = np.array([63])\n",
"pred = numpyro_divorce(marriage=marriage, age=age)\n",
"\n",
"print(pred)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploying model\n",
"\n",
"Finally, we'll be able to deploy our model using Tempo against one of the available runtimes (i.e. Kubernetes, Docker or Seldon Deploy).\n",
"For this example, we will deploy the model using the Docker runtime."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"import os\n",
"PYTHON_VERSION = f\"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}\"\n",
"TEMPO_DIR = os.path.abspath(os.path.join(os.getcwd(), '..', '..', '..'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writetemplate ./artifacts/conda.yaml\n",
"name: tempo-numpyro\n",
"channels:\n",
"  - defaults\n",
"dependencies:\n",
"  - pip=21.0.1\n",
"  - python=3.7.9\n",
"  - pandas=1.0.1\n",
"  - pip:\n",
"    - mlops-tempo @ file://{TEMPO_DIR}\n",
"    - numpyro==0.6.0\n",
"    - mlserver==0.3.1.dev7"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from tempo.serve.loader import save\n",
"save(numpyro_divorce, save_env=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from tempo.seldon import SeldonDockerRuntime\n",
"\n",
"docker_runtime = SeldonDockerRuntime()\n",
"docker_runtime.deploy(numpyro_divorce)\n",
"docker_runtime.wait_ready(numpyro_divorce)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now test our model deployed in Docker as:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"numpyro_divorce.remote(marriage=marriage, age=age)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"docker_runtime.undeploy(numpyro_divorce)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
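
A note on the `%%writetemplate` magic used in the notebook above: it renders the cell with `cell.format(**globals())` before writing it to disk, which is how `{TEMPO_DIR}` ends up expanded in `conda.yaml`. The following is a minimal sketch of that substitution outside IPython. `render_template` and the example path are hypothetical names used only for illustration; they are not part of the notebook or of Tempo.

```python
def render_template(template: str, variables: dict) -> str:
    """Fill `{NAME}` placeholders with str.format, as the cell magic does."""
    return template.format(**variables)


# Hypothetical value; in the notebook TEMPO_DIR is derived from os.getcwd().
variables = {"TEMPO_DIR": "/path/to/tempo"}

template = (
    "dependencies:\n"
    "  - pip:\n"
    "    - mlops-tempo @ file://{TEMPO_DIR}\n"
)

rendered = render_template(template, variables)
print(rendered)

# Literal braces must be doubled, otherwise str.format reads them as placeholders.
escaped = render_template("labels: {{app: {NAME}}}", {"NAME": "demo"})
print(escaped)
```

One consequence of this approach is that any literal `{` or `}` in a templated file has to be written as `{{` or `}}`.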
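
A note on the value returned by `numpyro_divorce` in the notebook above: it is `predictions['obs'].mean(axis=0)`, the average of the predictive draws. Because the observation noise has zero mean, that average approximates the mean of `a + bM * marriage + bA * age` over the posterior draws. The sketch below computes this quantity directly from a dictionary laid out like `numpyro-divorce.json`. The draws are made-up numbers, `posterior_mean_prediction` is a hypothetical helper (not part of Tempo or `numpyro`), and treating a coefficient that is absent from the file as zero is a simplification of this sketch.

```python
import json
from statistics import mean


def posterior_mean_prediction(samples: dict, marriage: float, age: float) -> float:
    """Average a + bM * marriage + bA * age over the posterior draws."""
    n_draws = len(samples["a"])
    # Simplification: a coefficient missing from the file contributes nothing.
    bM = samples.get("bM", [0.0] * n_draws)
    bA = samples.get("bA", [0.0] * n_draws)
    return mean(
        a + m * marriage + g * age
        for a, m, g in zip(samples["a"], bM, bA)
    )


# Made-up draws with the same layout as the saved file (site name -> list of floats).
raw = json.dumps({"a": [0.0, 0.5], "bM": [0.5, 0.25], "sigma": [1.0, 1.0]})
samples = json.loads(raw)

prediction = posterior_mean_prediction(samples, marriage=2.0, age=1.0)
print(prediction)
```

Note that `sigma` does not enter the mean; it only affects the spread of the individual `obs` draws.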