Add user docs for Turing/Turing SDK #174

Merged
merged 13 commits into from
Mar 17, 2022
Binary file added docs/.gitbook/assets/pyfunc_ensembler_config.png
23 changes: 23 additions & 0 deletions docs/how-to/create-a-router/configure-ensembler.md
@@ -42,6 +42,29 @@ Configure the resources required for the ensembler. There are 3 required inputs,

**Min/Max Replicas**: Min/max number of replicas for your ensembler. Scaling of the ensembler based on traffic volume will be automatically done for you.

## Pyfunc Ensembler
Turing will deploy a previously registered pyfunc ensembler (refer to
[the samples](https://github.com/gojek/turing/tree/main/sdk/samples) in the SDK section for more information on how to
deploy one) as a containerised web service.

This allows you to simply define the logic required for the ensembling
step by implementing a Python `mlflow`-based interface, and rely on Turing API to containerise and package your
implementation as an entire web service automatically.

To configure your router with a Pyfunc ensembler, select your desired ensembler from the drop-down list of ensemblers
registered in your current project. You'll also need to specify a timeout value and resource request values:

![](../../.gitbook/assets/pyfunc_ensembler_config.png)

**Pyfunc Ensembler**: The name of the pyfunc ensembler that has been deployed in your *current* project.

**Timeout**: Request timeout for the ensembler. A request to the ensembler that exceeds this value is terminated.

**CPU**: Total amount of CPU available for your ensembler.

**Memory**: Total amount of RAM available for your ensembler.

**Min/Max Replicas**: Min/max number of replicas for your ensembler. Scaling of the ensembler based on traffic volume will be automatically done for you.

## External Ensembler
Coming Soon.
29 changes: 29 additions & 0 deletions sdk/docs/README.md
@@ -0,0 +1,29 @@
# Introduction
The Turing SDK is a Python tool for interacting with the Turing API, and complements the existing Turing UI available
for managing router creation, deployment, versioning, etc.

It not only allows you to build your routers in an incremental and configurable manner, but also
lets you write imperative scripts that automate various router modification and deployment
processes, simplifying your workflow when interacting with the Turing API.

## What is the Turing SDK?
The Turing SDK is written entirely in Python and acts as a wrapper around the classes automatically generated (by
[OpenAPI Generator](https://github.com/OpenAPITools/openapi-generator)) from the OpenAPI specs written for the Turing
API. These generated classes in turn act as an intermediary between the SDK and the raw JSON objects that are passed
in the HTTP requests made to, and the responses received from, the Turing API.

![Turing SDK Classes](./assets/turing-sdk-classes.png)
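
The layering described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `GeneratedRoute`, `SdkRoute`, `to_open_api` and `to_request_body` are hypothetical stand-ins written for this example, not the actual generated or SDK classes, and the endpoint check is just a placeholder for the kind of validation a wrapper might add.

```python
import json
from dataclasses import dataclass


@dataclass
class GeneratedRoute:
    """Hypothetical stand-in for a class emitted by OpenAPI Generator."""
    id: str
    endpoint: str
    timeout: str

    def to_dict(self) -> dict:
        # Generated classes know how to turn themselves into raw JSON-ready dicts
        return {"id": self.id, "endpoint": self.endpoint, "timeout": self.timeout}


class SdkRoute:
    """Hypothetical stand-in for a hand-written SDK wrapper class."""

    def __init__(self, id: str, endpoint: str, timeout: str):
        # The wrapper layer is a natural place for friendlier validation
        if not endpoint.startswith(("http://", "https://")):
            raise ValueError(f"invalid endpoint: {endpoint}")
        self._generated = GeneratedRoute(id=id, endpoint=endpoint, timeout=timeout)

    def to_open_api(self) -> GeneratedRoute:
        # Hand the underlying generated object to the HTTP layer
        return self._generated


def to_request_body(route: SdkRoute) -> str:
    # SDK object -> generated object -> raw JSON sent in the HTTP request
    return json.dumps(route.to_open_api().to_dict(), sort_keys=True)


body = to_request_body(SdkRoute("control", "http://control.endpoints/predict", "20ms"))
print(body)
```

The point of the sketch is only the direction of the conversion: user code touches the wrapper, the wrapper owns a generated object, and the generated object is what gets serialised.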

If you have used Turing or the Turing UI and would like more control over router
management, the Turing SDK is a good fit for your needs.

Note that using the Turing SDK assumes that you have basic knowledge of what Turing does and how Turing routers
operate. If you are unsure of these, refer to the Turing UI [docs](https://github.com/gojek/turing/tree/main/docs/how-to) and
familiarise yourself with them first. A list of useful and important concepts used in Turing can also be found
[here](https://github.com/gojek/turing/blob/main/docs/concepts.md).

Some functionality available in the UI is not available in the Turing SDK, e.g. creating new projects.

## Samples
Samples of how the Turing SDK can be used to manage routers can be found
[here](https://github.com/gojek/turing/tree/main/sdk/samples).
Binary file added sdk/docs/assets/turing-sdk-classes.png
3 changes: 3 additions & 0 deletions sdk/samples/router/create_from_existing_router.py
@@ -150,6 +150,9 @@ def main(turing_api: str, project: str):
router_1 = turing.Router.get(router.id)

# Now we'd like to create a new router that's similar to router_1, but with some configs modified
# Reminder: When trying to replicate configuration from an existing router, always retrieve the underlying
# `RouterConfig` from the `Router` instance by accessing its `config` attribute.

# Get the router config from router_1
router_config = router_1.config

155 changes: 155 additions & 0 deletions sdk/samples/router/create_router_with_pyfunc_ensembler.py
@@ -0,0 +1,155 @@
import turing
import turing.batch
import turing.batch.config
import turing.router.config.router_config
from turing.router.config.route import Route
from turing.router.config.router_config import RouterConfig
from turing.router.config.router_version import RouterStatus
from turing.router.config.resource_request import ResourceRequest
from turing.router.config.log_config import LogConfig, ResultLoggerType
from turing.router.config.router_ensembler_config import PyfuncRouterEnsemblerConfig
from turing.router.config.experiment_config import ExperimentConfig

from typing import List, Any


# To register a pyfunc ensembler to be used in a Turing router, implement the `turing.ensembler.PyFunc` interface
class SampleEnsembler(turing.ensembler.PyFunc):
    """
    A simple ensembler that returns the value corresponding to the version that has been specified in the
    `features` in each request. This value is obtained from the route responses found in the `predictions` in each
    request.

    If no version is specified in `features`, return the sum of all the values of all the route responses in
    `predictions` instead.

    e.g. The values in the route responses (`predictions`) corresponding to the versions `a`, `b` and `c` are 1, 2
    and 3 respectively.

    For a given request, if the version specified in `features` is "a", the ensembler would return the value 1.

    If no version is specified in `features`, the ensembler would return the value 6 (1 + 2 + 3).
    """
    # `initialize` is essentially a method that gets called when an object of your implemented class gets instantiated
    def initialize(self, artifacts: dict):
        pass

    # Each time a Turing Router sends a request to a pyfunc ensembler, `ensemble` will be called, with the request
    # payload being passed as the `features` argument, and the route responses as the `predictions` argument.
    #
    # If an experiment has been set up, the experiment returned would also be passed as the `treatment_config` argument.
    #
    # The return value of `ensemble` will then be returned as a `json` payload to the Turing router.
    def ensemble(
            self,
            features: dict,
            predictions: List[dict],
            treatment_config: dict) -> Any:
        # Get a mapping between route names and their corresponding responses
        routes_to_response = dict()
        for prediction in predictions:
            routes_to_response[prediction["route"]] = prediction

        if "version" in features:
            return routes_to_response[features["version"]]["data"]["value"]
        else:
            return sum(response["data"]["value"] for response in routes_to_response.values())


def main(turing_api: str, project: str):
    # Initialize Turing client
    turing.set_url(turing_api)
    turing.set_project(project)

    # Register an ensembler with Turing:
    ensembler = turing.PyFuncEnsembler.create(
        name="sample-ensembler-1",
        ensembler_instance=SampleEnsembler(),
        conda_env={
            'dependencies': [
                'python>=3.7.0',
                # other dependencies, if required
            ]
        }
    )
    print("Ensembler created:\n", ensembler)
Review comment from the PR author: Added a new file to demonstrate the deployment of a router with a pyfunc ensembler.


    # Build a router config in order to create a router
    # Create some routes
    routes = [
        Route(
            id='control',
            endpoint='http://control.endpoints/predict',
            timeout='20ms'
        ),
        Route(
            id='experiment-a',
            endpoint='http://experiment-a.endpoints/predict',
            timeout='20ms'
        )
    ]

    # Create an experiment config
    experiment_config = ExperimentConfig(
        type="nop"
    )

    # Create a resource request config for the router
    resource_request = ResourceRequest(
        min_replica=0,
        max_replica=2,
        cpu_request="500m",
        memory_request="512Mi"
    )

    # Create a log config for the router
    log_config = LogConfig(
        result_logger_type=ResultLoggerType.NOP
    )

    # Create an ensembler for the router
    ensembler_config = PyfuncRouterEnsemblerConfig(
        project_id=1,
        ensembler_id=1,
        resource_request=ResourceRequest(
            min_replica=0,
            max_replica=2,
            cpu_request="500m",
            memory_request="512Mi"
        ),
        timeout="60ms",
    )

    # Create the RouterConfig instance
    router_config = RouterConfig(
        environment_name="id-dev",
        name="router-with-pyfunc-ensembler",
        routes=routes,
        rules=[],
        default_route_id="test",
        experiment_engine=experiment_config,
        resource_request=resource_request,
        timeout="100ms",
        log_config=log_config,
        ensembler=ensembler_config
    )

    # Create a new router using the RouterConfig object
    new_router = turing.Router.create(router_config)
    print(f"You have created a router with id: {new_router.id}")

    # Wait for the router to get deployed
    try:
        new_router.wait_for_status(RouterStatus.DEPLOYED)
    except TimeoutError:
        raise Exception(f"Turing API is taking too long for router {new_router.id} to get deployed.")

    # List all routers
    routers = turing.Router.list()
    for r in routers:
        print(r)


if __name__ == '__main__':
    import fire
    fire.Fire(main)
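
The worked example in the `SampleEnsembler` docstring can be checked locally without a Turing deployment. The sketch below copies the body of `ensemble` into a plain function and feeds it the values from the docstring. The payload shape (a `route` key and a nested `data.value` key per prediction) is taken from how the sample indexes `predictions`; it is an assumption about what the router actually sends, and the standalone `ensemble` function here is a stand-in, not part of the SDK.

```python
from typing import Any, List, Optional


def ensemble(features: dict, predictions: List[dict], treatment_config: Optional[dict] = None) -> Any:
    # Map each route name to its response, as the sample ensembler does
    routes_to_response = {prediction["route"]: prediction for prediction in predictions}

    if "version" in features:
        # A version was requested: return the value from that route only
        return routes_to_response[features["version"]]["data"]["value"]
    # No version requested: return the sum over all route responses
    return sum(response["data"]["value"] for response in routes_to_response.values())


# Assumed payload shape, using the values from the docstring (a=1, b=2, c=3)
predictions = [
    {"route": "a", "data": {"value": 1}},
    {"route": "b", "data": {"value": 2}},
    {"route": "c", "data": {"value": 3}},
]

print(ensemble({"version": "a"}, predictions))  # 1
print(ensemble({}, predictions))                # 6
```

Note that a `version` that does not match any route name would raise a `KeyError` in this logic, so a production ensembler may want an explicit fallback.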
45 changes: 44 additions & 1 deletion sdk/samples/router/general.py
@@ -20,6 +20,10 @@ def main(turing_api: str, project: str):
turing.set_project(project)

# Build a router config in order to create a router
# Note: When constructing a `RouterConfig` object from scratch, it is **highly recommended** that you construct each
# individual component using the Turing SDK classes provided instead of using `dict` objects which do not perform
# any schema validation.

# Create some routes
routes = [
Route(
@@ -50,6 +54,12 @@ def main(turing_api: str, project: str):
]

# Create some traffic rules
# Note: Each traffic rule is defined by at least one `TrafficRuleCondition` and one route. Routes are essentially
# the `id`s of `Route` objects that you intend to specify for the entire `TrafficRule`.
#
# When defining a traffic rule, one would need to decide between using a `HeaderTrafficRuleCondition` or a
# `PayloadTrafficRuleCondition`. These subclasses can be used to build a `TrafficRuleCondition` without having to
# manually set attributes such as `field_source` or `operator`.
rules = [
TrafficRule(
conditions=[
@@ -112,7 +122,15 @@ def main(turing_api: str, project: str):
)
]

# Create an experiment config (
# Create an experiment config
# The `ExperimentConfig` class is a simple container to carry configuration related to an experiment to be used by a
# Turing Router. Note that as Turing does not create experiments automatically, you would need to create your
# experiments separately prior to specifying their configuration here.
#
# Also, notice that `ExperimentConfig` does not contain any fixed schema as it simply carries configuration for
# generic experiment engines, which are used as plug-ins for Turing. When building an `ExperimentConfig` from
# scratch, you would need to consider the underlying schema for the `config` attribute as well as the appropriate
# `type` that corresponds to your selected experiment engine.
experiment_config = ExperimentConfig(
type="test-exp",
config={
@@ -128,6 +146,9 @@ def main(turing_api: str, project: str):
)

# Create a resource request config for the router
# Note: The units for CPU and memory requests are measured in cpu units and bytes respectively. You may wish to
# read more about how these are measured here:
# https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/.
resource_request = ResourceRequest(
min_replica=0,
max_replica=2,
@@ -136,6 +157,12 @@ def main(turing_api: str, project: str):
)

# Create a log config for the router
# Note: Logging for Turing Routers is done through BigQuery or Kafka, and its configuration is managed by the
# `LogConfig` class. Two helper classes (child classes of `LogConfig`) have been created to assist you in
# constructing these objects - `BigQueryLogConfig` and `KafkaLogConfig`.
#
# If you do not intend to use any logging, simply create a regular `LogConfig` object with `result_logger_type` set
# to `ResultLoggerType.NOP`, without defining the other arguments.
log_config = LogConfig(
result_logger_type=ResultLoggerType.NOP
)
@@ -161,6 +188,10 @@ def main(turing_api: str, project: str):
)

# Create an ensembler for the router
# Note: Ensembling for Turing Routers is done through Standard, Docker or Pyfunc ensemblers, and its configuration
# is managed by the `RouterEnsemblerConfig` class. Three helper classes (child classes of `RouterEnsemblerConfig`)
# have been created to assist you in constructing these objects - `StandardRouterEnsemblerConfig`,
# `DockerRouterEnsemblerConfig` and `PyfuncRouterEnsemblerConfig`.
ensembler = DockerRouterEnsemblerConfig(
image="ealen/echo-server:0.5.1",
resource_request=ResourceRequest(
@@ -191,6 +222,10 @@ def main(turing_api: str, project: str):
)

# 1. Create a new router using the RouterConfig object
# Note: A `Router` object represents a router that is created on Turing API. It does not (and should not) ever be
# created manually by using its constructor directly. Instead, you should only work with `Router` instances that
# get returned by the various `Router` class and instance methods that interact with Turing API, such as the one
# below.
new_router = turing.Router.create(router_config)
print(f"1. You have created a router with id: {new_router.id}")

@@ -229,6 +264,14 @@ def main(turing_api: str, project: str):
print(f"4. You have just updated your router with a new config.")

# 5. List all the router config versions of your router
# Note: A `RouterVersion` represents a single version (and configuration) of a Turing Router. Just like `Router`
# objects, they should almost never be created manually by using their constructor.
#
# Besides accessing the basic attributes of a `RouterVersion` object directly, you may also consider retrieving
# the entire router configuration from a specific `RouterVersion` object as a `RouterConfig` for further
# manipulation by performing something like:
#
# `my_config = router_version.get_config()`
my_router_versions = my_router.list_versions()
print(f"5. You have just retrieved a list of {len(my_router_versions)} versions for your router:")
for ver in my_router_versions:
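
The resource request note in `general.py` points to the Kubernetes documentation for CPU and memory units. As a rough illustration of what values such as `"500m"` and `"512Mi"` mean, the sketch below converts a subset of Kubernetes-style quantity strings into cores and bytes. The helpers `cpu_to_cores` and `memory_to_bytes` are hypothetical names written for this example; they are not part of the Turing SDK, and they only cover the common suffixes, not the full Kubernetes quantity grammar.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal (power-of-ten) memory suffixes
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def cpu_to_cores(quantity: str) -> Decimal:
    # "500m" means 500 millicores, i.e. half a CPU; a bare number means whole CPUs
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def memory_to_bytes(quantity: str) -> int:
    # Try each known suffix; a bare number is already a byte count
    for suffix, factor in {**_BINARY, **_DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(quantity)


print(cpu_to_cores("500m"))      # 0.5
print(memory_to_bytes("512Mi"))  # 536870912
```

So the sample's `cpu_request="500m"` asks for half a CPU core per replica, and `memory_request="512Mi"` asks for 512 mebibytes.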
2 changes: 2 additions & 0 deletions sdk/setup.py
@@ -30,4 +30,6 @@
'dev': dev_requirements
},
python_requires='>=3.7',
long_description=pathlib.Path('./docs/README.md').read_text(),
long_description_content_type='text/markdown'
)
2 changes: 1 addition & 1 deletion sdk/tests/router/config/traffic_rule_test.py
@@ -107,7 +107,7 @@ def test_create_payload_traffic_rule_condition(field, values, expected, request)
[
HeaderTrafficRuleCondition(
field="x-region",
values= ["region-a", "region-b"],
values=["region-a", "region-b"],
),
PayloadTrafficRuleCondition(
field="service_type.id",