Update the doc for OpenAI models
rlouf committed Jun 14, 2024
1 parent ed8e6ec commit 320af22
Showing 3 changed files with 129 additions and 41 deletions.
4 changes: 1 addition & 3 deletions docs/reference/models/mlxlm.md
@@ -16,7 +16,7 @@ model = models.mlxlm("mlx-community/mlx-community/Meta-Llama-3-8B-Instruct-8bit"

With the loaded model, you can generate text or perform structured generation, e.g.

```python3
```python
from outlines import models, generate

model = models.mlxlm("mlx-community/Meta-Llama-3-8B-Instruct-8bit")
@@ -28,5 +28,3 @@ model_output = generator("What's Jennys Number?\n")
print(model_output)
# '8675309'
```

For more examples, see the [cookbook](cookbook/index.md).
147 changes: 118 additions & 29 deletions docs/reference/models/openai.md
@@ -1,81 +1,170 @@
# Generate text with the OpenAI and compatible APIs
# OpenAI and compatible APIs

!!! Installation

You need to install the `openai` and `tiktoken` libraries to be able to use the OpenAI API in Outlines.

Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. The following models can be used with Outlines:
## OpenAI models

Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. You can initialize the model by passing the model name to `outlines.models.openai`:

```python
from outlines import models


model = models.openai("gpt-3.5-turbo")
model = models.openai("gpt-4")
model = models.openai("gpt-4-turbo")
model = models.openai("gpt-4o")
```

Check the [OpenAI documentation](https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4) for an up-to-date list of available models. You can pass any parameter you would pass to `openai.AsyncOpenAI` as keyword arguments:

```python
import os
from outlines import models


print(type(model))
# OpenAI
model = models.openai(
"gpt-3.5-turbo",
api_key=os.environ["OPENAI_API_KEY"]
)
```

Outlines also supports Azure OpenAI models:
The following table enumerates the possible parameters. Refer to the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/_client.py) for an up-to-date list.

**Parameters:**

| **Parameters** | **Type** | **Description** | **Default** |
|----------------|:---------|:----------------|:------------|
| `api_key` | `str` | OpenAI API key. Inferred from `OPENAI_API_KEY` if not specified. | `None` |
| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified. | `None` |
| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified. | `None` |
| `base_url` | `str | httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified. | `None` |
| `timeout` | `float` | Request timeout. | `NOT_GIVEN` |
| `max_retries` | `int` | Maximum number of retries for failing requests. | `2` |
| `default_headers` | `Mapping[str, str]` | Default HTTP headers. | `None` |
| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries. | `None` |
| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client. | `None` |
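
The "inferred from the environment" behavior described in the table can be pictured with a small standalone sketch. It does not call the OpenAI SDK and is not its implementation: `resolve_client_kwargs` and `ENV_FALLBACKS` are hypothetical names introduced for illustration, and only the environment variable names are taken from the table.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the parameter table, keyed by parameter name.
ENV_FALLBACKS = {
    "api_key": "OPENAI_API_KEY",
    "organization": "OPENAI_ORG_ID",
    "project": "OPENAI_PROJECT_ID",
    "base_url": "OPENAI_BASE_URL",
}


def resolve_client_kwargs(
    explicit: Mapping[str, Optional[str]],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper: explicit values win, otherwise fall back to env."""
    env = os.environ if env is None else env
    resolved = {}
    for name, variable in ENV_FALLBACKS.items():
        value = explicit.get(name)
        if value is None:
            # Parameter not passed explicitly: look up the environment variable.
            value = env.get(variable)
        if value is not None:
            resolved[name] = value
    return resolved


# A fake environment is passed in so the example does not depend on real secrets.
example = resolve_client_kwargs(
    {"organization": "org-explicit"},
    env={"OPENAI_API_KEY": "sk-from-env", "OPENAI_ORG_ID": "org-from-env"},
)
print(example)
```

In this sketch the explicitly passed `organization` takes precedence over `OPENAI_ORG_ID`, while `api_key` is picked up from the (fake) environment.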

## Azure OpenAI models

Outlines also supports Azure OpenAI models:

```python
from outlines import models


model = models.azure_openai(
"azure-deployment-name",
"gpt-3.5-turbo",
api_version="2023-07-01-preview",
azure_endpoint="https://example-endpoint.openai.azure.com",
)
```

More generally, you can use any API client compatible with the OpenAI interface by passing an instance of the client, a configuration, and optionally the corresponding tokenizer (if you want to be able to use `outlines.generate.choice`):
!!! Question "Why do I need to specify model and deployment name?"

```python
from openai import AsyncOpenAI
import tiktoken
The model name is needed to load the correct tokenizer for the model. The tokenizer is necessary for structured generation.

from outlines.models.openai import OpenAI, OpenAIConfig

config = OpenAIConfig(model="gpt-4")
client = AsyncOpenAI()
tokenizer = tiktoken.encoding_for_model("gpt-4")
You can pass any parameter you would pass to `openai.AsyncAzureOpenAI`. You can consult the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/lib/azure.py) for an up-to-date list.

model = OpenAI(client, config, tokenizer)
```
**Parameters:**


## Monitoring API use
| **Parameters** | **Type** | **Description** | **Default** |
|----------------|:---------|:----------------|:------------|
| `azure_endpoint` | `str` | Azure endpoint, including the resource. Inferred from `AZURE_OPENAI_ENDPOINT` if not specified. | `None` |
| `api_version` | `str` | API version. Inferred from `OPENAI_API_VERSION` if not specified. | `None` |
| `api_key` | `str` | Azure OpenAI API key. Inferred from `AZURE_OPENAI_API_KEY` if not specified. | `None` |
| `azure_ad_token` | `str` | Azure Active Directory token. Inferred from `AZURE_OPENAI_AD_TOKEN` if not specified. | `None` |
| `azure_ad_token_provider` | `AzureADTokenProvider` | A function that returns an Azure Active Directory token. | `None` |
| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified. | `None` |
| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified. | `None` |
| `base_url` | `str | httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified. | `None` |
| `timeout` | `float` | Request timeout. | `NOT_GIVEN` |
| `max_retries` | `int` | Maximum number of retries for failing requests. | `2` |
| `default_headers` | `Mapping[str, str]` | Default HTTP headers. | `None` |
| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries. | `None` |
| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client. | `None` |

It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:
## Models that follow the OpenAI standard

```python
import outlines.models
Outlines supports models that follow the OpenAI standard. Initialize a properly configured OpenAI client and pass it to `outlines.models.openai`:

model = models.openai("gpt-4")
```python
import os
from openai import AsyncOpenAI
from outlines import models
from outlines.models.openai import OpenAIConfig

print(model.prompt_tokens)
# 0

print(model.completion_tokens)
# 0
client = AsyncOpenAI(
api_key=os.environ.get("PROVIDER_KEY"),
base_url="http://other.provider.server.com"
)
config = OpenAIConfig("model_name")
model = models.openai(client, config)
```

These numbers are updated every time you call the model.
!!! Warning

You need to pass the async client to be able to do batch inference.

## Advanced configuration

For more advanced configuration options, such as proxy support, please consult the [OpenAI SDK's documentation](https://github.com/openai/openai-python):


```python
import httpx
from openai import AsyncOpenAI, DefaultAsyncHttpxClient
from outlines import models
from outlines.models.openai import OpenAIConfig


## Advanced usage
client = AsyncOpenAI(
    base_url="http://my.test.server.example.com:8083",
    # AsyncOpenAI expects an async httpx client and an async transport
    http_client=DefaultAsyncHttpxClient(
        proxies="http://my.test.proxy.example.com",
        transport=httpx.AsyncHTTPTransport(local_address="0.0.0.0"),
    ),
)
config = OpenAIConfig("model_name")
model = models.openai(client, config)
```

It is possible to specify the values for `seed`, `presence_penalty`, `frequency_penalty`, and `top_p` by passing an instance of `OpenAIConfig` when initializing the model:

```python
from outlines.models.openai import OpenAIConfig
from outlines import models


config = OpenAIConfig(
presence_penalty=1.,
frequence_penalty=1.,
frequency_penalty=1.,
top_p=.95,
seed=0,
)
model = models.openai("gpt-4", config=config)
model = models.openai("gpt-3.5-turbo", config)
```
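
The rename from `frequence_penalty` to `frequency_penalty` in the block above is the kind of mistake that a typed configuration object catches early. The following is a minimal sketch of that idea, not Outlines' `OpenAIConfig`: `SamplingSettings` is a hypothetical stand-in, the `[-2, 2]` penalty range follows the OpenAI Chat Completions API documentation, and the `[0, 1]` bound on `top_p` is an assumption based on it being a probability mass.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplingSettings:
    """Hypothetical stand-in for the sampling fields shown above."""

    presence_penalty: float = 0.0
    frequency_penalty: float = 0.0
    top_p: float = 1.0
    seed: Optional[int] = None

    def __post_init__(self) -> None:
        # Penalties: range documented for the OpenAI Chat Completions API.
        for name in ("presence_penalty", "frequency_penalty"):
            value = getattr(self, name)
            if not -2.0 <= value <= 2.0:
                raise ValueError(f"{name} must be in [-2, 2], got {value}")
        # top_p is a probability mass (assumed bound for this sketch).
        if not 0.0 <= self.top_p <= 1.0:
            raise ValueError(f"top_p must be in [0, 1], got {self.top_p}")


settings = SamplingSettings(
    presence_penalty=1.0,
    frequency_penalty=1.0,
    top_p=0.95,
    seed=0,
)

# A misspelled field name fails immediately instead of being silently ignored.
try:
    SamplingSettings(frequence_penalty=1.0)
except TypeError as error:
    misspelling_error = str(error)
```

Because the dataclass only accepts its declared fields, the misspelled keyword raises a `TypeError` at construction time rather than reaching the API.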

## Monitoring API use

It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:

```python
from outlines import models


model = models.openai("gpt-4")

print(model.prompt_tokens)
# 0

print(model.completion_tokens)
# 0
```

These numbers are updated every time you call the model.
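
To illustrate how two running counters like these can be turned into a spending estimate, here is a small standalone sketch. It does not use Outlines or the OpenAI SDK: `UsageTracker` is a hypothetical class that only mirrors the `prompt_tokens` and `completion_tokens` attribute names shown above, and the per-1k-token prices are placeholder numbers, not real pricing.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Hypothetical accumulator mirroring the two counters shown above."""

    prompt_tokens: int = 0
    completion_tokens: int = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        # Add the token counts reported for one call to the running totals.
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens

    def estimated_cost(
        self, prompt_price_per_1k: float, completion_price_per_1k: float
    ) -> float:
        # Prices are expressed per 1,000 tokens, hence the division.
        total = (
            self.prompt_tokens * prompt_price_per_1k
            + self.completion_tokens * completion_price_per_1k
        )
        return total / 1000


tracker = UsageTracker()
tracker.record(prompt_tokens=120, completion_tokens=30)
tracker.record(prompt_tokens=80, completion_tokens=70)

# Placeholder prices, chosen only to make the arithmetic easy to follow.
cost = tracker.estimated_cost(prompt_price_per_1k=0.5, completion_price_per_1k=1.5)
print(tracker.prompt_tokens, tracker.completion_tokens, cost)
```

With a real model instance, the same arithmetic could be applied to `model.prompt_tokens` and `model.completion_tokens` after a batch of calls.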
19 changes: 10 additions & 9 deletions mkdocs.yml
@@ -122,15 +122,16 @@ nav:
- Prompt templating: reference/prompting.md
- Outlines functions: reference/functions.md
- Models:
- vLLM: reference/models/vllm.md
- Llama.cpp: reference/models/llamacpp.md
- Transformers: reference/models/transformers.md
- MLX: reference/models/mlxlm.md
- ExllamaV2: reference/models/exllamav2.md
- Mamba: reference/models/mamba.md
- OpenAI: reference/models/openai.md
- TGI: reference/models/tgi.md

- Open source:
- Transformers: reference/models/transformers.md
- Llama.cpp: reference/models/llamacpp.md
- vLLM: reference/models/vllm.md
- TGI: reference/models/tgi.md
- ExllamaV2: reference/models/exllamav2.md
- MLX: reference/models/mlxlm.md
- Mamba: reference/models/mamba.md
- API:
- OpenAI: reference/models/openai.md
- API Reference:
- api/index.md
- api/models.md