Merged
Changes from 4 commits
17 commits
34bb28d
feat(xai): add grok-4.20 beta 2 models with pricing (#23900)
ishaan-jaff Mar 17, 2026
6b2e56f
docs: add Quick Install section for litellm --setup wizard (#23905)
ishaan-jaff Mar 17, 2026
212f29f
feat(setup): interactive setup wizard + install.sh (#23644)
ishaan-jaff Mar 17, 2026
2f7dcba
feat(ui): remove Chat UI page link and banner from sidebar and playgr…
ishaan-jaff Mar 17, 2026
d9a6036
feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MC…
ishaan-jaff Mar 18, 2026
fde9062
fix(ci): stabilize CI - formatting, type errors, test polling, securi…
cursoragent Mar 18, 2026
05f3ad4
chore: regenerate poetry.lock to sync with pyproject.toml
cursoragent Mar 18, 2026
2c02b68
Merge remote-tracking branch 'origin/main' into litellm_ishaan_march_17
cursoragent Mar 18, 2026
f6d53dc
fix: format merged files from main and regenerate poetry.lock
cursoragent Mar 18, 2026
cffc92b
fix(mypy): annotate jwt_claims as Optional[dict] to fix type incompat…
cursoragent Mar 18, 2026
951ecff
fix(ci): update router region test to use gpt-4.1-mini (fix flaky mod…
cursoragent Mar 18, 2026
399a120
ci: retry flaky logging_testing (async event loop race condition)
cursoragent Mar 18, 2026
233b1d3
fix(ci): aggregate all mock calls in langfuse e2e test to fix race co…
cursoragent Mar 18, 2026
b763c87
fix(ci): black formatting + update OpenAPI compliance tests for spec …
cursoragent Mar 18, 2026
98c1aa8
revert: undo incorrect Black 26.x formatting on litellm_logging.py
cursoragent Mar 18, 2026
83f7481
fix(ci): deduplicate and sort langfuse batch events after aggregation
cursoragent Mar 18, 2026
8c23f9f
Merge branch 'main' into litellm_ishaan_march_17
ishaan-jaff Mar 18, 2026
5 changes: 5 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
@@ -140,6 +140,11 @@ LiteLLM is a unified interface for 100+ LLM providers with two main components:
- **Check index coverage.** For new or modified queries, check `schema.prisma` for a supporting index. Prefer extending an existing index (e.g. `@@index([a])` → `@@index([a, b])`) over adding a new one, unless it's a `@@unique`. Only add indexes for large/frequent queries.
- **Keep schema files in sync.** Apply schema changes to all `schema.prisma` copies (`schema.prisma`, `litellm/proxy/`, `litellm-proxy-extras/`, `litellm-js/spend-logs/` for SpendLogs) with a migration under `litellm-proxy-extras/litellm_proxy_extras/migrations/`.

### Setup Wizard (`litellm/setup_wizard.py`)
- The wizard is implemented as a single `SetupWizard` class with `@staticmethod` methods — keep it that way. No module-level functions except `run_setup_wizard()` (the public entrypoint) and pure helpers (color, ANSI).
- Use `litellm.utils.check_valid_key(model, api_key)` for credential validation — never roll a custom completion call.
- Do not hardcode provider env-key names or model lists that already exist in the codebase. Add a `test_model` field to each provider entry to drive `check_valid_key`; set it to `None` for providers that can't be validated with a single API key (Azure, Bedrock, Ollama).
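
A minimal sketch of the shape these rules describe, not the actual contents of `litellm/setup_wizard.py`. The `PROVIDERS` table, the `validate_key` method name, the `gpt-4o-mini` test model, and the injected `validate` callable are all hypothetical. In the real wizard the validator would be `litellm.utils.check_valid_key`, and provider data should come from what already exists in the codebase rather than being hardcoded as it is here for illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical provider table, for illustration only. `test_model` drives
# validation; None marks providers a single API key cannot validate.
PROVIDERS: Dict[str, Dict[str, Optional[str]]] = {
    "openai": {"test_model": "gpt-4o-mini"},
    "azure": {"test_model": None},
    "ollama": {"test_model": None},
}


class SetupWizard:
    """All wizard logic lives in static methods on one class."""

    @staticmethod
    def validate_key(
        provider: str,
        api_key: str,
        validate: Callable[[str, str], bool],
    ) -> Optional[bool]:
        """Return the validator's verdict, or None when validation is skipped."""
        test_model = PROVIDERS[provider]["test_model"]
        if test_model is None:
            # Nothing to probe with a single key, so report "not checked".
            return None
        # `validate` has the same (model, api_key) shape as check_valid_key.
        return validate(test_model, api_key)
```

Passing the validator in keeps the sketch runnable without network access; a fake such as `lambda model, key: key.startswith("sk-")` is enough to exercise both branches.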

### Enterprise Features
- Enterprise-specific code in `enterprise/` directory
- Optional features enabled via environment variables
73 changes: 69 additions & 4 deletions docs/my-website/docs/proxy/docker_quick_start.md
@@ -5,11 +5,76 @@ import TabItem from '@theme/TabItem';
# Getting Started Tutorial

End-to-End tutorial for LiteLLM Proxy to:
- Add an Azure OpenAI model
- Make a successful /chat/completion call
- Generate a virtual key
- Set RPM limit on virtual key
- Add an Azure OpenAI model
- Make a successful /chat/completion call
- Generate a virtual key
- Set RPM limit on virtual key

## Quick Install (Recommended for local / beginners)

New to LiteLLM? This is the easiest way to get started locally. One command installs LiteLLM and walks you through setup interactively — no config files to write by hand.

### 1. Install

```bash
curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh
```

This detects your OS, installs `litellm[proxy]`, and drops you straight into the setup wizard.

Comment on lines +22 to +24

Contributor

P2 Referenced scripts/install.sh does not exist

The install command instructs users to run:

curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh

However, scripts/install.sh does not exist anywhere in this repository. A user following this documentation will hit an HTTP 404: the -f flag makes curl fail without writing a response body, so sh receives empty input and nothing is installed. The only hint is curl's one-line error on stderr (kept visible by -S), which does little to help diagnose the problem.

This entire "Quick Install" section should only be merged after the referenced scripts/install.sh is created and committed.

### 2. Follow the wizard

```
$ litellm --setup

Welcome to LiteLLM

Choose your LLM providers
  ○ 1. OpenAI          GPT-4o, GPT-4o-mini, o1
  ○ 2. Anthropic       Claude Opus, Sonnet, Haiku
  ○ 3. Azure OpenAI    GPT-4o via Azure
  ○ 4. Google Gemini   Gemini 2.0 Flash, 1.5 Pro
  ○ 5. AWS Bedrock     Claude, Llama via AWS
  ○ 6. Ollama          Local models

❯ Provider(s): 1,2

❯ OpenAI API key: sk-...
❯ Anthropic API key: sk-ant-...

❯ Port [4000]:
❯ Master key [auto-generate]:

✔ Config saved → ./litellm_config.yaml

❯ Start the proxy now? (Y/n):
```

The wizard walks you through:
1. Pick your LLM providers (OpenAI, Anthropic, Azure, Bedrock, Gemini, Ollama)
2. Enter API keys for each provider
Comment on lines +30 to +55

Contributor

P2 litellm --setup wizard is not implemented

The docs present a detailed interactive wizard invoked via litellm --setup, but this flag/command does not exist in the litellm CLI (litellm/proxy/proxy_cli.py). Searching the entire codebase finds zero implementations of a --setup argument or an interactive provider wizard.

Users who install the package and then run litellm --setup will get an unrecognized argument error. This documentation should not be published until the wizard is actually implemented.

3. Set a port and master key (or accept the defaults)
4. Config is saved to `./litellm_config.yaml`, and the proxy starts once you confirm the final prompt
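
The steps above end in a generated config file. As a rough sketch of what that generation could look like, assuming the wizard writes a `model_list` plus a `general_settings.master_key`: the `PROVIDER_DEFAULTS` table, the model identifiers, and both function names below are hypothetical and are not taken from the wizard's source. The port is left out on the assumption that it is passed at startup rather than stored in the file.

```python
import secrets
from typing import Dict, List

# Hypothetical mapping from a wizard choice to a model and its env-var name.
PROVIDER_DEFAULTS: Dict[str, Dict[str, str]] = {
    "openai": {"model": "openai/gpt-4o", "env_key": "OPENAI_API_KEY"},
    "anthropic": {
        "model": "anthropic/claude-3-5-sonnet-20240620",
        "env_key": "ANTHROPIC_API_KEY",
    },
}


def generate_master_key() -> str:
    """Return a random key with an `sk-` prefix."""
    return "sk-" + secrets.token_urlsafe(24)


def render_config(providers: List[str], master_key: str) -> str:
    """Render a small YAML config for the chosen providers."""
    lines = ["model_list:"]
    for name in providers:
        entry = PROVIDER_DEFAULTS[name]
        model = entry["model"]
        lines += [
            # Public model name is the part after the provider prefix.
            f"  - model_name: {model.split('/', 1)[1]}",
            "    litellm_params:",
            f"      model: {model}",
            # Reference the key via the environment instead of inlining it.
            f"      api_key: os.environ/{entry['env_key']}",
        ]
    lines += ["general_settings:", f"  master_key: {master_key}"]
    return "\n".join(lines) + "\n"
```

Calling `render_config(["openai", "anthropic"], generate_master_key())` returns text that could be written to `./litellm_config.yaml`. Referencing keys as `os.environ/...` keeps secrets out of the file itself.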

### 3. Make a call

Your proxy is running on `http://0.0.0.0:4000`. Test it:

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <your-master-key>' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
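
For readers who would rather test from Python, the same request can be sketched with only the standard library. The function names `build_chat_request` and `send` are illustrative and are not part of LiteLLM. The URL, headers, and body mirror the curl call above.

```python
import json
import urllib.request
from typing import Any, Dict


def build_chat_request(
    base_url: str, master_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build the POST request for the proxy's /chat/completions route."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {master_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response (needs a running proxy)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the proxy running, `send(build_chat_request("http://0.0.0.0:4000", "<your-master-key>", "gpt-4o", "Hello!"))` should return the decoded completion payload.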

:::tip Already have pip installed?
You can skip the curl install and run `litellm --setup` directly after `pip install 'litellm[proxy]'`.
:::

---

## Pre-Requisites

47 changes: 47 additions & 0 deletions litellm/model_prices_and_context_window_backup.json
@@ -32354,6 +32354,53 @@
        "supports_vision": true,
        "supports_web_search": true
    },
    "xai/grok-4.20-multi-agent-beta-0309": {
        "cache_read_input_token_cost": 2e-07,
        "input_cost_per_token": 2e-06,
        "litellm_provider": "xai",
        "max_input_tokens": 2000000,
        "max_output_tokens": 2000000,
        "max_tokens": 2000000,
        "mode": "chat",
        "output_cost_per_token": 6e-06,
        "source": "https://docs.x.ai/docs/models",
        "supports_function_calling": true,
        "supports_reasoning": true,
        "supports_tool_choice": true,
        "supports_vision": true,
        "supports_web_search": true
    },
    "xai/grok-4.20-beta-0309-reasoning": {
        "cache_read_input_token_cost": 2e-07,
        "input_cost_per_token": 2e-06,
        "litellm_provider": "xai",
        "max_input_tokens": 2000000,
        "max_output_tokens": 2000000,
        "max_tokens": 2000000,
        "mode": "chat",
        "output_cost_per_token": 6e-06,
        "source": "https://docs.x.ai/docs/models",
        "supports_function_calling": true,
        "supports_reasoning": true,
        "supports_tool_choice": true,
        "supports_vision": true,
        "supports_web_search": true
    },
    "xai/grok-4.20-beta-0309-non-reasoning": {
        "cache_read_input_token_cost": 2e-07,
        "input_cost_per_token": 2e-06,
        "litellm_provider": "xai",
        "max_input_tokens": 2000000,
        "max_output_tokens": 2000000,
        "max_tokens": 2000000,
        "mode": "chat",
        "output_cost_per_token": 6e-06,
        "source": "https://docs.x.ai/docs/models",
        "supports_function_calling": true,
        "supports_tool_choice": true,
        "supports_vision": true,
        "supports_web_search": true
    },
    "xai/grok-beta": {
        "input_cost_per_token": 5e-06,
        "litellm_provider": "xai",
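
To make the per-token rates in the new entries concrete, here is a small worked sketch. It assumes the rates are in USD and that cached prompt tokens are billed at `cache_read_input_token_cost` instead of `input_cost_per_token`. `estimate_cost` is a hypothetical helper written for this note, not LiteLLM's own cost calculator.

```python
from typing import Dict

# Rates copied from the grok-4.20 entries above (assumed to be USD per token).
GROK_420_PRICING: Dict[str, float] = {
    "input_cost_per_token": 2e-06,
    "output_cost_per_token": 6e-06,
    "cache_read_input_token_cost": 2e-07,
}


def estimate_cost(
    pricing: Dict[str, float],
    prompt_tokens: int,
    completion_tokens: int,
    cached_tokens: int = 0,
) -> float:
    """Estimate request cost; cached tokens are a subset of prompt tokens."""
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    return (
        uncached * pricing["input_cost_per_token"]
        + cached_tokens * pricing["cache_read_input_token_cost"]
        + completion_tokens * pricing["output_cost_per_token"]
    )
```

Under these assumptions the rates work out to $2 per million input tokens, $0.20 per million cached input tokens, and $6 per million output tokens. A request with 1,000,000 prompt tokens and 100,000 completion tokens comes to about $2.60, or about $1.70 if half the prompt is served from cache.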
15 changes: 14 additions & 1 deletion litellm/proxy/proxy_cli.py
@@ -468,6 +468,12 @@ def _maybe_setup_prometheus_multiproc_dir(
    type=str,
    help="Path to the logging configuration file",
)
@click.option(
    "--setup",
    is_flag=True,
    default=False,
    help="Run the interactive setup wizard to configure providers and generate a config file",
)
@click.option(
    "--version",
    "-v",
@@ -598,6 +604,7 @@ def run_server( # noqa: PLR0915
    num_requests,
    use_queue,
    health,
    setup,
    version,
    run_gunicorn,
    run_hypercorn,
@@ -611,6 +618,12 @@ def run_server( # noqa: PLR0915
    max_requests_before_restart,
    enforce_prisma_migration_check: bool,
):
    if setup:
        from litellm.setup_wizard import run_setup_wizard

        run_setup_wizard()
        return

    args = locals()
    if local:
        from proxy_server import (
@@ -904,7 +917,7 @@ def run_server( # noqa: PLR0915
     # Auto-create PROMETHEUS_MULTIPROC_DIR for multi-worker setups
     ProxyInitializationHelpers._maybe_setup_prometheus_multiproc_dir(
         num_workers=num_workers,
-        litellm_settings=litellm_settings if config else None,
+        litellm_settings=litellm_settings if config else None,  # type: ignore[possibly-unbound]
     )
 
     # --- SEPARATE HEALTH APP LOGIC ---
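
The `--setup` handling in this file follows a common CLI pattern: check the flag before any other work and return early, so none of the server startup code runs. A stripped-down sketch of that pattern is below. It uses the standard library's `argparse` in place of `click` to stay dependency-free, and `run_wizard`, `start_server`, and `main` are hypothetical stand-ins rather than functions from `proxy_cli.py`.

```python
import argparse
from typing import List, Optional


def run_wizard() -> str:
    """Hypothetical stand-in for litellm.setup_wizard.run_setup_wizard."""
    return "wizard"


def start_server(port: int) -> str:
    """Hypothetical stand-in for the normal proxy startup path."""
    return f"server:{port}"


def main(argv: Optional[List[str]] = None) -> str:
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument(
        "--setup", action="store_true", help="Run the setup wizard and exit"
    )
    parser.add_argument("--port", type=int, default=4000)
    args = parser.parse_args(argv)

    if args.setup:
        # Handle the flag first and return, so the startup path below
        # is never reached when the wizard is requested.
        return run_wizard()
    return start_server(args.port)
```

Here `main(["--setup", "--port", "8000"])` takes the wizard branch and ignores the port. The diff additionally defers the wizard import until the branch is taken, which keeps ordinary server startup from paying for it.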