Merged
17 commits
34bb28d
feat(xai): add grok-4.20 beta 2 models with pricing (#23900)
ishaan-jaff Mar 17, 2026
6b2e56f
docs: add Quick Install section for litellm --setup wizard (#23905)
ishaan-jaff Mar 17, 2026
212f29f
feat(setup): interactive setup wizard + install.sh (#23644)
ishaan-jaff Mar 17, 2026
2f7dcba
feat(ui): remove Chat UI page link and banner from sidebar and playgr…
ishaan-jaff Mar 17, 2026
d9a6036
feat(guardrails): MCPJWTSigner - built-in guardrail for zero trust MC…
ishaan-jaff Mar 18, 2026
fde9062
fix(ci): stabilize CI - formatting, type errors, test polling, securi…
cursoragent Mar 18, 2026
05f3ad4
chore: regenerate poetry.lock to sync with pyproject.toml
cursoragent Mar 18, 2026
2c02b68
Merge remote-tracking branch 'origin/main' into litellm_ishaan_march_17
cursoragent Mar 18, 2026
f6d53dc
fix: format merged files from main and regenerate poetry.lock
cursoragent Mar 18, 2026
cffc92b
fix(mypy): annotate jwt_claims as Optional[dict] to fix type incompat…
cursoragent Mar 18, 2026
951ecff
fix(ci): update router region test to use gpt-4.1-mini (fix flaky mod…
cursoragent Mar 18, 2026
399a120
ci: retry flaky logging_testing (async event loop race condition)
cursoragent Mar 18, 2026
233b1d3
fix(ci): aggregate all mock calls in langfuse e2e test to fix race co…
cursoragent Mar 18, 2026
b763c87
fix(ci): black formatting + update OpenAPI compliance tests for spec …
cursoragent Mar 18, 2026
98c1aa8
revert: undo incorrect Black 26.x formatting on litellm_logging.py
cursoragent Mar 18, 2026
83f7481
fix(ci): deduplicate and sort langfuse batch events after aggregation
cursoragent Mar 18, 2026
8c23f9f
Merge branch 'main' into litellm_ishaan_march_17
ishaan-jaff Mar 18, 2026
5 changes: 5 additions & 0 deletions CLAUDE.md
@@ -140,6 +140,11 @@ LiteLLM is a unified interface for 100+ LLM providers with two main components:
- **Check index coverage.** For new or modified queries, check `schema.prisma` for a supporting index. Prefer extending an existing index (e.g. `@@index([a])` → `@@index([a, b])`) over adding a new one, unless it's a `@@unique`. Only add indexes for large/frequent queries.
- **Keep schema files in sync.** Apply schema changes to all `schema.prisma` copies (`schema.prisma`, `litellm/proxy/`, `litellm-proxy-extras/`, `litellm-js/spend-logs/` for SpendLogs) with a migration under `litellm-proxy-extras/litellm_proxy_extras/migrations/`.

### Setup Wizard (`litellm/setup_wizard.py`)
- The wizard is implemented as a single `SetupWizard` class with `@staticmethod` methods — keep it that way. No module-level functions except `run_setup_wizard()` (the public entrypoint) and pure helpers (color, ANSI).
- Use `litellm.utils.check_valid_key(model, api_key)` for credential validation — never roll a custom completion call.
- Do not hardcode provider env-key names or model lists that already exist in the codebase. Add a `test_model` field to each provider entry to drive `check_valid_key`; set it to `None` for providers that can't be validated with a single API key (Azure, Bedrock, Ollama).
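
The rules above can be sketched as a minimal skeleton (illustrative only, not the real `litellm/setup_wizard.py`; `validate_key` and `provider_entry` are hypothetical names, and `check_valid_key` below is a local stub standing in for `litellm.utils.check_valid_key(model, api_key)`):

```python
# Illustrative sketch of the SetupWizard shape described above.
from typing import Optional


def check_valid_key(model: str, api_key: str) -> bool:
    """Stub for litellm.utils.check_valid_key — the real one makes a cheap test call."""
    return bool(model and api_key)


class SetupWizard:
    """All wizard logic lives on one class as @staticmethod methods."""

    @staticmethod
    def validate_key(provider_entry: dict, api_key: str) -> Optional[bool]:
        # test_model=None marks providers (Azure, Bedrock, Ollama) that
        # can't be validated with a single API key -- skip validation.
        test_model = provider_entry.get("test_model")
        if test_model is None:
            return None
        return check_valid_key(test_model, api_key)


def run_setup_wizard() -> None:
    """Public module-level entrypoint (the only one besides pure helpers)."""
```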

### Enterprise Features
- Enterprise-specific code in `enterprise/` directory
- Optional features enabled via environment variables
73 changes: 69 additions & 4 deletions docs/my-website/docs/proxy/docker_quick_start.md
@@ -5,11 +5,76 @@ import Image from '@theme/IdealImage';
# Getting Started Tutorial

End-to-End tutorial for LiteLLM Proxy to:
- Add an Azure OpenAI model
- Make a successful /chat/completion call
- Generate a virtual key
- Set RPM limit on virtual key
- Add an Azure OpenAI model
- Make a successful /chat/completion call
- Generate a virtual key
- Set RPM limit on virtual key

## Quick Install (Recommended for local / beginners)

New to LiteLLM? This is the easiest way to get started locally. One command installs LiteLLM and walks you through setup interactively — no config files to write by hand.

### 1. Install

```bash
curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh
```

This detects your OS, installs `litellm[proxy]`, and drops you straight into the setup wizard.

Comment on lines +22 to +24

P2: Referenced `scripts/install.sh` does not exist

The install command instructs users to run:

curl -fsSL https://raw.githubusercontent.com/BerriAI/litellm/main/scripts/install.sh | sh

However, scripts/install.sh does not exist anywhere in this repository. A user following this documentation will get a curl error (the -f flag causes a silent failure on HTTP 404), resulting in nothing being installed with no useful error message to diagnose the problem.

This entire "Quick Install" section should only be merged after the referenced scripts/install.sh is created and committed.

### 2. Follow the wizard

```
$ litellm --setup

Welcome to LiteLLM

Choose your LLM providers
○ 1. OpenAI GPT-4o, GPT-4o-mini, o1
○ 2. Anthropic Claude Opus, Sonnet, Haiku
○ 3. Azure OpenAI GPT-4o via Azure
○ 4. Google Gemini Gemini 2.0 Flash, 1.5 Pro
○ 5. AWS Bedrock Claude, Llama via AWS
○ 6. Ollama Local models

❯ Provider(s): 1,2

❯ OpenAI API key: sk-...
❯ Anthropic API key: sk-ant-...

❯ Port [4000]:
❯ Master key [auto-generate]:

✔ Config saved → ./litellm_config.yaml

❯ Start the proxy now? (Y/n):
```

The wizard walks you through:
1. Pick your LLM providers (OpenAI, Anthropic, Azure, Bedrock, Gemini, Ollama)
2. Enter API keys for each provider
Comment on lines +30 to +55

P2: `litellm --setup` wizard is not implemented

The docs present a detailed interactive wizard invoked via litellm --setup, but this flag/command does not exist in the litellm CLI (litellm/proxy/proxy_cli.py). Searching the entire codebase finds zero implementations of a --setup argument or an interactive provider wizard.

Users who install the package and then run litellm --setup will get an unrecognized argument error. This documentation should not be published until the wizard is actually implemented.

3. Set a port and master key (or accept the defaults)
4. Config is saved to `./litellm_config.yaml` and the proxy starts immediately
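
For reference, a wizard-generated config might look roughly like this (a sketch following the proxy's usual `model_list` convention — the exact contents depend on your answers, and the `os.environ/` key references and model names here are placeholders):

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-5
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: sk-...   # the key the wizard generated or you entered
```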

### 3. Make a call

Your proxy is running on `http://0.0.0.0:4000`. Test it:

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <your-master-key>' \
-d '{
"model": "gpt-4o",
"messages": [{"role": "user", "content": "Hello!"}]
}'
```
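
The same `/chat/completions` call can be made from Python with only the standard library (a sketch — the base URL and master key are the placeholder values from your local setup, and `chat` is an illustrative helper, not a LiteLLM API):

```python
import json
import urllib.request


def chat(base_url: str, master_key: str, prompt: str) -> dict:
    """POST a /chat/completions request to a locally running proxy."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(
            {
                "model": "gpt-4o",
                "messages": [{"role": "user", "content": prompt}],
            }
        ).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {master_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# e.g. chat("http://0.0.0.0:4000", "<your-master-key>", "Hello!")
```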

:::tip Already have pip installed?
You can skip the curl install and run `litellm --setup` directly after `pip install 'litellm[proxy]'`.
:::

---

## Pre-Requisites

Binary file added docs/my-website/img/mcp_zero_trust_gateway.png
1 change: 1 addition & 0 deletions docs/my-website/sidebars.js
@@ -672,6 +672,7 @@ const sidebars = {
"mcp_control",
"mcp_cost",
"mcp_guardrail",
"mcp_zero_trust",
"mcp_troubleshoot",
]
},
12 changes: 9 additions & 3 deletions litellm/__init__.py
@@ -1465,9 +1465,15 @@ def set_global_gitlab_config(config: Dict[str, Any]) -> None:
from .llms.petals.completion.transformation import PetalsConfig as PetalsConfig
from .llms.ollama.chat.transformation import OllamaChatConfig as OllamaChatConfig
from .llms.ollama.completion.transformation import OllamaConfig as OllamaConfig
from .llms.sagemaker.completion.transformation import SagemakerConfig as SagemakerConfig
from .llms.sagemaker.chat.transformation import SagemakerChatConfig as SagemakerChatConfig
from .llms.sagemaker.nova.transformation import SagemakerNovaConfig as SagemakerNovaConfig
from .llms.sagemaker.completion.transformation import (
SagemakerConfig as SagemakerConfig,
)
from .llms.sagemaker.chat.transformation import (
SagemakerChatConfig as SagemakerChatConfig,
)
from .llms.sagemaker.nova.transformation import (
SagemakerNovaConfig as SagemakerNovaConfig,
)
from .llms.cohere.chat.transformation import CohereChatConfig as CohereChatConfig
from .llms.anthropic.experimental_pass_through.messages.transformation import (
AnthropicMessagesConfig as AnthropicMessagesConfig,
8 changes: 6 additions & 2 deletions litellm/_logging.py
@@ -17,7 +17,9 @@
"`litellm.set_verbose` is deprecated. Please set `os.environ['LITELLM_LOG'] = 'DEBUG'` for debug logs."
)

_ENABLE_SECRET_REDACTION = os.getenv("LITELLM_DISABLE_REDACT_SECRETS", "").lower() != "true"
_ENABLE_SECRET_REDACTION = (
os.getenv("LITELLM_DISABLE_REDACT_SECRETS", "").lower() != "true"
)

_REDACTED = "REDACTED"

@@ -199,7 +201,9 @@ def format(self, record):
json_record[key] = value

if record.exc_info:
json_record["stacktrace"] = record.exc_text or self.formatException(record.exc_info)
json_record["stacktrace"] = record.exc_text or self.formatException(
record.exc_info
)

return safe_dumps(json_record)

8 changes: 6 additions & 2 deletions litellm/cost_calculator.py
@@ -1189,7 +1189,9 @@ def completion_cost(  # noqa: PLR0915
and _usage["prompt_tokens_details"] != {}
and _usage["prompt_tokens_details"]
):
prompt_tokens_details = _usage.get("prompt_tokens_details") or {}
prompt_tokens_details = (
_usage.get("prompt_tokens_details") or {}
)
cache_read_input_tokens = prompt_tokens_details.get(
"cached_tokens", 0
)
@@ -1515,7 +1517,9 @@ def completion_cost(  # noqa: PLR0915
if custom_llm_provider == "azure_ai":
model_for_additional_costs = request_model_for_cost
if completion_response is not None:
hidden_params = getattr(completion_response, "_hidden_params", None) or {}
hidden_params = (
getattr(completion_response, "_hidden_params", None) or {}
)
hidden_model = hidden_params.get("model") or hidden_params.get(
"litellm_model_name"
)
7 changes: 2 additions & 5 deletions litellm/integrations/focus/destinations/factory.py
@@ -59,17 +59,14 @@ def _resolve_config(
return {k: v for k, v in resolved.items() if v is not None}
if provider == "vantage":
resolved = {
"api_key": overrides.get("api_key")
or os.getenv("VANTAGE_API_KEY"),
"api_key": overrides.get("api_key") or os.getenv("VANTAGE_API_KEY"),
"integration_token": overrides.get("integration_token")
or os.getenv("VANTAGE_INTEGRATION_TOKEN"),
"base_url": overrides.get("base_url")
or os.getenv("VANTAGE_BASE_URL", "https://api.vantage.sh"),
}
if not resolved.get("api_key"):
raise ValueError(
"VANTAGE_API_KEY must be provided for Vantage exports"
)
raise ValueError("VANTAGE_API_KEY must be provided for Vantage exports")
if not resolved.get("integration_token"):
raise ValueError(
"VANTAGE_INTEGRATION_TOKEN must be provided for Vantage exports"
6 changes: 3 additions & 3 deletions litellm/integrations/langfuse/langfuse_prompt_management.py
@@ -340,9 +340,9 @@ async def async_log_failure_event(self, kwargs, response_obj, start_time, end_ti
)
status_message = str(kwargs.get("exception", "Unknown error"))
if standard_logging_object is not None:
status_message = standard_logging_object.get(
"error_str", None
) or status_message
status_message = (
standard_logging_object.get("error_str", None) or status_message
)
langfuse_logger_to_use.log_event_on_langfuse(
start_time=start_time,
end_time=end_time,
8 changes: 4 additions & 4 deletions litellm/integrations/vantage/vantage_logger.py
@@ -83,7 +83,9 @@ def __init__(

verbose_logger.debug(
"VantageLogger initialized (integration_token=%s)",
resolved_token[:4] + "***" if resolved_token and len(resolved_token) > 4 else "***",
resolved_token[:4] + "***"
if resolved_token and len(resolved_token) > 4
else "***",
)

async def initialize_focus_export_job(self) -> None:
@@ -128,9 +130,7 @@ async def init_vantage_background_job(
callback_type=VantageLogger
)
if not vantage_loggers:
verbose_logger.debug(
"No Vantage logger registered; skipping scheduler"
)
verbose_logger.debug("No Vantage logger registered; skipping scheduler")
return

vantage_logger = cast(VantageLogger, vantage_loggers[0])
5 changes: 3 additions & 2 deletions litellm/litellm_core_utils/default_encoding.py
@@ -26,7 +26,9 @@
else:
cache_dir = filename

os.environ["TIKTOKEN_CACHE_DIR"] = cache_dir # use local copy of tiktoken b/c of - https://github.com/BerriAI/litellm/issues/1071
os.environ[
"TIKTOKEN_CACHE_DIR"
] = cache_dir # use local copy of tiktoken b/c of - https://github.com/BerriAI/litellm/issues/1071

import tiktoken
import time
@@ -48,4 +50,3 @@
# Exponential backoff with jitter to reduce collision probability
delay = _retry_delay * (2**attempt) + random.uniform(0, 0.1)
time.sleep(delay)
