diff --git a/docs/examples/README.md b/docs/examples/README.md
index e4cd1c87f4ea..9d6126a65c41 100644
--- a/docs/examples/README.md
+++ b/docs/examples/README.md
@@ -1,7 +1,17 @@
 # Examples
 
-vLLM's examples are split into three categories:
+vLLM's examples are organized into the following categories:
 
-- If you are using vLLM from within Python code, see the [Offline Inference](.) section.
-- If you are using vLLM from an HTTP application or client, see the [Online Serving](.) section.
-- For examples of using some of vLLM's advanced features (e.g. LMCache or Tensorizer) which are not specific to either of the above use cases, see the [Others](.) section.
+- **[`basic/`](../../examples/basic)** – Minimal examples for offline inference and online serving.
+- **[`generate/`](../../examples/generate)** – Text generation examples, including multimodal models.
+- **[`pooling/`](../../examples/pooling)** – Examples for embedding, classification, scoring, reward, etc.
+- **[`speech_to_text/`](../../examples/speech_to_text)** – Speech transcription, translation and real-time audio examples.
+- **[`features/`](../../examples/features)** – Demonstrations of individual vLLM features: automatic prefix caching, speculative decoding, LoRA, structured outputs, prompt embedding, pause/resume, batch invariance, KV events, data parallelism, and more.
+- **[`reasoning/`](../../examples/reasoning)** – Examples for reasoning with vLLM.
+- **[`tool_calling/`](../../examples/tool_calling)** – Examples for function/tool calling with vLLM.
+- **[`applications/`](../../examples/applications)** – Application examples such as chatbots and RAG (Retrieval-Augmented Generation).
+- **[`rl/`](../../examples/rl)** – Reinforcement learning examples.
+- **[`deployment/`](../../examples/deployment)** – Examples for deploying vLLM in production.
+- **[`ray_serving/`](../../examples/ray_serving)** – Scalable serving using Ray.
+- **[`disaggregated/`](../../examples/disaggregated)** – Examples for disaggregated serving (separate prefill and decode), including various kv cache connectors (LMCache, Mooncake, FlexKV, P2P NCCL) and failure recovery.
+- **[`observability/`](../../examples/observability)** – Metrics, logging, tracing (OpenTelemetry), and dashboards (Grafana, Perses).