Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 0 additions & 51 deletions .github/workflows/docs.yml

This file was deleted.

2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -26,4 +26,4 @@ repos:
rev: 7.1.1
hooks:
- id: flake8
args: ["--max-line-length=88", "--extend-ignore=E203,W503"]
args: ["--max-line-length=160", "--extend-ignore=E203,W503"]
10 changes: 1 addition & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -100,15 +100,7 @@ uv pip install -e .

## Run examples (Qwen2.5-omni)

Get into the example folder
```bash
cd examples/offline_inference/qwen_2_5_omni
```
Modify PYTHONPATH in run.sh as your path of vllm_omni. Then run.
```bash
bash run.sh
```
The output audio is saved in ./output_audio
Please check the folder of [examples](examples)

## Further details

Expand Down
6 changes: 3 additions & 3 deletions docs/DOCS_GUIDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,15 +40,15 @@ Example docstring:
```python
class OmniLLM:
"""Main entry point for vLLM-omni inference.

This class provides a high-level interface for running multi-modal
inference with non-autoregressive models.

Args:
model: Model name or path
stage_configs: Optional stage configurations
**kwargs: Additional arguments passed to the engine

Example:
>>> llm = OmniLLM(model="Qwen/Qwen2.5-Omni")
>>> outputs = llm.generate(prompts="Hello")
Expand Down
1 change: 0 additions & 1 deletion docs/api/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,4 +66,3 @@ Worker classes and model runners for distributed inference.
- [vllm_omni.worker.gpu_model_runner.OmniGPUModelRunner][]
- [vllm_omni.worker.gpu_ar_model_runner.GPUARModelRunner][]
- [vllm_omni.worker.gpu_diffusion_model_runner.GPUDiffusionModelRunner][]

2 changes: 1 addition & 1 deletion docs/contributing/design_documents/api_design_doc.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

This document contains comprehensive API design documentation for all core modules in vLLM-omni. These templates provide a standardized structure for designing and implementing the core, engine, executor, and worker modules.

## 📋 Module API
## 📋 Module API

### Core Module API
**Core module** provides fundamental scheduling, caching, and resource management functionality.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Architecture Overview
# Architecture Overview

# Introduction

Expand Down
1 change: 0 additions & 1 deletion docs/contributing/design_documents/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,3 @@ This section contains design documents and architecture specifications for vLLM-
## API Design Documentation

- [vLLM-omni API Documentation](api_design_doc.md)

3 changes: 1 addition & 2 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,5 @@

## <span class="twemoji">📚</span> Documentation Navigation

- To run open-source models on vLLM-Omni, we recommend starting with the [:material-code-tags: User Quide](user_guide/getting_started/quickstart.md)
- To run open-source models on vLLM-Omni, we recommend starting with the [:material-code-tags: User Quide](user_guide/getting_started/quickstart.md)
- To develop and contribute to vLLM-Omni, we recommend starting with the [:material-tools: Developer Guide](contributing/README.md)

21 changes: 15 additions & 6 deletions docs/mkdocs/hooks/generate_examples.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,11 +103,20 @@ def determine_other_files(self) -> list[Path]:
if self.path.is_file():
return []
# Text file extensions that can be safely included
text_extensions = {".py", ".md", ".sh", ".yaml", ".yml", ".json", ".txt", ".toml", ".cfg", ".ini"}
is_other_file = lambda file: (
file.is_file()
and file != self.main_file
and file.suffix in text_extensions
text_extensions = {
".py",
".md",
".sh",
".yaml",
".yml",
".json",
".txt",
".toml",
".cfg",
".ini",
}
is_other_file = lambda file: ( # noqa: E731
file.is_file() and file != self.main_file and file.suffix in text_extensions
)
return [file for file in self.path.rglob("*") if is_other_file(file)]

Expand Down Expand Up @@ -172,7 +181,7 @@ def generate(self) -> str:
f"{code_fence}\n"
)
else:
with open(self.main_file) as f:
with open(self.main_file, encoding="utf-8") as f:
# Skip the title from md snippets as it's been included above
main_content = f.readlines()[1:]
content += self.fix_relative_links("".join(main_content))
Expand Down
1 change: 0 additions & 1 deletion docs/mkdocs/stylesheets/extra.css
Original file line number Diff line number Diff line change
Expand Up @@ -53,4 +53,3 @@ body[data-md-color-scheme="slate"] .md-nav__item--section > label.md-nav__link .
.md-nav__item--section:has([href*="api/vllm_omni/index"]) > .md-nav > .md-nav__list > .md-nav__item--nested > .md-nav > .md-nav__list > .md-nav__item--nested.md-nav__item--active > .md-nav {
display: block;
}

1 change: 0 additions & 1 deletion docs/user_guide/examples/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,4 +6,3 @@ vLLM-omni's examples are split into two categories:
- If you are using vLLM-omni from an HTTP application or client, see the *Online Serving* section.

For detailed example documentation, check the [examples directory](https://github.com/vllm-project/vllm-omni/tree/main/examples) in the repository.

64 changes: 64 additions & 0 deletions docs/user_guide/examples/offline_inference/qwen2_5_omni.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
# Offline Example of vLLM-omni for Qwen2.5-omni

Source <https://github.com/vllm-project/vllm-omni/tree/main/examples\offline_inference\qwen2_5_omni>.


## 🛠️ Installation

Please refer to [README.md](https://github.com/vllm-project/vllm-omni/tree/main/README.md)

## Run examples (Qwen2.5-omni)
### Multiple Prompts
Download dataset from [seed_tts](https://drive.google.com/file/d/1GlSjVfSHkW3-leKKBlfrjuuTGqQ_xaLP/edit). To get the prompt, you can:
```bash
tar -xf <Your Download Path>/seedtts_testset.tar
cp seedtts_testset/en/meta.lst examples/offline_inference/qwen2_5_omni/meta.lst
python3 examples/offline_inference/qwen2_5_omni/extract_prompts.py \
--input examples/offline_inference/qwen2_5_omni/meta.lst \
--output examples/offline_inference/qwen2_5_omni/top100.txt \
--topk 100
```
Get into the example folder
```bash
cd examples/offline_inference/qwen2_5_omni
```
Then run the command below.
```bash
bash run_multiple_prompts.sh
```
### Single Prompts
Get into the example folder
```bash
cd examples/offline_inference/qwen2_5_omni
```
Then run the command below.
```bash
bash run_single_prompt.sh
```

## Example materials

??? abstract "end2end.py"
``````py
--8<-- "examples\offline_inference\qwen2_5_omni\end2end.py"
``````
??? abstract "extract_prompts.py"
``````py
--8<-- "examples\offline_inference\qwen2_5_omni\extract_prompts.py"
``````
??? abstract "processing_omni.py"
``````py
--8<-- "examples\offline_inference\qwen2_5_omni\processing_omni.py"
``````
??? abstract "run_multiple_prompts.sh"
``````sh
--8<-- "examples\offline_inference\qwen2_5_omni\run_multiple_prompts.sh"
``````
??? abstract "run_single_prompt.sh"
``````sh
--8<-- "examples\offline_inference\qwen2_5_omni\run_single_prompt.sh"
``````
??? abstract "utils.py"
``````py
--8<-- "examples\offline_inference\qwen2_5_omni\utils.py"
``````
35 changes: 35 additions & 0 deletions docs/user_guide/examples/online_serving/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
# Online serving Example of vLLM-omni for Qwen2.5-omni

Source <https://github.com/vllm-project/vllm-omni/blob/main/examples\online_serving\README.md>.


## 🛠️ Installation

Please refer to [README.md](https://github.com/vllm-project/vllm-omni/blob/main/README.md)

## Run examples (Qwen2.5-omni)

Launch the server
```bash
vllm serve Qwen/Qwen2.5-Omni-7B --omni --port 8091
```

If you have custom stage configs file, launch the server with command below
```bash
vllm serve Qwen/Qwen2.5-Omni-7B --omni --port 8091 --stage-configs-path /path/to/stage_configs_file
```

Get into the example folder
```bash
cd examples/online_serving
```

Send request via python
```bash
python openai_chat_completion_client_for_multimodal_generation.py
```

Send request via curl
```bash
bash run_curl_multimodal_generation.sh
```
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
# OpenAI Chat Completion Client For Multimodal Generation

Source <https://github.com/vllm-project/vllm-omni/blob/main/examples\online_serving\openai_chat_completion_client_for_multimodal_generation.py>.

``````py
--8<-- "examples\online_serving\openai_chat_completion_client_for_multimodal_generation.py"
``````
1 change: 0 additions & 1 deletion docs/user_guide/getting_started/quickstart.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,4 +53,3 @@ The output audio is saved in ./output_audio
- Read the [architecture documentation](../../contributing/design_documents/vllm_omni_design.md)
- Check out the [API reference](../../api/overview.md)
- Explore the [examples](../examples/index.md)

34 changes: 34 additions & 0 deletions examples/offline_inference/qwen2_5_omni/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# Offline Example of vLLM-omni for Qwen2.5-omni

## 🛠️ Installation

Please refer to [README.md](../../../README.md)

## Run examples (Qwen2.5-omni)
### Multiple Prompts
Download dataset from [seed_tts](https://drive.google.com/file/d/1GlSjVfSHkW3-leKKBlfrjuuTGqQ_xaLP/edit). To get the prompt, you can:
```bash
tar -xf <Your Download Path>/seedtts_testset.tar
cp seedtts_testset/en/meta.lst examples/offline_inference/qwen2_5_omni/meta.lst
python3 examples/offline_inference/qwen2_5_omni/extract_prompts.py \
--input examples/offline_inference/qwen2_5_omni/meta.lst \
--output examples/offline_inference/qwen2_5_omni/top100.txt \
--topk 100
```
Get into the example folder
```bash
cd examples/offline_inference/qwen2_5_omni
```
Then run the command below.
```bash
bash run_multiple_prompts.sh
```
### Single Prompts
Get into the example folder
```bash
cd examples/offline_inference/qwen2_5_omni
```
Then run the command below.
```bash
bash run_single_prompt.sh
```
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,8 @@
import soundfile as sf
import torch
from utils import make_omni_prompt
from vllm.sampling_params import SamplingParams

from vllm.sampling_params import SamplingParams
from vllm_omni.entrypoints.omni_llm import OmniLLM

_os_env_toggle.environ["VLLM_USE_V1"] = "1"
Expand Down Expand Up @@ -109,10 +109,14 @@ def parse_args():
parser.add_argument("--use-torchvision", action="store_true")
parser.add_argument("--tokenize", action="store_true")
parser.add_argument(
"--output-wav", default="output.wav", help="[Deprecated] Output wav directory (use --output-dir)."
"--output-wav",
default="output.wav",
help="[Deprecated] Output wav directory (use --output-dir).",
)
parser.add_argument(
"--output-dir", default="outputs", help="Output directory to save text and wav files together."
"--output-dir",
default="outputs",
help="Output directory to save text and wav files together.",
)
parser.add_argument(
"--thinker-hidden-states-dir",
Expand Down Expand Up @@ -168,7 +172,9 @@ def main():
raise

if args.prompts is None:
raise ValueError("No prompts provided. Use --prompts ... or --txt-prompts <file.txt> (with --prompt_type text)")
raise ValueError(
"No prompts provided. Use --prompts ... or --txt-prompts <file.txt> (with --prompt_type text)"
)
omni_llm = OmniLLM(
model=model_name,
log_stats=args.enable_stats,
Expand Down Expand Up @@ -217,7 +223,9 @@ def main():
omni_outputs = omni_llm.generate(prompt, sampling_params_list)

# Determine output directory: prefer --output-dir; fallback to --output-wav
output_dir = args.output_dir if getattr(args, "output_dir", None) else args.output_wav
output_dir = (
args.output_dir if getattr(args, "output_dir", None) else args.output_wav
)
os.makedirs(output_dir, exist_ok=True)
for stage_outputs in omni_outputs:
if stage_outputs.final_output_type == "text":
Expand Down
Loading