
DynaPipe: Dynamic Layer Redistribution for Efficient Serving of LLMs with Pipeline Parallelism

What is DynaPipe?

DynaPipe targets a critical yet underexplored issue in pipeline-parallel LLM serving: the inter-stage pipeline bubbles introduced by sampling operations. To address it, we propose DynaPipe, a dynamic layer redistribution scheme that runs at serving time. By adaptively adjusting the computational load of each pipeline stage, DynaPipe keeps the stages balanced, which aligns the pipeline and mitigates inter-stage imbalance. Compared with state-of-the-art pipeline inference frameworks, DynaPipe achieves notable performance gains.
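To make the idea of layer redistribution concrete, here is a minimal sketch, not DynaPipe's actual algorithm. It assumes that every layer costs the same `layer_time` and that only the last stage pays an extra `sampling_time`; the function names and the greedy strategy are hypothetical and chosen only for illustration.

```python
def redistribute_layers(num_layers, num_stages, layer_time, sampling_time):
    """Greedy sketch: assign layers so that stage times are as even as possible.

    Assumption (hypothetical cost model): each layer takes `layer_time`, and
    only the last stage pays the additional `sampling_time`.
    """
    if num_layers < num_stages:
        raise ValueError("need at least one layer per stage")
    # Start with one layer per stage so that no stage is empty.
    layers = [1] * num_stages
    # Fixed per-stage overhead: sampling runs on the last stage only.
    extra = [0.0] * (num_stages - 1) + [sampling_time]
    for _ in range(num_layers - num_stages):
        # Give the next layer to the stage that currently finishes first.
        stage = min(
            range(num_stages),
            key=lambda i: layers[i] * layer_time + extra[i],
        )
        layers[stage] += 1
    return layers


def stage_times(layers, layer_time, sampling_time):
    """Per-stage time under the same hypothetical cost model."""
    times = [n * layer_time for n in layers]
    times[-1] += sampling_time
    return times


# Even split of 32 layers over 4 stages: the last stage is the bottleneck.
even = [8, 8, 8, 8]
print(stage_times(even, 1.0, 4.0))      # [8.0, 8.0, 8.0, 12.0]

# Redistributed split: the last stage holds fewer layers to absorb sampling.
balanced = redistribute_layers(32, 4, 1.0, 4.0)
print(balanced)                          # [9, 9, 9, 5]
print(stage_times(balanced, 1.0, 4.0))  # [9.0, 9.0, 9.0, 9.0]
```

In this toy setting the slowest stage drops from 12 to 9 time units, which is the kind of inter-stage imbalance that redistribution is meant to remove. A real system would measure stage times at runtime and would also account for the cost of moving layers between stages.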

Install DynaPipe

pip install --verbose -e .

Launch online serving

# To enable prefix caching, add "--enable-prefix-caching"
# To enable pipeline parallelism, add "--pp $PP_DEGREE"
python -m gllm.entrypoints.api_server --port $PORT --model-path $MODEL_PATH --enable-adjust-layers

Online benchmark with gllm or vllm

python benchmarks/benchmark_serving.py --backend $BACKEND --model $MODEL \
        --dataset-name $DATASET_NAME --dataset-path $DATASET_PATH \
        --num-prompts $NUM_PROMPTS --port $PORT --trust-remote-code \
        --request-rate $REQUEST_RATE

Acknowledgements

This project builds upon the foundational work of gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling.
