sgl-project/sgl-learning-materials
Materials for learning SGLang

Please join our Slack channel at https://slack.sglang.ai. For enterprises interested in adopting or deploying SGLang at scale, including technical consulting, sponsorship opportunities, or partnership inquiries, please contact us at [email protected].

Announcement

February 2025

The SGLang team is honored to announce that the following well-known companies and teams, among others, have adopted SGLang for running DeepSeek V3 and R1: AMD, NVIDIA, Microsoft Azure, Baseten, Novita AI, ByteDance Volcengine, DataCrunch, Hyperbolic, Vultr, and RunPod.

December 2024

🎉 Through dedicated efforts from July to December 2024, the SGLang team has achieved significant milestones with three major releases: v0.2, v0.3, and v0.4. For detailed optimization insights, please refer to our corresponding blog posts.

🚀 We're proud to announce that SGLang has been adopted as:

  • The dominant LLM engine by AMD
  • The default LLM engine for xAI

For more information, please check out AMD's ROCm 6.3 official announcement and xAI's presentation at the AMD Advancing AI Conference 2024.

Blog

LMSYS Org

[2024-12-04] SGLang v0.4: Zero-Overhead Batch Scheduler, Cache-Aware Load Balancer, Faster Structured Outputs

[2024-09-04] SGLang v0.3 Release: 7x Faster DeepSeek MLA, 1.5x Faster torch.compile, Multi-Image/Video LLaVA-OneVision

[2024-07-25] Achieving Faster Open-Source Llama3 Serving with SGLang Runtime (vs. TensorRT-LLM, vLLM)

[2024-02-05] Fast JSON Decoding for Local LLMs with Compressed Finite State Machine

[2024-01-17] Fast and Expressive LLM Inference with RadixAttention and SGLang

AMD

[2024-11-13] SGLang: Fast Serving Framework for Large Language and Vision-Language Models on AMD GPUs

PyTorch

[2025-01-21] Accelerating LLM Inference with GemLite, TorchAO and SGLang

Slides

Hyperbolic in-person meetup

[2025-01-15] Efficient LLM Inference with SGLang

[2025-01-15] Cache-Aware Load Balancer in SGLang

[2025-01-15] SGLang DeepSeek Model Optimizations

CAMEL-AI Hackathon: Mastering Multi-Agent Systems

[2024-12-21] SGLang v0.4 Optimization

GPU MODE

[2024-11-10] SGLang Performance Optimization

The first LMSYS online meetup: Efficient LLM Deployment and Serving

[2024-10-16] SGLang Overview & CPU Overhead Hiding

[2024-10-16] Faster Constrained Decoding

[2024-10-16] SGLang DeepSeek MLA

[2024-10-16] Universal LLM deployment and low-latency serving in MLC LLM

[2024-10-16] XGrammar: Flexible And Efficient Structured Generation Engine for Large Language Models

[2024-10-16] Review of the first LMSYS online meetup: Efficient LLM Deployment and Serving

AMD Advancing AI 2024

[2024-10-10] Efficient LLM Inference with SGLang

SGLang Biweekly Meeting

[2025-01-25] A fair and efficient scheduling algorithm

[2024-11-30] Update Weights From Distributed

[2024-11-16] SGLang Router and Side-Channel KV Cache Attack

[2024-11-02] Quantization on AMD

[2024-10-05] SGLang Double Sparsity

[2024-09-21] SGLang DeepSeek MLA

Other

SGLang v0.2: Faster Interface and Runtime for LLM Inference

Videos

You are welcome to subscribe to our YouTube channel.

GPU MODE

[2024-11-10] SGLang Performance Optimization

The first LMSYS online meetup

[2024-10-16] The First SGLang Online Meetup

AMD Advancing AI 2024

[2024-10-10] Efficient LLM Inference with SGLang

SGLang Biweekly Meeting

[2025-01-25] SGLang Developer Sync 20250125

[2024-12-28] SGLang Developer Sync 20241228

[2024-12-14] SGLang Developer Sync 20241214

[2024-11-30] SGLang Developer Sync 20241130

[2024-11-16] SGLang Developer Sync 20241116

[2024-11-03] SGLang Developer Sync 20241103

[2024-10-19] SGLang Developer Sync 20241019

[2024-10-05] SGLang Developer Sync 20241005

[2024-09-21] SGLang Developer Sync 20240921

Paper

[NeurIPS 24] SGLang: Efficient Execution of Structured Language Model Programs

Documentation

SGLang Documentation
