- I am a software engineer on the Model Performance Team at Baseten. Previously, I worked at Meituan, using TensorFlow and TensorRT for CTR GPU inference and PyTorch for LLM GPU inference.
- Open Source: Team Member at LMSYS Org, working on SGLang, and a committer for both FlashInfer and LMDeploy.
- If you're interested in learning more about my experience and SGLang, I recommend checking out my talk on SGLang at GPU MODE.
- How to reach me: [email protected] or Telegram
- Learn more about my work experience: LinkedIn
Pinned:
- sgl-project/sglang — SGLang is a fast serving framework for large language models and vision language models.
- flashinfer-ai/flashinfer — FlashInfer: a kernel library for LLM serving.
- InternLM/lmdeploy — LMDeploy is a toolkit for compressing, deploying, and serving LLMs.