🎯
#pragma unroll
- Guangzhou, China
- (UTC +08:00)
- https://github.com/xlite-dev
Pinned
- xlite-dev/LeetCUDA (Public): 📚 LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners 🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA. 🎉
- xlite-dev/lite.ai.toolkit (Public): 🛠 A lite C++ AI toolkit: 100+ models with MNN, ORT and TRT, including Det, Seg, Stable-Diffusion, Face-Fusion, etc. 🎉
- xlite-dev/Awesome-LLM-Inference (Public): 📚 A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉
- vllm-project/vllm (Public): A high-throughput and memory-efficient inference and serving engine for LLMs
- vipshop/cache-dit (Public): A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗 DiTs.
- xlite-dev/ffpa-attn (Public): 🤖 FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑ 🎉 vs SDPA EA.




