View kentang-mit's full-sized avatar
Python 138 3 Updated Dec 17, 2024

A suite of image and video neural tokenizers

Jupyter Notebook 1,557 68 Updated Feb 11, 2025

HLS-based framework to accelerate the implementation of 2-D DP kernels on FPGA

C++ 7 1 Updated Dec 29, 2024

[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Python 224 5 Updated Jan 22, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 3,379 203 Updated Feb 12, 2025

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Python 417 20 Updated Oct 16, 2024

A sparse attention kernel supporting mixed sparse patterns

C++ 111 3 Updated Feb 13, 2025

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,281 69 Updated Sep 27, 2024

Python 142 7 Updated Jul 12, 2024

[ICML 2024] LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

Python 63 8 Updated May 31, 2024

Code for the paper DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents, ICML 2024

Python 80 2 Updated Jun 12, 2024

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda 244 24 Updated Nov 22, 2024

SEED-Voken: A Series of Powerful Visual Tokenizers

Python 829 32 Updated Feb 14, 2025

[NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning

Python 68 4 Updated Feb 11, 2025

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Python 306 7 Updated Nov 17, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,571 70 Updated Aug 15, 2024

Tile primitives for speedy kernels

Cuda 2,039 113 Updated Feb 18, 2025

(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis

Python 653 34 Updated Jan 21, 2025

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Jupyter Notebook 561 30 Updated Oct 6, 2024

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Python 1,497 152 Updated Oct 28, 2024

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 327 26 Updated Feb 14, 2025

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python 496 31 Updated Feb 16, 2025

PyTorch emulation library for Microscaling (MX)-compatible data formats

Python 198 30 Updated Sep 23, 2024

Jupyter Notebook 917 102 Updated Apr 29, 2024

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,589 431 Updated Jan 12, 2025

Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference of large language models.

Python 339 30 Updated Nov 26, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,241 281 Updated May 4, 2024

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 2,905 234 Updated Feb 10, 2025

Microsoft Collective Communication Library

C++ 332 31 Updated Sep 20, 2023

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Python 768 45 Updated Jul 29, 2024