Stars
A suite of image and video neural tokenizers
HLS-based framework to accelerate the implementation of 2-D DP kernels on FPGA
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
A sparse attention kernel supporting mixed sparse patterns
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
[ICML 2024] LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
Code for the paper DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents, ICML 2024
[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
SEED-Voken: A Series of Powerful Visual Tokenizers
[NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Tile primitives for speedy kernels
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Model Compression Toolbox for Large Language Models and Diffusion Models
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
PyTorch emulation library for Microscaling (MX)-compatible data formats
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference for large language models
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)