OpenMMLab Detection Toolbox and Benchmark
pix2tex: Using a ViT to convert images of equations into LaTeX code.
This repository contains demos I made with the Transformers library by HuggingFace.
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
SwinIR: Image Restoration Using Swin Transformer (official repository)
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
OpenMMLab Pre-training Toolbox and Benchmark
Efficient vision foundation models for high-resolution generation and perception.
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
EVA Series: Visual Representation Fantasies from BAAI
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
An all-in-one toolkit for computer vision
This is a collection of our NAS and Vision Transformer work.
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
Get clean data from tricky documents, powered by vision-language models ⚡
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
[ICCV 2021] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet