vit

Here are 111 public repositories matching this topic...

lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

python machine-learning ocr latex deep-learning image-processing pytorch dataset transformer vit image2text im2text im2latex im2markup math-ocr vision-transformer latex-ocr

Updated Jul 5, 2024
Python

towhee-io / towhee

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

machine-learning computer-vision pipeline image-processing embeddings transformer video-processing feature-extraction convolutional-networks vit feature-vector image-retrieval unstructured-data embedding-vectors milvus vision-transformer towhee llm

Updated Jan 20, 2024
Python

BR-IDL / PaddleViT

🤖 PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

computer-vision deep-learning detection cv transformer gan classification segmentation object-detection mlp vit semantic-segmentation encoder-decoder paddlepaddle

Updated Sep 7, 2022
Python

roboflow / inference

A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.

Updated Jul 5, 2024
Python

sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

Updated Jul 2, 2024
Python

open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks

computer-vision evaluation pytorch gemini openai vqa vit gpt multi-modal clip claude openai-api gpt4 large-language-models llm chatgpt llava qwen gpt-4v

Updated Jul 7, 2024
Python

chinhsuanwu / mobilevit-pytorch

A PyTorch implementation of "MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer"

vit mobilenetv2 vision-transformer mobilevit

Updated Jan 16, 2022
Python

v-iashin / video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

Updated May 2, 2024
Python

gupta-abhay / pytorch-vit

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

transformers image-classification image-recognition vit vision-transformer hybrid-vit

Updated Oct 1, 2021
Python

PaddlePaddle / PASSL

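PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2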

deep-learning vit clip paddle pvt mae moco self-supervised-learning cvt simclr beit vision-transformer deit pixpro moco-v2 swav swin-transformer xcit convnext

Updated Aug 1, 2023
Python

megvii-research / RevCol

Official code for the papers "Reversible Column Networks" and "RevColv2"

computer-vision cnn pytorch transformer vit mae iclr2023

Updated Sep 6, 2023
Python

i. A practical application of the Transformer (ViT) to 2-D physiological signal (EEG) classification tasks; it could also be tried with EMG, EOG, ECG, etc. ii. Includes attention over the spatial dimension (channel attention) and the temporal dimension. iii. Common spatial pattern (CSP), an efficient feature enhancement method, implemented in Python.

deep-learning eeg transformer attention vit attention-mechanism physiological-signals common-spatial-pattern eeg-classification

Updated Mar 23, 2023
Python

implus / mae_segmentation

Reproduction of semantic segmentation using a masked autoencoder (MAE)

vit semantic-segmentation mae self-supervised-learning masked-autoencoder vision-transformer

Updated Feb 3, 2022
Python

yaoxiaoyuan / mimix

Mimix: A Text Generation Tool and Pretrained Chinese Models

Updated May 30, 2024
Python

kyegomez / NaViT

My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"

vit attention-mechanism clip multimodality multimodal-learning multimodal multimodal-deep-learning gpt4

Updated Jun 17, 2024
Python

PaddlePaddle / PLSC

Paddle Large Scale Classification Tools, supporting ArcFace, CosFace, PartialFC, and Data Parallel + Model Parallel. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.