
Try OV-DINO, a more powerful open-vocabulary detector. #172

Open · wanghao9610 opened this issue Jul 30, 2024 · 2 comments

wanghao9610 commented Jul 30, 2024

Thanks for the awesome GLIP! I'd like to share our recent work, 🦖OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion.

  • OV-DINO is a novel unified open-vocabulary detection approach that offers superior performance and effectiveness for practical real-world applications.

  • OV-DINO introduces a Unified Data Integration pipeline that integrates diverse data sources for end-to-end pre-training, and a Language-Aware Selective Fusion module that improves the model's vision-language understanding (a conceptual sketch follows this list).

  • OV-DINO shows significant performance improvements on the COCO and LVIS benchmarks compared to previous methods, achieving relative improvements of +4.3% AP on COCO and +14.1% AP on LVIS over GLIP in zero-shot evaluation.
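
For readers curious what "selective fusion" means concretely, here is a minimal conceptual sketch in PyTorch. This is an illustration under assumed shapes, not the actual OV-DINO implementation; the class name, the top-k selection rule, and the single cross-attention layer are all simplifications, so please refer to the paper and repository for the real module.

```python
import torch
import torch.nn as nn

class SelectiveFusionSketch(nn.Module):
    """Conceptual sketch only, NOT OV-DINO's actual LASF module.

    Idea: score each category-prompt embedding against the current object
    queries, keep only the most relevant prompts, and fuse them back into
    the queries with cross-attention plus a residual connection.
    """

    def __init__(self, dim: int = 256, top_k: int = 4, num_heads: int = 8):
        super().__init__()
        self.top_k = top_k
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, queries: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        # queries:     (B, Q, D) object queries from the detector decoder
        # text_embeds: (B, T, D) embeddings of the category prompts
        sim = queries @ text_embeds.transpose(1, 2)   # (B, Q, T) query-text similarity
        relevance = sim.max(dim=1).values             # (B, T) best score per prompt
        k = min(self.top_k, text_embeds.size(1))
        idx = relevance.topk(k, dim=1).indices        # (B, k) most relevant prompts
        idx = idx.unsqueeze(-1).expand(-1, -1, text_embeds.size(-1))
        selected = text_embeds.gather(1, idx)         # (B, k, D) selected prompts
        fused, _ = self.cross_attn(queries, selected, selected)
        return queries + fused                        # residual fusion into the queries


# Quick shape check with random tensors:
fusion = SelectiveFusionSketch()
q, t = torch.randn(2, 100, 256), torch.randn(2, 20, 256)
print(fusion(q, t).shape)  # torch.Size([2, 100, 256])
```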

We have released the evaluation, fine-tuning, and demo code in our project; feel free to try our model in your applications.

Project: https://wanghao9610.github.io/OV-DINO

Paper: https://arxiv.org/abs/2407.07844

Code: https://github.com/wanghao9610/OV-DINO

Demo: http://47.115.200.157:7860/

Everyone is welcome to try our model, and feel free to raise an issue if you encounter any problems.
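
For a quick feel of the zero-shot workflow before cloning the repo, here is a rough sketch of what open-vocabulary inference looks like. Everything here is illustrative: the `ovdino` package name, `load_model`, and `predict` are hypothetical placeholders rather than the real OV-DINO API, and the repository README documents the actual evaluation and demo entry points.

```python
# Hypothetical usage sketch; NOT the real OV-DINO API.
# `ovdino`, `load_model`, and `predict` are illustrative placeholders;
# consult the OV-DINO repository README for the actual entry points.
from PIL import Image

import ovdino  # hypothetical package name

# Category names are free-form text: the detector matches image regions
# against these prompts rather than a fixed, closed label set.
categories = ["dog", "frisbee", "park bench"]

model = ovdino.load_model("ovdino_checkpoint.pth")  # hypothetical checkpoint
image = Image.open("example.jpg").convert("RGB")

# Each detection pairs a box with its best-matching category prompt.
for det in model.predict(image, categories, score_threshold=0.3):
    print(det.category, det.score, det.box)  # e.g. "dog" 0.87 [x0, y0, x1, y1]
```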


crazness commented Aug 1, 2024

How much data did you use to train the model, and what is the number of parameters?

wanghao9610 (Author) commented

@crazness OV-DINO is pre-trained on diverse data sources within a unified framework, including the O365, GoldG, and CC1M‡ datasets. The O365 and GoldG datasets are the same as those used by GLIP; CC1M‡ contains only 1M image-text pairs, far fewer than GLIP's Cap4M / Cap24M, yet OV-DINO achieves better performance. OV-DINO has 166M parameters, while GLIP has 232M. You can find more details in our paper.
