fix: latest colpali engine models results #238
isaac-chung merged 1 commit into embeddings-benchmark:main from
Model Results Comparison
Reference models:
Results for nomic-ai/colnomic-embed-multimodal-3b
| task_name | nomic-ai/colnomic-embed-multimodal-3b | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.62 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.57 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.49 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.55 | 0.58 |
| VidoreArxivQARetrieval | 0.88 | 0.89 |
| VidoreDocVQARetrieval | 0.61 | 0.66 |
| VidoreInfoVQARetrieval | 0.93 | 0.95 |
| VidoreShiftProjectRetrieval | 0.90 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.96 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.97 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.97 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.98 | 1.00 |
| VidoreTabfquadRetrieval | 0.94 | 0.96 |
| VidoreTatdqaRetrieval | 0.83 | 0.83 |
| Average | 0.80 | 0.84 |
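The Average row appears to be the unweighted mean of the 14 task scores. The sketch below checks that reading against the nomic-ai/colnomic-embed-multimodal-3b column. It is an assumption, not the benchmark's actual aggregation code: the helper name `average_score` is hypothetical, and the real pipeline presumably averages unrounded scores rather than the two-decimal values shown here.

```python
# Scores copied from the nomic-ai/colnomic-embed-multimodal-3b column above.
scores = {
    "Vidore2BioMedicalLecturesRetrieval": 0.62,
    "Vidore2ESGReportsHLRetrieval": 0.57,
    "Vidore2ESGReportsRetrieval": 0.49,
    "Vidore2EconomicsReportsRetrieval": 0.55,
    "VidoreArxivQARetrieval": 0.88,
    "VidoreDocVQARetrieval": 0.61,
    "VidoreInfoVQARetrieval": 0.93,
    "VidoreShiftProjectRetrieval": 0.90,
    "VidoreSyntheticDocQAAIRetrieval": 0.96,
    "VidoreSyntheticDocQAEnergyRetrieval": 0.97,
    "VidoreSyntheticDocQAGovernmentReportsRetrieval": 0.97,
    "VidoreSyntheticDocQAHealthcareIndustryRetrieval": 0.98,
    "VidoreTabfquadRetrieval": 0.94,
    "VidoreTatdqaRetrieval": 0.83,
}


def average_score(task_scores: dict[str, float]) -> float:
    """Hypothetical helper: unweighted mean over tasks, rounded to two decimals."""
    return round(sum(task_scores.values()) / len(task_scores), 2)


# Format with two decimals to match the table layout.
print(f"{average_score(scores):.2f}")
```

The same check can be repeated for any other column by swapping in its values.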
Results for nomic-ai/colnomic-embed-multimodal-7b
| task_name | nomic-ai/colnomic-embed-multimodal-7b | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.63 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.69 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.54 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.56 | 0.58 |
| VidoreArxivQARetrieval | 0.88 | 0.89 |
| VidoreDocVQARetrieval | 0.60 | 0.66 |
| VidoreInfoVQARetrieval | 0.92 | 0.95 |
| VidoreShiftProjectRetrieval | 0.89 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.99 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.96 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.96 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.99 | 1.00 |
| VidoreTabfquadRetrieval | 0.96 | 0.96 |
| VidoreTatdqaRetrieval | 0.81 | 0.83 |
| Average | 0.81 | 0.84 |
Results for vidore/colSmol-256M
| task_name | vidore/colSmol-256M | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.34 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.48 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.31 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.27 | 0.58 |
| VidoreArxivQARetrieval | 0.73 | 0.89 |
| VidoreDocVQARetrieval | 0.57 | 0.66 |
| VidoreInfoVQARetrieval | 0.84 | 0.95 |
| VidoreShiftProjectRetrieval | 0.62 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.96 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.93 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.95 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.96 | 1.00 |
| VidoreTabfquadRetrieval | 0.65 | 0.96 |
| VidoreTatdqaRetrieval | 0.77 | 0.83 |
| Average | 0.67 | 0.84 |
Results for vidore/colSmol-500M
| task_name | vidore/colSmol-500M | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.43 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.52 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.40 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.36 | 0.58 |
| VidoreArxivQARetrieval | 0.75 | 0.89 |
| VidoreDocVQARetrieval | 0.58 | 0.66 |
| VidoreInfoVQARetrieval | 0.87 | 0.95 |
| VidoreShiftProjectRetrieval | 0.67 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.98 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.95 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.95 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.98 | 1.00 |
| VidoreTabfquadRetrieval | 0.75 | 0.96 |
| VidoreTatdqaRetrieval | 0.77 | 0.83 |
| Average | 0.71 | 0.84 |
Results for vidore/colpali-v1.1
| task_name | vidore/colpali-v1.1 | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.51 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.57 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.48 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.44 | 0.58 |
| VidoreArxivQARetrieval | 0.80 | 0.89 |
| VidoreDocVQARetrieval | 0.59 | 0.66 |
| VidoreInfoVQARetrieval | 0.82 | 0.95 |
| VidoreShiftProjectRetrieval | 0.70 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.97 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.92 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.93 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.95 | 1.00 |
| VidoreTabfquadRetrieval | 0.82 | 0.96 |
| VidoreTatdqaRetrieval | 0.66 | 0.83 |
| Average | 0.73 | 0.84 |
Results for vidore/colpali-v1.2
| task_name | vidore/colpali-v1.2 | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.55 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.54 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.52 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.48 | 0.58 |
| VidoreArxivQARetrieval | 0.78 | 0.89 |
| VidoreDocVQARetrieval | 0.57 | 0.66 |
| VidoreInfoVQARetrieval | 0.82 | 0.95 |
| VidoreShiftProjectRetrieval | 0.77 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.98 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.94 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.94 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.95 | 1.00 |
| VidoreTabfquadRetrieval | 0.89 | 0.96 |
| VidoreTatdqaRetrieval | 0.68 | 0.83 |
| Average | 0.74 | 0.84 |
Results for vidore/colpali-v1.3
| task_name | vidore/colpali-v1.3 | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.56 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.59 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.55 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.49 | 0.58 |
| VidoreArxivQARetrieval | 0.83 | 0.89 |
| VidoreDocVQARetrieval | 0.58 | 0.66 |
| VidoreInfoVQARetrieval | 0.86 | 0.95 |
| VidoreShiftProjectRetrieval | 0.77 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.97 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.95 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.96 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.97 | 1.00 |
| VidoreTabfquadRetrieval | 0.87 | 0.96 |
| VidoreTatdqaRetrieval | 0.71 | 0.83 |
| Average | 0.76 | 0.84 |
Results for vidore/colqwen2-v1.0
| task_name | vidore/colqwen2-v1.0 | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.56 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.60 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.54 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.53 | 0.58 |
| VidoreArxivQARetrieval | 0.88 | 0.89 |
| VidoreDocVQARetrieval | 0.61 | 0.66 |
| VidoreInfoVQARetrieval | 0.93 | 0.95 |
| VidoreShiftProjectRetrieval | 0.90 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 0.99 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.96 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.95 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.99 | 1.00 |
| VidoreTabfquadRetrieval | 0.89 | 0.96 |
| VidoreTatdqaRetrieval | 0.82 | 0.83 |
| Average | 0.80 | 0.84 |
Results for vidore/colqwen2.5-v0.2
| task_name | vidore/colqwen2.5-v0.2 | Max result |
|---|---|---|
| Vidore2BioMedicalLecturesRetrieval | 0.61 | 0.63 |
| Vidore2ESGReportsHLRetrieval | 0.66 | 0.76 |
| Vidore2ESGReportsRetrieval | 0.56 | 0.57 |
| Vidore2EconomicsReportsRetrieval | 0.57 | 0.58 |
| VidoreArxivQARetrieval | 0.89 | 0.89 |
| VidoreDocVQARetrieval | 0.64 | 0.66 |
| VidoreInfoVQARetrieval | 0.93 | 0.95 |
| VidoreShiftProjectRetrieval | 0.88 | 0.93 |
| VidoreSyntheticDocQAAIRetrieval | 1.00 | 1.00 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.96 | 0.97 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.96 | 0.98 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.98 | 1.00 |
| VidoreTabfquadRetrieval | 0.91 | 0.96 |
| VidoreTatdqaRetrieval | 0.82 | 0.83 |
| Average | 0.81 | 0.84 |
@paultltc Thanks for the updates!
Updates the colpali engine model results after the latest changes.
@isaac-chung