Add support for CUDA and CPU arch for Qwen-2.5-VL and Fara-7B by apsonawane · Pull Request #1919 · microsoft/onnxruntime-genai

apsonawane · 2025-12-12T22:52:18Z

Add CUDA and CPU architecture support for the Qwen-2.5-VL and Fara-7B models.
Validated that the NPU model also works with this change.

{"accuracy": 0.8765493306891423, "task_name": "ScienceQA_Visual"}
{"accuracy": 0.8244818652849741, "task_name": "ai2d_test"}
{"accuracy": 0.8108, "task_name": "chart_qa_test"}
{"accuracy": 0.4825291181364393, "task_name": "intergps_test"}
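
The accuracy records above are one JSON object per line, so they can be summarized with a few lines of Python. This is a minimal sketch and not part of the PR: the `RESULTS` string and the `summarize` helper are hypothetical names used only for illustration, and the records are assumed to have been pasted in by hand.

```python
import json

# Accuracy records copied from the PR description (one JSON object per line).
RESULTS = """\
{"accuracy": 0.8765493306891423, "task_name": "ScienceQA_Visual"}
{"accuracy": 0.8244818652849741, "task_name": "ai2d_test"}
{"accuracy": 0.8108, "task_name": "chart_qa_test"}
{"accuracy": 0.4825291181364393, "task_name": "intergps_test"}
"""


def summarize(jsonl_text):
    """Return {task_name: accuracy} parsed from JSON-lines text (hypothetical helper)."""
    scores = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:  # skip blank lines
            continue
        record = json.loads(line)
        scores[record["task_name"]] = record["accuracy"]
    return scores


if __name__ == "__main__":
    # Print each task with its accuracy formatted as a percentage.
    for task, acc in summarize(RESULTS).items():
        print(f"{task:<20} {acc:.2%}")
```

Keying the result by `task_name` makes it straightforward to compare the same tasks across two runs, for example before and after a change like this one.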

Add support for CUDA and CPU arch for Qwen-2.5-VL and Fara-7B

145ae29

tianleiwu mentioned this pull request Dec 12, 2025

Add processor for QWen 2.5 VL #1891

Closed

apsonawane added 2 commits December 15, 2025 19:08

Fix NPU architecture

e7b40d4

Fix warnings

413b511

apsonawane force-pushed the asonawane/vlm branch from 6c79e51 to 413b511 Compare December 15, 2025 21:44

Fix C++-20 issue and add accuracy and unit tests

02c6e7d

apsonawane requested a review from tianleiwu December 16, 2025 00:22

apsonawane added 3 commits December 16, 2025 00:50

Fix pipeline

ccbdc7f

Fix pipeline

d31fca3

Fixing

bef820c

apsonawane force-pushed the asonawane/vlm branch from a57f15b to bef820c Compare December 16, 2025 19:10

apsonawane enabled auto-merge (squash) December 16, 2025 22:08

tianleiwu reviewed Dec 16, 2025

Comment thread src/models/model.cpp

tianleiwu reviewed Dec 16, 2025

Comment thread test/python/test_qwen_fara_models.py

tianleiwu reviewed Dec 17, 2025

Comment thread test/python/test_qwen_fara_models.py

tianleiwu approved these changes Dec 17, 2025

apsonawane merged commit f41b3cc into main Dec 17, 2025
15 checks passed

apsonawane deleted the asonawane/vlm branch December 17, 2025 21:08

apsonawane mentioned this pull request Dec 17, 2025

Remove unused unit tests #1923

Closed

kunal-vaishnavi added the 0.11.5 label Dec 18, 2025

apsonawane added a commit that referenced this pull request Dec 19, 2025

Add support for CUDA and CPU arch for Qwen-2.5-VL and Fara-7B (#1919)

04543f0

dependabot Bot mentioned this pull request Feb 16, 2026

Bump Microsoft.ML.OnnxRuntimeGenAI from 0.11.4 to 0.12.0 yuniko-software/qwen3-onnx#23

Closed

dependabot Bot mentioned this pull request Mar 2, 2026

Bump Microsoft.ML.OnnxRuntimeGenAI from 0.11.4 to 0.12.1 yuniko-software/qwen3-onnx#27

Open

apsonawane merged 7 commits into main from asonawane/vlm

3 participants