Conversation

@cavusmustafa (Owner) commented Oct 15, 2025

  • Some updates in CMakeLists and requirements to resolve build/export issues with the latest version.
  • New pipeline which reduces idle time for each stage and improves overall throughput: while the NPU or GPU is busy performing inference on one frame, the CPU can simultaneously preprocess the next frame and postprocess the previous one (a rough sketch of the scheduling pattern is included after the results below).
Configuration    FPS
XNNPACK          3.5
CPU FP32         6.9
CPU INT8        13.8
GPU FP16        52.3
NPU FP16        64.5

CPU: Intel(R) Core(TM) Ultra 5 238V
Model: Yolo12s
Model input size: 640x640
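
For illustration, here is a minimal, self-contained C++ sketch of the scheduling pattern described above. The preprocess/run_inference/postprocess functions are hypothetical stand-ins (the real example uses OpenCV letterboxing, an ExecuTorch Module forward call, and YOLO decoding); only the way the stages overlap is the point, and this is not the exact code added in the PR.

#include <future>
#include <iostream>
#include <mutex>

struct Frame { int id; };
struct Tensor { int id; };
struct Detections { int id; };

// Hypothetical stand-ins for the real stages.
Tensor preprocess(Frame f) { return Tensor{f.id}; }              // CPU: resize/pad
Detections run_inference(Tensor t) { return Detections{t.id}; }  // NPU/GPU
void postprocess(Detections d) { std::cout << "frame " << d.id << " done\n"; }  // CPU: decode/draw

int main() {
  constexpr int num_frames = 8;
  std::mutex infer_mutex;  // inference itself stays sequential, as in the PR

  std::future<Tensor> pre = std::async(std::launch::async, preprocess, Frame{0});
  std::future<void> post;  // postprocessing of the previous frame, if any

  for (int i = 0; i < num_frames; ++i) {
    Tensor input = pre.get();      // wait for frame i's preprocessing
    if (i + 1 < num_frames) {      // immediately start preprocessing frame i+1
      pre = std::async(std::launch::async, preprocess, Frame{i + 1});
    }
    // Infer frame i while frame i+1 is preprocessed and frame i-1 is postprocessed.
    std::future<Detections> infer = std::async(std::launch::async, [&, input] {
      std::lock_guard<std::mutex> guard(infer_mutex);
      return run_inference(input);
    });
    if (post.valid()) post.get();  // finish frame i-1's postprocessing
    post = std::async(std::launch::async, postprocess, infer.get());
  }
  if (post.valid()) post.get();    // drain the last frame
  return 0;
}

With this pattern, the NPU/GPU is kept busy while the CPU handles the neighbouring frames, which is where the throughput gain in the table above comes from.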

Comment on lines +32 to +34
find_package(absl CONFIG REQUIRED PATHS ${EXECUTORCH_ROOT}/cmake-out)
find_package(re2 CONFIG REQUIRED PATHS ${EXECUTORCH_ROOT}/cmake-out)
find_package(tokenizers CONFIG REQUIRED PATHS ${EXECUTORCH_ROOT}/cmake-out)

Why do we need tokenizers and other dependencies for YOLO?

@cavusmustafa (Owner, Author):

Are you able to build it without these? The YOLO example doesn't use them directly, but I thought some of the dependencies needed them. I will check again whether we can build without them.

@cavusmustafa (Owner, Author):

Without these I see the error below:

CMake Error at CMakeLists.txt:37 (find_package):
  Found package configuration file:

    /home/mcavus/executorch/executorch/cmake-out/lib/cmake/ExecuTorch/executorch-config.cmake

  but it set executorch_FOUND to FALSE so package "executorch" is considered
  to be NOT FOUND.  Reason given by package:

  The following imported targets are referenced, but are missing:
  tokenizers::tokenizers

@daniil-lyakhov commented Oct 16, 2025

Yes, the example was fully functional when it was merged. That's a bit strange; how do you build the example? You can find a test script here: https://github.com/pytorch/executorch/blob/main/.ci/scripts/test_yolo12.sh

I think the Meta folks could potentially help with that.

@cavusmustafa (Owner, Author):

The build commands below should work, right? I can reproduce the issue with the main branch. I see a similar error with either the OV backend or XNNPACK, by the way. I will ping them on Discord.

rm -rf build
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DUSE_XNNPACK_BACKEND=OFF -DUSE_OPENVINO_BACKEND=ON ..
make -j$(nproc)

Comment on lines +131 to +138
while (!ready_q.empty() && scale_q.size() < frame_queue_size) {
  frame_ctx *scale_f = ready_q.front();
  scale_q.push(std::make_pair(
      scale_f,
      std::async(std::launch::async, scale_with_padding,
                 std::ref(scale_f->frame), &(scale_f->pad_x),
                 &(scale_f->pad_y), &(scale_f->scale), img_dims)));
  ready_q.pop();
}
const et_timestamp_t after_execute = et_pal_current_ticks();
time_spent_executing += after_execute - before_execute;
iters++;

if (!(iters % progress_bar_tick)) {
  const int precent_ready = (100 * iters) / video_lenght;
  std::cout << iters << " out of " << video_lenght
            << " frames are processed (" << precent_ready << "%)"
            << std::endl;
  while (!scale_q.empty() && input_q.size() < frame_queue_size) {
    auto status = scale_q.front().second.wait_for(std::chrono::milliseconds(1));
    if (status == std::future_status::ready) {

@daniil-lyakhov commented Oct 15, 2025

General questions:

  1. It looks like you are implementing an inference request queue; is it possible to use the standard OpenVINO API for this somehow?
  2. This is a real-time demo: the data is streamed sequentially and should be consumed sequentially. How does that work with your update?
  3. I believe it is unfair to collect only the model inference time, without pre- and post-processing, and present it as FPS stats. In a real application, pre- and post-processing will affect the FPS.

In general, could you please state the motivation behind this PR? What are the purpose and improvements it introduces?

@cavusmustafa (Owner, Author) commented Oct 15, 2025

  1. We could try using an async call inside the OpenVINO backend. That way we could simply call the forward function from the ExecuTorch application and let OpenVINO schedule the tasks (a sketch of that API is included after this list for context). But I found two issues with it (explained below), and I don't think we need it anyway: model inference still executes sequentially because we hold a mutex lock around that part, and we don't need asynchronous model inference for this use case. I explain this further in 2.
    • We claim to support XNNPACK with this application as well. We might need to add a lot of OpenVINO-only customizations in that case.
    • A single ExecuTorch module seems to use the same output buffer for all executions, so an upcoming task may overwrite the result of the previous one. This seems risky (and it fails for XNNPACK). We could create multiple ExecuTorch modules, but then I don't know whether they would share the same OpenVINO backend object. If they don't, we may not be able to use async execution as intended and we may have additional memory overhead.
  2. We can assume the data is streamed sequentially and consumed sequentially, but we can still pipeline preprocess, inference, and postprocess, which was the intention of this PR. As the first frame completes preprocessing on the CPU and starts model execution on the GPU (or NPU), we can also start preprocessing the second frame as soon as it is ready. Once the first frame completes on the GPU, the second frame can be assigned to the GPU while the first frame starts postprocessing.
    Also, in a real-time stream it will be better to limit the size of the ready queue (maybe 2 or even 1). A larger ready queue can cause delays in the output video.
  3. I didn't understand this part. The time measurement should already cover the end-to-end object detection process (timing is collected before and after the whole while loop). iters is incremented only when a frame retires. At the end, FPS is calculated from the total while-loop time and the total number of frames retired (a minimal sketch of this accounting also follows the list).
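
For context on point 1, a minimal sketch of the standard OpenVINO asynchronous API, as used directly outside ExecuTorch, is below. The model path, device name, and input shape are placeholders, and this is not how the example in this PR is built; it only illustrates the interface the question refers to.

#include <openvino/openvino.hpp>

int main() {
  ov::Core core;
  // Placeholder model path and device name.
  auto model = core.read_model("yolo12s.xml");
  ov::CompiledModel compiled = core.compile_model(model, "NPU");
  ov::InferRequest request = compiled.create_infer_request();

  // Placeholder input tensor; the real example would fill it with the
  // preprocessed (letterboxed) frame.
  ov::Tensor input(ov::element::f32, ov::Shape{1, 3, 640, 640});
  request.set_input_tensor(input);

  request.start_async();  // the device runs inference in the background
  // ... the CPU is free here to preprocess the next frame or postprocess
  // the previous one ...
  request.wait();         // block until the device finishes

  ov::Tensor output = request.get_output_tensor();
  (void)output;           // YOLO decoding / NMS would happen here
  return 0;
}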
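For point 3, here is a minimal sketch of the end-to-end FPS accounting described above, using std::chrono instead of the et_pal_current_ticks() platform helpers the example actually uses (so the timing primitives here are an assumption, not the example's exact code):

#include <chrono>
#include <cstdint>
#include <iostream>

int main() {
  const auto start = std::chrono::steady_clock::now();
  std::uint64_t frames_retired = 0;

  // ... the whole pipelined while loop runs here; frames_retired is
  // incremented only when a frame finishes postprocessing ("retires") ...

  const auto end = std::chrono::steady_clock::now();
  const double seconds = std::chrono::duration<double>(end - start).count();
  // End-to-end FPS: pre- and post-processing are inside the timed region.
  const double fps = frames_retired / seconds;
  std::cout << "FPS: " << fps << std::endl;
  return 0;
}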

Got it, thanks
