
[Feature Request] Migrate to JSI for React Native #16031

Open
jhen0409 opened this issue May 21, 2023 · 8 comments
Labels
api:Java (issues related to the Java API) · feature request (request for unsupported feature or enhancement) · platform:mobile (issues related to ONNX Runtime mobile; typically submitted using template)

Comments

@jhen0409
Contributor

jhen0409 commented May 21, 2023

Describe the feature request

react_native binding

Migrating to JSI can help resolve the performance issues in the current implementation, which goes through the React Native bridge and base64 encode/decode.

For the tensor input and output data, we could construct ArrayBuffers from existing memory buffers, but that may only be supported by newer React Native versions (>= 0.71?). Another way is to do what blob-jsi-helper does with memcpy, but that may cause some memory issues.
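Roughly, the two approaches could look like this on the native side (a sketch only: the `jsi::MutableBuffer`-backed ArrayBuffer needs a new enough JSI/React Native, and `OrtOutputBuffer` is just a hypothetical stand-in for the tensor's memory):

```cpp
#include <jsi/jsi.h>

#include <cstring>
#include <memory>
#include <vector>

using namespace facebook;

// Hypothetical owner of an output tensor's bytes (e.g. copied out of the OrtValue).
struct OrtOutputBuffer : jsi::MutableBuffer {
  std::vector<uint8_t> bytes;
  size_t size() const override { return bytes.size(); }
  uint8_t* data() override { return bytes.data(); }
};

// Approach 1: wrap existing memory without an extra copy (newer JSI only).
jsi::ArrayBuffer wrapBuffer(jsi::Runtime& rt, std::shared_ptr<OrtOutputBuffer> buf) {
  return jsi::ArrayBuffer(rt, buf);
}

// Approach 2 (what blob-jsi-helper does): allocate a JS ArrayBuffer and memcpy into it.
jsi::ArrayBuffer copyBuffer(jsi::Runtime& rt, const uint8_t* src, size_t len) {
  jsi::Function ctor = rt.global().getPropertyAsFunction(rt, "ArrayBuffer");
  jsi::Object obj = ctor.callAsConstructor(rt, static_cast<int>(len)).getObject(rt);
  jsi::ArrayBuffer ab = obj.getArrayBuffer(rt);
  std::memcpy(ab.data(rt), src, len);
  return ab;
}
```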

I might try to send a PR later for iOS first.

For Android, since JSI is C++, it may require some additional work to access onnxruntime directly. (Calling the Java class through JNI may become complicated.)

Describe scenario use case

Transformers.js is trying to support React Native (huggingface/transformers.js#118 by @hans00), and I found that the performance of base64 encoding/decoding the buffers is not good. See also this comment.

I also tried using react-native-blob-jsi-helper to replace base64 via a patch for testing purposes; see this draft PR for more details.

jhen0409 added the feature request label on May 21, 2023
github-actions bot added the api:Java and platform:mobile labels on May 21, 2023
@skottmckay
Contributor

We have looked at adding this support but haven't had the resources available to do so. Do you have a production scenario you require this for?

Whilst the base64 encode/decode overhead isn't great, typically model execution is the largest cost for a production model.

I'm struggling to understand how base64 encode/decode could add up to multiple seconds in the comment you linked though. The buffer size can be pre-calculated, and it's a single iteration of the data to encode/decode. Reasonably trivial stuff. We don't do much when decoding either - convert from base64 using buffer and put in a typed array.

const buffer: Buffer = Buffer.from(results[key].data as string, 'base64');
const typedArray = tensorTypeToTypedArray(results[key].type as Tensor.Type);
tensorData = new typedArray(buffer.buffer, buffer.byteOffset, buffer.length / typedArray.BYTES_PER_ELEMENT);

@jhen0409
Contributor Author

In that case, the model run only takes ~500 ms for each decode session run, but every decode run in JS (decodeReturnType) takes ~3.7 s with a buffer of size ~5,250,000.

I plan to use onnxruntime in production in the second half of the year, and this may be the main challenge. It would be great if we had resources/maintenance here.

@skottmckay
Contributor

We'll see if we can get it done in the next release.

@jhen0409
Contributor Author

jhen0409 commented Jun 4, 2023

For #16094 I've been testing for a while, and there are a few things I see that aren't perfect:

  • Delay on session run with native bridge
    • Android: 20~30 ms on Hermes / more on JSC (I reported a related issue here; it was filed only for V8, but JSC has the same issue)
    • iOS: 3 ~ 4ms
  • resolveArrayBuffer
    • Android: 1 ~ 5 ms
    • iOS: 0.x ~ 1.x ms

For a task that requires multiple session runs (like Whisper), such delays can add up and may reach a few seconds. The best way is still a full JSI migration, but for now the performance of PR #16094 is acceptable for me.

@skottmckay
Contributor

Can you expand on what 'full JSI migration' would equate to?

I have very limited knowledge of JSI, so I'm not sure what the cost/benefit of that would be. My understanding was that the main reason for using JSI with ORT would be to avoid the base64 encode/decode when running the model (given that's typically the performance-sensitive operation), and your much appreciated PR seems to cover that.

@jhen0409
Contributor Author

jhen0409 commented Jun 8, 2023

In my view, a full JSI migration would need some refactoring: rewriting the load & run functions as JSI functions in C++, and handling threading (to avoid blocking the JS thread) and type conversion in the tensor helper.

  • For iOS, it may be easy because the module is already Objective-C++
  • For Android, options:
    1. Access the existing Java native module through JNI, but it's not clean and may be a little slow
    2. Recompile onnxruntime-c and use the API directly from C++; I prefer this, but it seems to require some extra work (a rough sketch is included below)

This would basically remove the native bridge delay, especially on Android.

The base64 encode/decode is my main concern; the full JSI implementation may not be urgent.
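Roughly, option 2 could look like the sketch below (not real code: `runOrtSession` is a hypothetical stand-in for calling the session via the onnxruntime C/C++ API, and a real implementation would run the session on a worker thread and resolve a Promise via a CallInvoker instead of blocking the JS thread):

```cpp
#include <jsi/jsi.h>

#include <cstring>
#include <vector>

using namespace facebook;

// Hypothetical stand-in: feed the raw input bytes to an already-created OrtSession
// through the onnxruntime C/C++ API and return the output bytes. Stubbed out here
// to keep the sketch focused on the JSI layer.
static std::vector<uint8_t> runOrtSession(const uint8_t* input, size_t len) {
  return std::vector<uint8_t>(input, input + len);  // placeholder echo
}

// Install a global `ortJsiRun(inputBuffer: ArrayBuffer): ArrayBuffer` host function,
// so tensor data crosses the JS/native boundary without the bridge or base64.
void installRun(jsi::Runtime& runtime) {
  auto run = jsi::Function::createFromHostFunction(
      runtime, jsi::PropNameID::forAscii(runtime, "ortJsiRun"), 1,
      [](jsi::Runtime& rt, const jsi::Value&, const jsi::Value* args,
         size_t) -> jsi::Value {
        // Read the input tensor bytes directly from the JS ArrayBuffer.
        jsi::ArrayBuffer input = args[0].getObject(rt).getArrayBuffer(rt);
        std::vector<uint8_t> output = runOrtSession(input.data(rt), input.size(rt));

        // Copy the result into a fresh JS ArrayBuffer and return it.
        jsi::Function ctor = rt.global().getPropertyAsFunction(rt, "ArrayBuffer");
        jsi::Object out =
            ctor.callAsConstructor(rt, static_cast<int>(output.size())).getObject(rt);
        std::memcpy(out.getArrayBuffer(rt).data(rt), output.data(), output.size());
        return std::move(out);
      });
  runtime.global().setProperty(runtime, "ortJsiRun", std::move(run));
}
```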

@skottmckay
Contributor

Thanks for the info. Is the native bridge delay on every run call or just the first?

@jhen0409
Contributor Author

jhen0409 commented Jun 9, 2023

Thanks for the info. Is the native bridge delay on every run call or just the first?

This is the result of running whisper-tiny.en on a demo audio file (using the ASR task of this example, with #16094 patched):

  • Monitor inferenceSession.run with performance.now
  • Monitor the run native method with System.currentTimeMillis
  • Take the difference

Platform: Android
Device: Pixel 6
React Native version: 0.71
JS engine: Hermes

Full infer time: 378.95 ms, native infer: 378 ms, diff: 0.95 ms
Full infer time: 371.03 ms, native infer: 364 ms, diff: 7.03 ms
Full infer time: 626.60 ms, native infer: 620 ms, diff: 6.60 ms
Full infer time: 640.24 ms, native infer: 632 ms, diff: 8.24 ms
Full infer time: 628.04 ms, native infer: 622 ms, diff: 6.04 ms
Full infer time: 618.83 ms, native infer: 611 ms, diff: 7.83 ms
Full infer time: 626.14 ms, native infer: 620 ms, diff: 6.14 ms
Full infer time: 615.17 ms, native infer: 609 ms, diff: 6.17 ms
Full infer time: 633.09 ms, native infer: 627 ms, diff: 6.09 ms
Full infer time: 630.86 ms, native infer: 625 ms, diff: 5.86 ms
Full infer time: 616.15 ms, native infer: 610 ms, diff: 6.15 ms
Full infer time: 617.30 ms, native infer: 611 ms, diff: 6.30 ms
Full infer time: 636.89 ms, native infer: 625 ms, diff: 11.89 ms
Full infer time: 630.92 ms, native infer: 619 ms, diff: 11.92 ms
Full infer time: 676.55 ms, native infer: 634 ms, diff: 42.55 ms
Full infer time: 647.41 ms, native infer: 637 ms, diff: 10.41 ms
Full infer time: 628.70 ms, native infer: 618 ms, diff: 10.70 ms
Full infer time: 695.45 ms, native infer: 633 ms, diff: 62.45 ms
Full infer time: 644.94 ms, native infer: 633 ms, diff: 11.94 ms
Full infer time: 621.50 ms, native infer: 610 ms, diff: 11.50 ms
Full infer time: 654.28 ms, native infer: 643 ms, diff: 11.28 ms
Full infer time: 650.18 ms, native infer: 639 ms, diff: 11.18 ms
Full infer time: 630.11 ms, native infer: 624 ms, diff: 6.11 ms
Full infer time: 636.94 ms, native infer: 630 ms, diff: 6.94 ms
Full infer time: 648.47 ms, native infer: 641 ms, diff: 7.47 ms
Full infer time: 651.05 ms, native infer: 645 ms, diff: 6.05 ms
Full infer time: 648.11 ms, native infer: 641 ms, diff: 7.11 ms

Not sure if it's caused by some scheduling issue, but the delay can occasionally reach ~60 ms.

The difference may be less noticeable when using MNIST in the e2e project.

skottmckay pushed a commit that referenced this issue Jun 16, 2023
### Description

- Create `OnnxruntimeJSIHelper` native module to provide two JSI functions:
  - `jsiOnnxruntimeStoreArrayBuffer`: Store a buffer in the Blob Manager and return a blob object (iOS: RCTBlobManager, Android: BlobModule)
  - `jsiOnnxruntimeResolveArrayBuffer`: Use a blob object to get the buffer back
- Part of the implementation references [react-native-blob-jsi-helper](https://github.com/mrousavy/react-native-blob-jsi-helper)
- Replace base64 encode/decode:
  - `loadModelFromBlob`: Renamed from `loadModelFromBase64EncodedBuffer`
  - `run`: Use blob objects to replace input.data & results[].data

For [this context](#16031 (comment)), it saves a lot of time and avoids blocking the JS thread when decoding the return type: roughly 3700 ms -> 5~20 ms for that case. (The resolve function only takes 0.x ms.)
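For illustration only (this is not the code in the PR): the resolve side follows the blob-jsi-helper pattern of taking a blob descriptor and returning its bytes as an ArrayBuffer. `getBlobBytes` below is a hypothetical stand-in for the platform blob store lookup (RCTBlobManager on iOS, BlobModule on Android), and the descriptor is assumed to carry `blobId`, `offset`, and `size`:

```cpp
#include <jsi/jsi.h>

#include <cstring>
#include <string>
#include <vector>

using namespace facebook;

// Hypothetical stand-in for reading bytes out of the platform blob store.
static std::vector<uint8_t> getBlobBytes(const std::string& /*blobId*/,
                                         size_t /*offset*/, size_t size) {
  return std::vector<uint8_t>(size);  // placeholder; real code queries the blob store
}

// Install a resolve function: blob descriptor in, ArrayBuffer out, no base64.
void installResolveArrayBuffer(jsi::Runtime& runtime) {
  auto resolve = jsi::Function::createFromHostFunction(
      runtime, jsi::PropNameID::forAscii(runtime, "jsiOnnxruntimeResolveArrayBuffer"), 1,
      [](jsi::Runtime& rt, const jsi::Value&, const jsi::Value* args,
         size_t) -> jsi::Value {
        jsi::Object blob = args[0].getObject(rt);
        std::string blobId = blob.getProperty(rt, "blobId").asString(rt).utf8(rt);
        auto offset = static_cast<size_t>(blob.getProperty(rt, "offset").asNumber());
        auto size = static_cast<size_t>(blob.getProperty(rt, "size").asNumber());

        std::vector<uint8_t> bytes = getBlobBytes(blobId, offset, size);

        jsi::Function ctor = rt.global().getPropertyAsFunction(rt, "ArrayBuffer");
        jsi::Object out =
            ctor.callAsConstructor(rt, static_cast<int>(bytes.size())).getObject(rt);
        std::memcpy(out.getArrayBuffer(rt).data(rt), bytes.data(), bytes.size());
        return std::move(out);
      });
  runtime.global().setProperty(runtime, "jsiOnnxruntimeResolveArrayBuffer",
                               std::move(resolve));
}
```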

### Motivation and Context

It’s related to #16031, but it is not a full migration to JSI.

It just uses JSI through the BlobManager to replace the slow part (base64 encode/decode).

Rewriting it entirely in JSI could be complicated (e.g. type conversion and threading), so this PR might be considered a minor change.

/cc @skottmckay