
[Question] A WebGPU-accelerated ONNX inference run-time #119

Closed
ansarizafar opened this issue May 21, 2023 · 4 comments · Fixed by #545
Labels
question Further information is requested

Comments

ansarizafar commented May 21, 2023

Is it possible to use https://github.com/webonnx/wonnx with transformers.js?

xenova (Collaborator) commented May 21, 2023

We are currently testing with the official onnxruntime-web implementation (https://github.com/microsoft/onnxruntime), which is still a work in progress.

So, wonnx is not currently supported, but if the interface is similar to onnxruntime-web, it would probably be easy to fork this project and use it instead. If someone would like to try, that would be awesome! I could then maybe add wonnx as a supported backend.
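
For reference, the surface a backend would need to mirror looks roughly like the sketch below. This is only an illustrative sketch of the onnxruntime-web API (InferenceSession.create and session.run), not code from the thread; the model path and input name are hypothetical.

import * as ort from 'onnxruntime-web';

// Minimal surface a backend would need to mirror: create a session from a model,
// then run it with a map of named input tensors. 'model.onnx' is a placeholder path
// and the feed name must match the exported model's inputs.
const session = await ort.InferenceSession.create('model.onnx');
const outputs = await session.run({
  input_ids: new ort.Tensor('int64', new BigInt64Array([101n, 102n]), [1, 2]),
});

A wonnx-based fork would need to expose a similar create/run interface for the swap to stay small.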

jlia0 commented Jun 5, 2023

@xenova I would love to give it a try. I just converted instructor-base to ONNX format using your script, but I am not sure how to use the converted model with Transformers.js, onnxruntime-web, or wonnx. For example,

const outputs = await session.run({ input: inputTensor });

How can I use the tokenizer and how do I process the input?

xenova (Collaborator) commented Jun 8, 2023

@jlia0 That would be awesome if you'd like to look into this!

I'd recommend looking at how we lay out the various pipeline functions (https://github.com/xenova/transformers.js/blob/main/src/pipelines.js) to help you figure out what inputs to provide to the models.

In the simplest case (e.g., for BERT), you'll just need to tokenize the inputs and then pass them to the model. Your code might look something like this:

import { AutoTokenizer } from '@xenova/transformers';

let tokenizer = await AutoTokenizer.from_pretrained('bert-base-uncased');
let inputs = await tokenizer('I love transformers!');
// {
//   input_ids: Tensor {
//     data: BigInt64Array(6) [101n, 1045n, 2293n, 19081n, 999n, 102n],
//     dims: [1, 6],
//     type: 'int64',
//     size: 6,
//   },
//   attention_mask: Tensor {
//     ...
//   }
// }

See https://huggingface.co/docs/transformers.js/api/tokenizers for more information.

You'll then pass inputs into session.run.
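
Putting both steps together might look something like the sketch below (not from the original comment): the tokenizer output is wrapped in onnxruntime-web tensors and fed to session.run. The 'model.onnx' path is a placeholder, and the input names input_ids and attention_mask are assumed to match the exported model.

import { AutoTokenizer } from '@xenova/transformers';
import * as ort from 'onnxruntime-web';

let tokenizer = await AutoTokenizer.from_pretrained('bert-base-uncased');
let inputs = await tokenizer('I love transformers!');

// Wrap the tokenizer output in onnxruntime-web tensors and run the session.
// 'model.onnx' is a placeholder; the feed names must match the model's inputs.
let session = await ort.InferenceSession.create('model.onnx');
let outputs = await session.run({
  input_ids: new ort.Tensor('int64', inputs.input_ids.data, inputs.input_ids.dims),
  attention_mask: new ort.Tensor('int64', inputs.attention_mask.data, inputs.attention_mask.dims),
});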

@sandorkonya

If I'm not mistaken, this question is somewhat related to mine. @jlia0, thank you for bringing another possible solution into play.
