
[Question] A WebGPU-accelerated ONNX inference run-time #119

Closed
ansarizafar opened this issue May 21, 2023 · 4 comments · Fixed by #545
Labels
question Further information is requested

Comments

ansarizafar commented May 21, 2023

Is it possible to use https://github.com/webonnx/wonnx with transformers.js?

xenova (Collaborator) commented May 21, 2023

We are currently testing with the official onnxruntime-web implementation (https://github.com/microsoft/onnxruntime), which is still a work in progress.

So, wonnx is not currently supported, but if the interface is similar to onnxruntime-web, it would probably be easy to fork this project and use it instead. If someone would like to try, that would be awesome! I could then maybe add wonnx as a supported backend.
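
For reference, the surface a backend would need to mirror looks roughly like the sketch below. This is only an illustrative sketch of the onnxruntime-web API (InferenceSession.create and session.run), not code from the thread; the model path and input name are hypothetical.

import * as ort from 'onnxruntime-web';

// Minimal surface a backend would need to mirror: create a session from a model,
// then run it with a map of named input tensors. 'model.onnx' is a placeholder path
// and the feed name must match the exported model's inputs.
const session = await ort.InferenceSession.create('model.onnx');
const outputs = await session.run({
  input_ids: new ort.Tensor('int64', new BigInt64Array([101n, 102n]), [1, 2]),
});

A wonnx-based fork would need to expose a similar create/run interface for the swap to stay small.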

jlia0 commented Jun 5, 2023

@xenova I would love to give it a try. I just converted instructor-base to ONNX format using your script, but I am not sure how to use the converted model with Transformers.js, onnxruntime-web, or wonnx. For example,

const outputs = await session.run({ input: inputTensor });

How can I use the tokenizer and how do I process the input?

xenova (Collaborator) commented Jun 8, 2023

@jlia0 That would be awesome if you'd like to look into this!

I'd recommend looking at how we lay out the various pipeline functions (https://github.com/xenova/transformers.js/blob/main/src/pipelines.js) to help you figure out what inputs to provide to the models.

In the simplest case (e.g., for BERT), you'll just need to tokenize the inputs and then pass them to the model. Your code might look something like this:

import { AutoTokenizer } from '@xenova/transformers';

let tokenizer = await AutoTokenizer.from_pretrained('bert-base-uncased');
let inputs = await tokenizer('I love transformers!');
// {
//   input_ids: Tensor {
//     data: BigInt64Array(6) [101n, 1045n, 2293n, 19081n, 999n, 102n],
//     dims: [1, 6],
//     type: 'int64',
//     size: 6,
//   },
//   attention_mask: Tensor {
//     ...
//   }
// }

See https://huggingface.co/docs/transformers.js/api/tokenizers for more information.

You'll then pass inputs into session.run.
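
Putting both steps together might look something like the sketch below (not from the original comment): the tokenizer output is wrapped in onnxruntime-web tensors and fed to session.run. The 'model.onnx' path is a placeholder, and the input names input_ids and attention_mask are assumed to match the exported model.

import { AutoTokenizer } from '@xenova/transformers';
import * as ort from 'onnxruntime-web';

let tokenizer = await AutoTokenizer.from_pretrained('bert-base-uncased');
let inputs = await tokenizer('I love transformers!');

// Wrap the tokenizer output in onnxruntime-web tensors and run the session.
// 'model.onnx' is a placeholder; the feed names must match the model's inputs.
let session = await ort.InferenceSession.create('model.onnx');
let outputs = await session.run({
  input_ids: new ort.Tensor('int64', inputs.input_ids.data, inputs.input_ids.dims),
  attention_mask: new ort.Tensor('int64', inputs.attention_mask.data, inputs.attention_mask.dims),
});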

@sandorkonya

If I'm not mistaken, this question is somewhat related to mine. @jlia0, thank you for bringing another possible solution into play.
