
🚀🚀🚀 Transformers.js V3 🚀🚀🚀 #545

Merged: 498 commits into main, Oct 18, 2024

Conversation

@xenova (Collaborator) commented Jan 27, 2024

In preparation for Transformers.js v3, I'm compiling a list of issues/features which will be fixed/included in the release.

Useful commands:

  1. Pack
    npm pack
    
  2. Publish dry-run
    npm publish --dry-run
    
  3. Publish dry-run w/ tag
    npm publish --dry-run --tag dev
    
  4. Bump alpha version
    npm version prerelease --preid=alpha -m "[version] Update to %s"

How to use WebGPU

First, install the development branch:

npm install @huggingface/transformers

Then specify the device parameter when loading the model. Here's example code to get started. Please note that this is still a WORK IN PROGRESS, so the following usage may change before release.

import { pipeline } from '@huggingface/transformers';

// Create feature extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
    device: 'webgpu',
    dtype: 'fp32', // or 'fp16'
});

// Generate embeddings
const sentences = ['That is a happy person', 'That is a very happy person'];
const output = await extractor(sentences, { pooling: 'mean', normalize: true });
console.log(output.tolist());
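
A hedged aside (not from the original post): because the embeddings above are mean-pooled and normalized, a plain dot product over the tolist() output gives their cosine similarity. A minimal sketch:

// Compare the two sentence embeddings; values near 1 mean "very similar".
// Assumes `output` from the feature-extraction example above.
const [a, b] = output.tolist();
const similarity = a.reduce((sum, value, i) => sum + value * b[i], 0);
console.log(similarity);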

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

xenova marked this pull request as draft on January 27, 2024 at 18:02
@Huguet57

Hey! This is great. Is this already in alpha?

@kishorekaruppusamy

Team, is there a tentative timeline for releasing the v3 alpha?

@jhpassion0621

I can't wait :) Please let me know when it's released!


@NawarA (Contributor) commented Mar 6, 2024

@xenova it looks like #596 is part of this release?! I think that means onnx_data files will be supported?

If true, I'm stoked!

Beyond upgrading ort to 1.17, are there other changes needed to support models with onnx_data files? Happy to lend a hand if possible.

@xenova (Collaborator, Author) commented Mar 9, 2024

Hi everyone! Today we released our first WebGPU x Transformers.js demo: The WebGPU Embedding Benchmark (online demo). If you'd like to help with testing, please run the benchmark and share your results! Thanks!

[Image: webgpu-benchmark]

@khmyznikov

@xenova can this benchmark pick GPU 1 instead of GPU 0, for laptops with a dGPU?

@xenova (Collaborator, Author) commented Mar 11, 2024

> @xenova can this benchmark pick GPU 1 instead of GPU 0, for laptops with a dGPU?

Not currently, but this is being worked on in microsoft/onnxruntime#19857. We will add support once it's ready.
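
For context (an illustration only, not Transformers.js or onnxruntime-web API): the underlying WebGPU spec already lets a page request the high-performance adapter, which on dual-GPU laptops is usually the discrete GPU; plumbing such a preference through the WebGPU execution provider appears to be what the linked issue tracks.

// Sketch of raw WebGPU adapter selection (run inside a module or async function):
const adapter = await navigator.gpu.requestAdapter({
  powerPreference: 'high-performance', // prefer the dGPU on dual-GPU laptops
});
const device = await adapter.requestDevice();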

xenova linked an issue on Mar 12, 2024 that may be closed by this pull request
@xenova (Collaborator, Author) commented Mar 13, 2024

@beaufortfrancois - I've added the source code for the video background removal demo. On my device, I get ~20fps w/ WebGPU support (w/ fp32 since fp16 is broken). Here's a screen recording (which drops my fps to ~14):

[Video: webgpu-modnet.mp4]

@beaufortfrancois commented Mar 14, 2024

> @beaufortfrancois - I've added the source code for the video background removal demo. On my device, I get ~20fps w/ WebGPU support (w/ fp32 since fp16 is broken). Here's a screen recording (which drops my fps to ~14):

You rock. Thanks! It's a cool demo! 👍

I've been wondering how we could improve it:

  • I've noticed you read the current frame of the video on the main thread. Would it help to move the entire demo to a web worker? (A rough sketch follows this list.)
  • output[0].mul(255).to('uint8') takes some non-negligible time to run. Is there a faster path?
  • How much do you expect fp16 to improve performance? In https://developer.chrome.com/blog/new-in-webgpu-120#support_for_16-bit_floating-point_values_in_wgsl, we noticed on an Apple M1 Pro device that the f16 implementation of the Llama2 7B model used in the WebLLM chat demo is significantly faster than the f32 implementation, with a 28% improvement in prefill speed and a 41% improvement in decoding speed.
  • A way to feed a GPUExternalTexture to the model as an input could also come in handy.
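
On the web-worker question above, here is a rough sketch of off-main-thread inference. It reuses the feature-extraction pipeline from earlier in this thread rather than the demo's actual model, so treat the task, model id, and options as placeholders:

// worker.js — hypothetical: load the pipeline once and run inference off the main thread.
import { pipeline } from '@huggingface/transformers';

const extractorPromise = pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
  device: 'webgpu',
});

self.onmessage = async ({ data }) => {
  const extractor = await extractorPromise;
  const output = await extractor(data.text, { pooling: 'mean', normalize: true });
  self.postMessage(output.tolist());
};

// main.js — keep capture and rendering responsive by posting work to the worker.
const worker = new Worker(new URL('./worker.js', import.meta.url), { type: 'module' });
worker.onmessage = (event) => console.log(event.data);
worker.postMessage({ text: 'That is a happy person' });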

@@ -0,0 +1,3 @@
/**
* @typedef {'cpu'|'gpu'|'wasm'|'webgpu'|null} DeviceType


Out of curiosity, what is 'gpu'?

@xenova (Collaborator, Author) replied:

It's meant to be a "catch-all" for the different ways that the library can be used with GPU support (not just in the browser with WebGPU). The idea is that it will simplify documentation, as transformers.js will select the best execution provider depending on the environment. For example, DML/CUDA support in onnxruntime-node (see microsoft/onnxruntime#16050 (comment))

Of course, this is still a work in progress, so it can definitely change!
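
To illustrate the intent (a sketch only; the option name comes from the typedef above and the behaviour from the description, and it may change before release):

import { pipeline } from '@huggingface/transformers';

// 'gpu' as a catch-all: let the library pick the best GPU execution provider
// for the environment (e.g. WebGPU in the browser, DML/CUDA via onnxruntime-node).
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', {
  device: 'gpu',
});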

xenova changed the title from "[WIP] 🚀🚀🚀 Transformers.js V3 🚀🚀🚀" to "🚀🚀🚀 Transformers.js V3 🚀🚀🚀" on Oct 18, 2024
xenova merged commit 7ebd50c into main on Oct 18, 2024
4 checks passed
@do-me (Contributor) commented Oct 18, 2024

Let's gooo 🚀 🚀 🚀 Awesome work!!!

@kungfooman (Contributor)

I nearly thought it would never happen 🙈 An amazing achievement and thank you for your persistence!

@young-developer commented Oct 18, 2024

🔥 🚀

@flatsiedatsie (Contributor)

WOOHOO!!! Congrats!! WebGPU all the things!

@gyagp commented Oct 18, 2024

This is a huge milestone 🎉 Thank you for all the fantastic work in this great project!

@okasi commented Oct 19, 2024

🚀 🚀 🚀

@justin0mcateer commented Oct 19, 2024 via email
