-State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
+State-of-the-art Machine Learning for the Web
+
+Run 🤗 Transformers directly in your browser, with no need for a server!
Transformers.js is designed to be functionally equivalent to Hugging Face's [transformers](https://github.com/huggingface/transformers) Python library, meaning you can run the same pretrained models using a very similar API. These models support common tasks in different modalities, such as:
- 📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
diff --git a/docs/scripts/build_readme.py b/docs/scripts/build_readme.py
index 611c5b3f6..49a2e6400 100644
--- a/docs/scripts/build_readme.py
+++ b/docs/scripts/build_readme.py
@@ -13,21 +13,11 @@
{intro}
diff --git a/docs/snippets/0_introduction.snippet b/docs/snippets/0_introduction.snippet
index d25a0e513..34d71bccb 100644
--- a/docs/snippets/0_introduction.snippet
+++ b/docs/snippets/0_introduction.snippet
@@ -1,5 +1,9 @@
-State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
+State-of-the-art Machine Learning for the Web
+
+Run 🤗 Transformers directly in your browser, with no need for a server!
Transformers.js is designed to be functionally equivalent to Hugging Face's [transformers](https://github.com/huggingface/transformers) Python library, meaning you can run the same pretrained models using a very similar API. These models support common tasks in different modalities, such as:
- 📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
From 270eee88c3aa488e1e24add70fdf3b1b7bb423f5 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 02:23:53 +0000
Subject: [PATCH 02/14] Add WebGPU + dtypes docs
---
docs/source/_toctree.yml | 6 +-
docs/source/guides/dtypes.md | 130 +++++++++++++++++++++++++++++++++++
docs/source/guides/webgpu.md | 87 +++++++++++++++++++++++
3 files changed, 222 insertions(+), 1 deletion(-)
create mode 100644 docs/source/guides/dtypes.md
create mode 100644 docs/source/guides/webgpu.md
diff --git a/docs/source/_toctree.yml b/docs/source/_toctree.yml
index 4458c049b..d0b622528 100644
--- a/docs/source/_toctree.yml
+++ b/docs/source/_toctree.yml
@@ -23,10 +23,14 @@
title: Server-side Inference in Node.js
title: Tutorials
- sections:
+ - local: guides/webgpu
+ title: Running models on WebGPU
+ - local: guides/dtypes
+ title: Using quantized models (dtypes)
- local: guides/private
title: Accessing Private/Gated Models
- local: guides/node-audio-processing
- title: Server-side Audio Processing in Node.js
+ title: Server-side Audio Processing
title: Developer Guides
- sections:
- local: api/transformers
diff --git a/docs/source/guides/dtypes.md b/docs/source/guides/dtypes.md
new file mode 100644
index 000000000..a479e1f95
--- /dev/null
+++ b/docs/source/guides/dtypes.md
@@ -0,0 +1,130 @@
+# Using quantized models (dtypes)
+
+Before Transformers.js v3, you chose between a quantized (q8) and full-precision (fp32) variant of a model by setting the `quantized` option to `true` or `false`, respectively. Now, we've added the ability to select from a much larger list with the `dtype` parameter.
+
+The list of available quantizations depends on the model, but some common ones are: full-precision (`"fp32"`), half-precision (`"fp16"`), 8-bit (`"q8"`, `"int8"`, `"uint8"`), and 4-bit (`"q4"`, `"bnb4"`, `"q4f16"`).
+
+## Basic usage
+
+**Example:** Run Qwen2.5-0.5B-Instruct in 4-bit quantization ([demo](https://v2.scrimba.com/s0dlcpv0ci))
+
+```js
+import { pipeline } from "@huggingface/transformers";
+
+// Create a text generation pipeline
+const generator = await pipeline(
+ "text-generation",
+ "onnx-community/Qwen2.5-0.5B-Instruct",
+ { dtype: "q4", device: "webgpu" },
+);
+
+// Define the list of messages
+const messages = [
+ { role: "system", content: "You are a helpful assistant." },
+ { role: "user", content: "Tell me a funny joke." },
+];
+
+// Generate a response
+const output = await generator(messages, { max_new_tokens: 128 });
+console.log(output[0].generated_text.at(-1).content);
+```
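+
+For chat-style applications, you may want to display tokens as they are generated rather than waiting for the full response. A minimal sketch using the library's `TextStreamer`, reusing the `generator` and `messages` from above:
+
+```js
+import { TextStreamer } from "@huggingface/transformers";
+
+// Print tokens to the console as they are generated
+const streamer = new TextStreamer(generator.tokenizer, {
+  skip_prompt: true,
+});
+await generator(messages, { max_new_tokens: 128, streamer });
+```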
+
+## Per-module dtypes
+
+Some encoder-decoder models, like Whisper or Florence-2, are extremely sensitive to quantization settings, especially for the encoder. For this reason, we added the ability to select per-module dtypes, which can be done by providing a mapping from module name to dtype.
+
+**Example:** Run Florence-2 on WebGPU ([demo](https://v2.scrimba.com/s0pdm485fo))
+
+```js
+import { Florence2ForConditionalGeneration } from "@huggingface/transformers";
+
+const model = await Florence2ForConditionalGeneration.from_pretrained(
+ "onnx-community/Florence-2-base-ft",
+ {
+ dtype: {
+ embed_tokens: "fp16",
+ vision_encoder: "fp16",
+ encoder_model: "q4",
+ decoder_model_merged: "q4",
+ },
+ device: "webgpu",
+ },
+);
+```
+
+<details>
+
+<summary>See full code example</summary>
+
+```js
+import {
+ Florence2ForConditionalGeneration,
+ AutoProcessor,
+ AutoTokenizer,
+ RawImage,
+} from "@huggingface/transformers";
+
+// Load model, processor, and tokenizer
+const model_id = "onnx-community/Florence-2-base-ft";
+const model = await Florence2ForConditionalGeneration.from_pretrained(
+ model_id,
+ {
+ dtype: {
+ embed_tokens: "fp16",
+ vision_encoder: "fp16",
+ encoder_model: "q4",
+ decoder_model_merged: "q4",
+ },
+ device: "webgpu",
+ },
+);
+const processor = await AutoProcessor.from_pretrained(model_id);
+const tokenizer = await AutoTokenizer.from_pretrained(model_id);
+
+// Load image and prepare vision inputs
+const url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg";
+const image = await RawImage.fromURL(url);
+const vision_inputs = await processor(image);
+
+// Specify task and prepare text inputs
+const task = "<MORE_DETAILED_CAPTION>";
+const prompts = processor.construct_prompts(task);
+const text_inputs = tokenizer(prompts);
+
+// Generate text
+const generated_ids = await model.generate({
+ ...text_inputs,
+ ...vision_inputs,
+ max_new_tokens: 100,
+});
+
+// Decode generated text
+const generated_text = tokenizer.batch_decode(generated_ids, {
+ skip_special_tokens: false,
+})[0];
+
+// Post-process the generated text
+const result = processor.post_process_generation(
+ generated_text,
+ task,
+ image.size,
+);
+console.log(result);
+// { '<MORE_DETAILED_CAPTION>': 'A green car is parked in front of a tan building. The building has a brown door and two brown windows. The car is a two door and the door is closed. The green car has black tires.' }
+```
+
+</details>
diff --git a/docs/source/guides/webgpu.md b/docs/source/guides/webgpu.md
new file mode 100644
index 000000000..378d3ee8b
--- /dev/null
+++ b/docs/source/guides/webgpu.md
@@ -0,0 +1,87 @@
+# Running models on WebGPU
+
+WebGPU is a new web standard for accelerated graphics and compute. The [API](https://developer.mozilla.org/en-US/docs/Web/API/WebGPU_API) enables web developers to use the underlying system's GPU to carry out high-performance computations directly in the browser. WebGPU is the successor to [WebGL](https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API) and provides significantly better performance, because it allows for more direct interaction with modern GPUs. Lastly, it supports general-purpose GPU computations, which makes it just perfect for machine learning!
+
+> [!WARNING]
+> As of October 2024, global WebGPU support is around 70% (according to [caniuse.com](https://caniuse.com/webgpu)), meaning some users may not be able to use the API.
+>
+> If the following demos do not work in your browser, you may need to enable it using a feature flag:
+>
+> - Firefox: with the `dom.webgpu.enabled` flag (see [here](https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Experimental_features#:~:text=tested%20by%20Firefox.-,WebGPU%20API,-The%20WebGPU%20API)).
+> - Safari: with the `WebGPU` feature flag (see [here](https://webkit.org/blog/14879/webgpu-now-available-for-testing-in-safari-technology-preview/)).
+> - Older Chromium browsers (on Windows, macOS, Linux): with the `enable-unsafe-webgpu` flag (see [here](https://developer.chrome.com/docs/web-platform/webgpu/troubleshooting-tips)).
+
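+You can also feature-detect WebGPU at runtime and fall back to WASM. A minimal sketch (`navigator.gpu` is the standard feature-detection check, and `"wasm"` is the library's default CPU-based device):
+
+```js
+// Prefer WebGPU when the browser exposes it; otherwise fall back to WASM
+const device = navigator.gpu ? "webgpu" : "wasm";
+
+// Then pass it when constructing a pipeline: await pipeline(task, model, { device })
+```
+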
+## Usage in Transformers.js v3
+
+Thanks to our collaboration with [ONNX Runtime Web](https://www.npmjs.com/package/onnxruntime-web), enabling WebGPU acceleration is as simple as setting `device: 'webgpu'` when loading a model. Let's see some examples!
+
+**Example:** Compute text embeddings on WebGPU ([demo](https://v2.scrimba.com/s06a2smeej))
+
+```js
+import { pipeline } from "@huggingface/transformers";
+
+// Create a feature-extraction pipeline
+const extractor = await pipeline(
+ "feature-extraction",
+ "mixedbread-ai/mxbai-embed-xsmall-v1",
+ { device: "webgpu" },
+);
+
+// Compute embeddings
+const texts = ["Hello world!", "This is an example sentence."];
+const embeddings = await extractor(texts, { pooling: "mean", normalize: true });
+console.log(embeddings.tolist());
+// [
+// [-0.016986183822155, 0.03228696808218956, -0.0013630966423079371, ... ],
+// [0.09050482511520386, 0.07207386940717697, 0.05762749910354614, ... ],
+// ]
+```
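+
+Because the embeddings above were computed with `normalize: true`, the cosine similarity between two texts reduces to a dot product of their vectors. A minimal sketch on the plain arrays returned by `.tolist()`:
+
+```js
+// Cosine similarity of the two normalized embeddings
+const [a, b] = embeddings.tolist();
+const similarity = a.reduce((sum, v, i) => sum + v * b[i], 0);
+console.log(similarity); // Value in [-1, 1]; higher means more similar
+```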
+
+**Example:** Perform automatic speech recognition with OpenAI Whisper on WebGPU ([demo](https://v2.scrimba.com/s0oi76h82g))
+
+```js
+import { pipeline } from "@huggingface/transformers";
+
+// Create automatic speech recognition pipeline
+const transcriber = await pipeline(
+ "automatic-speech-recognition",
+ "onnx-community/whisper-tiny.en",
+ { device: "webgpu" },
+);
+
+// Transcribe audio from a URL
+const url = "https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/jfk.wav";
+const output = await transcriber(url);
+console.log(output);
+// { text: ' And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.' }
+```
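+
+For audio longer than 30 seconds, you can enable chunked long-form transcription via the pipeline's `chunk_length_s` and `stride_length_s` options:
+
+```js
+// Transcribe long-form audio in 30-second chunks with a 5-second stride
+const long_output = await transcriber(url, {
+  chunk_length_s: 30,
+  stride_length_s: 5,
+});
+```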
+
+**Example:** Perform image classification with MobileNetV4 on WebGPU ([demo](https://v2.scrimba.com/s0fv2uab1t))
+
+```js
+import { pipeline } from "@huggingface/transformers";
+
+// Create image classification pipeline
+const classifier = await pipeline(
+ "image-classification",
+ "onnx-community/mobilenetv4_conv_small.e2400_r224_in1k",
+ { device: "webgpu" },
+);
+
+// Classify an image from a URL
+const url = "https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg";
+const output = await classifier(url);
+console.log(output);
+// [
+// { label: 'tiger, Panthera tigris', score: 0.6149784922599792 },
+// { label: 'tiger cat', score: 0.30281734466552734 },
+// { label: 'tabby, tabby cat', score: 0.0019135422771796584 },
+// { label: 'lynx, catamount', score: 0.0012161266058683395 },
+// { label: 'Egyptian cat', score: 0.0011465961579233408 }
+// ]
+```
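+
+The `device` option composes with the `dtype` option from the [quantization guide](./dtypes). A sketch (which dtypes are available depends on the ONNX files published for each model, so treat `"fp16"` here as an example):
+
+```js
+// Combine WebGPU acceleration with half-precision weights
+const classifier_fp16 = await pipeline(
+  "image-classification",
+  "onnx-community/mobilenetv4_conv_small.e2400_r224_in1k",
+  { device: "webgpu", dtype: "fp16" },
+);
+```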
+
+## Reporting bugs and providing feedback
+
+Due to the experimental nature of the WebGPU API, especially in non-Chromium browsers, you may
+experience issues when trying to run a model (even if it runs correctly in WASM). If you do, please
+[open an issue on GitHub](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml)!
+
From a8be5b983b2b58cf2fd7bcda266dcf8e9737ee9f Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 03:56:04 +0000
Subject: [PATCH 03/14] Add link to transformers.js example repo
---
docs/snippets/3_examples.snippet | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/snippets/3_examples.snippet b/docs/snippets/3_examples.snippet
index f8bf7ed1c..4138482f6 100644
--- a/docs/snippets/3_examples.snippet
+++ b/docs/snippets/3_examples.snippet
@@ -1,4 +1,4 @@
-Want to jump straight in? Get started with one of our sample applications/templates:
+Want to jump straight in? Get started with one of our sample applications/templates, which can be found [here](https://github.com/huggingface/transformers.js-examples).
| Name | Description | Links |
|-------------------|----------------------------------|-------------------------------|
From a1a668f4744ab70bf48d5efb514e5915db9e5030 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 03:59:41 +0000
Subject: [PATCH 04/14] Improve quick tour docs
---
docs/snippets/1_quick-tour.snippet | 33 +++++++++++++++++++++++++++---
1 file changed, 30 insertions(+), 3 deletions(-)
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index 2e906a0f1..e220046e9 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -26,9 +26,9 @@ out = pipe('I love transformers!')
import { pipeline } from '@huggingface/transformers';
// Allocate a pipeline for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis');
+const pipe = await pipeline('sentiment-analysis');
-let out = await pipe('I love transformers!');
+const out = await pipe('I love transformers!');
// [{'label': 'POSITIVE', 'score': 0.999817686}]
```
@@ -40,5 +40,32 @@ let out = await pipe('I love transformers!');
You can also use a different model by specifying the model id or path as the second argument to the `pipeline` function. For example:
```javascript
// Use a different model for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+const pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+```
+
+By default, when running in the browser, the model will be run on your CPU (via WASM). If you would like
+to run the model on your GPU (via WebGPU), you can do this by setting `device: 'webgpu'`, for example:
+```javascript
+// Run the model on WebGPU
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+ device: 'webgpu'
+});
+```
+
+For more information, check out the [WebGPU guide](./guides/webgpu).
+
+> [!NOTE]
+> The WebGPU API is still experimental in many browsers, so if you run into any issues, please file a bug report
+> [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+
+In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
+the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
+which allows you to select the appropriate data type for your model. While the available options may vary
+depending on the specific model, typical choices include `"fp32"` (default for WebGPU), `"fp16"`, `"q8"`
+(default for WASM), and `"q4"`. For more information, check out the [quantization guide](./guides/dtypes).
+```javascript
+// Run the model at 4-bit quantization
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+ dtype: 'q4',
+});
```
From 5d906582a91045106082de9b1f046fb83e23f4d1 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 03:59:52 +0000
Subject: [PATCH 05/14] Build README
---
README.md | 35 +++++++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)
diff --git a/README.md b/README.md
index c5b48e908..5b5823c11 100644
--- a/README.md
+++ b/README.md
@@ -66,9 +66,9 @@ out = pipe('I love transformers!')
import { pipeline } from '@huggingface/transformers';
// Allocate a pipeline for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis');
+const pipe = await pipeline('sentiment-analysis');
-let out = await pipe('I love transformers!');
+const out = await pipe('I love transformers!');
// [{'label': 'POSITIVE', 'score': 0.999817686}]
```
@@ -80,7 +80,34 @@ let out = await pipe('I love transformers!');
You can also use a different model by specifying the model id or path as the second argument to the `pipeline` function. For example:
```javascript
// Use a different model for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+const pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+```
+
+By default, when running in the browser, the model will be run on your CPU (via WASM). If you would like
+to run the model on your GPU (via WebGPU), you can do this by setting `device: 'webgpu'`, for example:
+```javascript
+// Run the model on WebGPU
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+ device: 'webgpu'
+});
+```
+
+For more information, check out the [WebGPU guide](./guides/webgpu).
+
+> [!NOTE]
+> The WebGPU API is still experimental in many browsers, so if you run into any issues, please file a bug report
+> [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+
+In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
+the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
+which allows you to select the appropriate data type for your model. While the available options may vary
+depending on the specific model, typical choices include `"fp32"` (default for WebGPU), `"fp16"`, `"q8"`
+(default for WASM), and `"q4"`. For more information, check out the [quantization guide](./guides/dtypes).
+```javascript
+// Run the model at 4-bit quantization
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+ dtype: 'q4',
+});
```
@@ -102,7 +129,7 @@ Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN
## Examples
-Want to jump straight in? Get started with one of our sample applications/templates:
+Want to jump straight in? Get started with one of our sample applications/templates, which can be found [here](https://github.com/huggingface/transformers.js-examples).
| Name | Description | Links |
|-------------------|----------------------------------|-------------------------------|
From f723dfd55389a19743f84e5144d8a2f438135f88 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:07:10 +0000
Subject: [PATCH 06/14] Update quick tour
---
README.md | 4 ++--
docs/snippets/1_quick-tour.snippet | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/README.md b/README.md
index 5b5823c11..94a388884 100644
--- a/README.md
+++ b/README.md
@@ -95,8 +95,8 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
For more information, check out the [WebGPU guide](./guides/webgpu).
> [!NOTE]
-> The WebGPU API is still experimental in many browsers, so if you run into any issues, please file a bug report
-> [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+> The WebGPU API is still experimental in many browsers, so if you run into any issues,
+> please file a bug report [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index e220046e9..912dc1c5b 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -55,8 +55,8 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
For more information, check out the [WebGPU guide](./guides/webgpu).
> [!NOTE]
-> The WebGPU API is still experimental in many browsers, so if you run into any issues, please file a bug report
-> [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+> The WebGPU API is still experimental in many browsers, so if you run into any issues,
+> please file a bug report [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
From 46550903e80e5de68e945407dd407d64d167152e Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:08:16 +0000
Subject: [PATCH 07/14] `let` -> `const`
---
docs/source/pipelines.md | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/docs/source/pipelines.md b/docs/source/pipelines.md
index 0c1b3d584..3e1ad6b15 100644
--- a/docs/source/pipelines.md
+++ b/docs/source/pipelines.md
@@ -16,7 +16,7 @@ Start by creating an instance of `pipeline()` and specifying a task you want to
```javascript
import { pipeline } from '@huggingface/transformers';
-let classifier = await pipeline('sentiment-analysis');
+const classifier = await pipeline('sentiment-analysis');
```
When running for the first time, the `pipeline` will download and cache the default pretrained model associated with the task. This can take a while, but subsequent calls will be much faster.
@@ -30,14 +30,14 @@ By default, models will be downloaded from the [Hugging Face Hub](https://huggin
You can now use the classifier on your target text by calling it as a function:
```javascript
-let result = await classifier('I love transformers!');
+const result = await classifier('I love transformers!');
// [{'label': 'POSITIVE', 'score': 0.9998}]
```
If you have multiple inputs, you can pass them as an array:
```javascript
-let result = await classifier(['I love transformers!', 'I hate transformers!']);
+const result = await classifier(['I love transformers!', 'I hate transformers!']);
// [{'label': 'POSITIVE', 'score': 0.9998}, {'label': 'NEGATIVE', 'score': 0.9982}]
```
@@ -46,9 +46,9 @@ You can also specify a different model to use for the pipeline by passing it as
```javascript
-let reviewer = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+const reviewer = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
-let result = await reviewer('The Shawshank Redemption is a true masterpiece of cinema.');
+const result = await reviewer('The Shawshank Redemption is a true masterpiece of cinema.');
// [{label: '5 stars', score: 0.8167929649353027}]
```
@@ -59,10 +59,10 @@ The `pipeline()` function is a great way to quickly use a pretrained model for i
```javascript
// Allocate a pipeline for Automatic Speech Recognition
-let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-small.en');
+const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-small.en');
// Transcribe an audio file, loaded from a URL.
-let result = await transcriber('https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac');
+const result = await transcriber('https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac');
// {text: ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
```
@@ -86,7 +86,7 @@ You can also specify which revision of the model to use, by passing a `revision`
Since the Hugging Face Hub uses a git-based versioning system, you can use any valid git revision specifier (e.g., branch name or commit hash)
```javascript
-let transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en', {
+const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-tiny.en', {
revision: 'output_attentions',
});
```
@@ -100,17 +100,17 @@ Many pipelines have additional options that you can specify. For example, when u
```javascript
// Allocate a pipeline for translation
-let translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
+const translator = await pipeline('translation', 'Xenova/nllb-200-distilled-600M');
// Translate from English to Greek
-let result = await translator('I like to walk my dog.', {
+const result = await translator('I like to walk my dog.', {
src_lang: 'eng_Latn',
tgt_lang: 'ell_Grek'
});
// [ { translation_text: 'Μου αρέσει να περπατάω το σκυλί μου.' } ]
// Translate back to English
-let result2 = await translator(result[0].translation_text, {
+const result2 = await translator(result[0].translation_text, {
src_lang: 'ell_Grek',
tgt_lang: 'eng_Latn'
});
@@ -125,8 +125,8 @@ For example, to generate a poem using `LaMini-Flan-T5-783M`, you can do:
```javascript
// Allocate a pipeline for text2text-generation
-let poet = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-783M');
-let result = await poet('Write me a love poem about cheese.', {
+const poet = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-783M');
+const result = await poet('Write me a love poem about cheese.', {
max_new_tokens: 200,
temperature: 0.9,
repetition_penalty: 2.0,
From 80af44027d0d6b914361dea5f1f9f41b9a45e4eb Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:11:24 +0000
Subject: [PATCH 08/14] Update quick tour again
---
README.md | 2 +-
docs/snippets/1_quick-tour.snippet | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index 94a388884..b621b5aa3 100644
--- a/README.md
+++ b/README.md
@@ -96,7 +96,7 @@ For more information, check out the [WebGPU guide](./guides/webgpu).
> [!NOTE]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
-> please file a bug report [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index 912dc1c5b..eebb12a1c 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -56,7 +56,7 @@ For more information, check out the [WebGPU guide](./guides/webgpu).
> [!NOTE]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
-> please file a bug report [here](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
From 593fcb45eae72513f7fa0e79dbd55e37d256b449 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:13:16 +0000
Subject: [PATCH 09/14] Fix link
---
README.md | 2 +-
docs/snippets/1_quick-tour.snippet | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index b621b5aa3..b03b6d17a 100644
--- a/README.md
+++ b/README.md
@@ -96,7 +96,7 @@ For more information, check out the [WebGPU guide](./guides/webgpu).
> [!NOTE]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
-> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index eebb12a1c..c182dad33 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -56,7 +56,7 @@ For more information, check out the [WebGPU guide](./guides/webgpu).
> [!NOTE]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
-> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=[WebGPU]%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
From be93b0feba23b93da3636465cfa862966ea77f5d Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:15:34 +0000
Subject: [PATCH 10/14] NOTE -> WARNING
---
README.md | 2 +-
docs/snippets/1_quick-tour.snippet | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index b03b6d17a..8630353b3 100644
--- a/README.md
+++ b/README.md
@@ -94,7 +94,7 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
For more information, check out the [WebGPU guide](./guides/webgpu).
-> [!NOTE]
+> [!WARNING]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index c182dad33..24a883cc6 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -54,7 +54,7 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
For more information, check out the [WebGPU guide](./guides/webgpu).
-> [!NOTE]
+> [!WARNING]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
From 31719cc58eb1675126a97336ba09c3be09a1a5c7 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:16:31 +0000
Subject: [PATCH 11/14] Add trailing comma
---
README.md | 2 +-
docs/snippets/1_quick-tour.snippet | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index 8630353b3..dc343c77e 100644
--- a/README.md
+++ b/README.md
@@ -88,7 +88,7 @@ to run the model on your GPU (via WebGPU), you can do this by setting `device: '
```javascript
// Run the model on WebGPU
const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
- device: 'webgpu'
+ device: 'webgpu',
});
```
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index 24a883cc6..7ccd724d4 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -48,7 +48,7 @@ to run the model on your GPU (via WebGPU), you can do this by setting `device: '
```javascript
// Run the model on WebGPU
const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
- device: 'webgpu'
+ device: 'webgpu',
});
```
From e1de86a7bb8eb1fc43ff4f6c837486d24ae53b79 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:21:26 +0000
Subject: [PATCH 12/14] Remove trailing space
---
README.md | 2 +-
docs/snippets/1_quick-tour.snippet | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index dc343c77e..41c918470 100644
--- a/README.md
+++ b/README.md
@@ -94,7 +94,7 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
For more information, check out the [WebGPU guide](./guides/webgpu).
-> [!WARNING]
+> [!WARNING]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index 7ccd724d4..27fdd2214 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -54,7 +54,7 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
For more information, check out the [WebGPU guide](./guides/webgpu).
-> [!WARNING]
+> [!WARNING]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
From 0432df6c2deca342ee4e58d7ee08a79d35f8bb55 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 04:25:09 +0000
Subject: [PATCH 13/14] Swap installation and quick tour sections
---
README.md | 32 ++++++++++++++++----------------
docs/scripts/build_readme.py | 8 ++++----
2 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/README.md b/README.md
index 41c918470..c2c0cb6a8 100644
--- a/README.md
+++ b/README.md
@@ -36,6 +36,22 @@ Transformers.js uses [ONNX Runtime](https://onnxruntime.ai/) to run models in th
For more information, check out the full [documentation](https://huggingface.co/docs/transformers.js).
+## Installation
+
+
+To install via [NPM](https://www.npmjs.com/package/@huggingface/transformers), run:
+```bash
+npm i @huggingface/transformers
+```
+
+Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN or static hosting. For example, using [ES Modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules), you can import the library with:
+```html
+<script type="module">
+    import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
+</script>
+```
+
+
## Quick tour
@@ -111,22 +127,6 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
```
-## Installation
-
-
-To install via [NPM](https://www.npmjs.com/package/@huggingface/transformers), run:
-```bash
-npm i @huggingface/transformers
-```
-
-Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN or static hosting. For example, using [ES Modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules), you can import the library with:
-```html
-<script type="module">
-    import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers';
-</script>
-```
-
-
## Examples
Want to jump straight in? Get started with one of our sample applications/templates, which can be found [here](https://github.com/huggingface/transformers.js-examples).
diff --git a/docs/scripts/build_readme.py b/docs/scripts/build_readme.py
index 49a2e6400..84bb30cf0 100644
--- a/docs/scripts/build_readme.py
+++ b/docs/scripts/build_readme.py
@@ -22,14 +22,14 @@
{intro}
-## Quick tour
-
-{quick_tour}
-
## Installation
{installation}
+## Quick tour
+
+{quick_tour}
+
## Examples
{examples}
From 96b30ae25ecf9b849522568496c305a8f62f8c66 Mon Sep 17 00:00:00 2001
From: Joshua Lochner
Date: Tue, 22 Oct 2024 05:26:26 +0000
Subject: [PATCH 14/14] Fix guide URLs in README
---
README.md | 4 ++--
docs/snippets/1_quick-tour.snippet | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/README.md b/README.md
index c2c0cb6a8..6f7ed3f70 100644
--- a/README.md
+++ b/README.md
@@ -108,7 +108,7 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
});
```
-For more information, check out the [WebGPU guide](./guides/webgpu).
+For more information, check out the [WebGPU guide](https://huggingface.co/docs/transformers.js/guides/webgpu).
> [!WARNING]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
@@ -118,7 +118,7 @@ In resource-constrained environments, such as web browsers, it is advisable to u
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
which allows you to select the appropriate data type for your model. While the available options may vary
depending on the specific model, typical choices include `"fp32"` (default for WebGPU), `"fp16"`, `"q8"`
-(default for WASM), and `"q4"`. For more information, check out the [quantization guide](./guides/dtypes).
+(default for WASM), and `"q4"`. For more information, check out the [quantization guide](https://huggingface.co/docs/transformers.js/guides/dtypes).
```javascript
// Run the model at 4-bit quantization
const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
diff --git a/docs/snippets/1_quick-tour.snippet b/docs/snippets/1_quick-tour.snippet
index 27fdd2214..ddf4d5744 100644
--- a/docs/snippets/1_quick-tour.snippet
+++ b/docs/snippets/1_quick-tour.snippet
@@ -52,7 +52,7 @@ const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncase
});
```
-For more information, check out the [WebGPU guide](./guides/webgpu).
+For more information, check out the [WebGPU guide](/guides/webgpu).
> [!WARNING]
> The WebGPU API is still experimental in many browsers, so if you run into any issues,
@@ -62,7 +62,7 @@ In resource-constrained environments, such as web browsers, it is advisable to u
the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
which allows you to select the appropriate data type for your model. While the available options may vary
depending on the specific model, typical choices include `"fp32"` (default for WebGPU), `"fp16"`, `"q8"`
-(default for WASM), and `"q4"`. For more information, check out the [quantization guide](./guides/dtypes).
+(default for WASM), and `"q4"`. For more information, check out the [quantization guide](/guides/dtypes).
```javascript
// Run the model at 4-bit quantization
const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {