Text-to-image with StableDiffusion and GPU acceleration in Node.js #121

Closed · wants to merge 5 commits
1 change: 1 addition & 0 deletions .gitignore
@@ -2,6 +2,7 @@ __pycache__
.vscode
node_modules
.cache
.idea

# Do not track build artifacts/generated files
/dist
16 changes: 13 additions & 3 deletions README.md
@@ -32,13 +32,12 @@ Transformers.js is designed to be functionally equivalent to Hugging Face's [tra
- 📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
- 🖼️ **Computer Vision**: image classification, object detection, and segmentation.
- 🗣️ **Audio**: automatic speech recognition and audio classification.
- 🐙 **Multimodal**: zero-shot image classification.
- 🐙 **Multimodal**: zero-shot image classification and text-to-image generation.

Transformers.js uses [ONNX Runtime](https://onnxruntime.ai/) to run models in the browser. The best part about it is that you can easily [convert](#convert-your-models-to-onnx) your pretrained PyTorch, TensorFlow, or JAX models to ONNX using [🤗 Optimum](https://github.com/huggingface/optimum#onnx--onnx-runtime).

For more information, check out the full [documentation](https://huggingface.co/docs/transformers.js).


## Quick tour


@@ -102,6 +101,17 @@ Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN
</script>
```

## GPU acceleration in Node.js

**Windows and macOS**: works out of the box.

**Linux**:
1. Install [CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/).
2. Install [cuDNN](https://developer.nvidia.com/rdp/cudnn-archive).
3. Install the `onnxruntime-linux-x64-gpu-1.14.1` release of [ONNX Runtime](https://github.com/microsoft/onnxruntime/releases/tag/v1.14.1).

Once these are in place, pipelines pick up the platform's GPU execution provider automatically; a usage sketch follows.
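
A minimal sketch of GPU-accelerated text-to-image generation (the model id below is an illustrative placeholder, not part of this diff):

```javascript
import { pipeline } from '@xenova/transformers';

// The platform-specific execution provider (CoreML, CUDA, or DirectML)
// is selected automatically in src/backends/onnx.js.
const generator = await pipeline('text-to-image', 'hypothetical-user/stable-diffusion-onnx');
const image = await generator('An astronaut riding a horse on the moon');
```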


## Examples

@@ -230,7 +240,7 @@ to open up a feature request [here](https://github.com/xenova/transformers.js/is
| [Document Question Answering](https://huggingface.co/tasks/document-question-answering) | `document-question-answering` | Answering questions on document images. | ❌ |
| [Feature Extraction](https://huggingface.co/tasks/feature-extraction) | `feature-extraction` | Transforming raw data into numerical features that can be processed while preserving the information in the original dataset. | ✅ |
| [Image-to-Text](https://huggingface.co/tasks/image-to-text) | `image-to-text` | Output text from a given image. | ✅ |
| [Text-to-Image](https://huggingface.co/tasks/text-to-image) | `text-to-image` | Generates images from input text. | ❌ |
| [Text-to-Image](https://huggingface.co/tasks/text-to-image) | `text-to-image` | Generates images from input text. | ✅ |
| [Visual Question Answering](https://huggingface.co/tasks/visual-question-answering) | `visual-question-answering` | Answering open-ended questions based on an image. | ❌ |
| [Zero-Shot Image Classification](https://huggingface.co/tasks/zero-shot-image-classification) | `zero-shot-image-classification` | Classifying images into classes that are unseen during training. | ✅ |

2 changes: 1 addition & 1 deletion docs/snippets/0_introduction.snippet
@@ -5,7 +5,7 @@ Transformers.js is designed to be functionally equivalent to Hugging Face's [tra
- 📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
- 🖼️ **Computer Vision**: image classification, object detection, and segmentation.
- 🗣️ **Audio**: automatic speech recognition and audio classification.
- 🐙 **Multimodal**: zero-shot image classification.
- 🐙 **Multimodal**: zero-shot image classification and text-to-image generation.

Transformers.js uses [ONNX Runtime](https://onnxruntime.ai/) to run models in the browser. The best part about it is that you can easily [convert](/custom_usage#convert-your-models-to-onnx) your pretrained PyTorch, TensorFlow, or JAX models to ONNX using [🤗 Optimum](https://github.com/huggingface/optimum#onnx--onnx-runtime).

11 changes: 11 additions & 0 deletions docs/snippets/2_installation.snippet
@@ -10,3 +10,14 @@ Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers';
</script>
```

## GPU acceleration in Node.js

**Windows and macOS**: works out of the box.

**Linux**:
1. Install [CUDA](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/).
2. Install [cuDNN](https://developer.nvidia.com/rdp/cudnn-archive).
3. Install the `onnxruntime-linux-x64-gpu-1.14.1` release of [ONNX Runtime](https://github.com/microsoft/onnxruntime/releases/tag/v1.14.1).

ONNX Runtime tries execution providers in order and falls back to `cpu` when a GPU provider cannot be initialized; the sketch below shows a stand-alone check.
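
To verify the GPU setup independently of Transformers.js, one option is to create a bare ONNX Runtime session that requests the CUDA provider explicitly. A minimal sketch (it assumes `onnxruntime-node-gpu` is installed and `./model.onnx` is any valid ONNX model; both names are illustrative):

```javascript
import * as ort from 'onnxruntime-node-gpu';

// Providers are tried in order, so this falls back to CPU if the
// CUDA provider cannot be initialized on this machine.
const session = await ort.InferenceSession.create('./model.onnx', {
    executionProviders: ['cuda', 'cpu'],
});
console.log('inputs:', session.inputNames, 'outputs:', session.outputNames);
```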
20 changes: 8 additions & 12 deletions package-lock.json

Some generated files are not rendered by default.

6 changes: 3 additions & 3 deletions package.json
@@ -42,7 +42,7 @@
"sharp": "^0.32.0"
},
"optionalDependencies": {
"onnxruntime-node": "^1.14.0"
"onnxruntime-node-gpu": "^1.14.0"
},
"devDependencies": {
"@types/jest": "^29.5.1",
@@ -68,12 +68,12 @@
"path": false,
"url": false,
"sharp": false,
"onnxruntime-node": false,
"onnxruntime-node-gpu": false,
"stream/web": false
},
"publishConfig": {
"access": "public"
},
"jsdelivr": "./dist/transformers.min.js",
"unpkg": "./dist/transformers.min.js"
}
}
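
Because `onnxruntime-node-gpu` lives in `optionalDependencies`, an install can still succeed when the native GPU binding fails to build. A guarded import is one way a consumer could degrade gracefully (a sketch only: it assumes both packages expose the same API surface, and the CPU-only fallback is not installed by this diff):

```javascript
// Prefer the GPU build; fall back to the CPU-only package if the
// optional dependency failed to install. (Assumes matching APIs.)
let ONNX;
try {
    ONNX = await import('onnxruntime-node-gpu');
} catch {
    ONNX = await import('onnxruntime-node');
}
```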
24 changes: 18 additions & 6 deletions src/backends/onnx.js
@@ -6,23 +6,23 @@
* So, we just import both packages, and use the appropriate one based on the environment:
* - When running in node, we use `onnxruntime-node`.
* - When running in the browser, we use `onnxruntime-web` (`onnxruntime-node` is not bundled).
*
*
* This module is not directly exported, but can be accessed through the environment variables:
* ```javascript
* import { env } from '@xenova/transformers';
* console.log(env.backends.onnx);
* ```
*
*
* @module backends/onnx
*/

// NOTE: Import order matters here. We need to import `onnxruntime-node-gpu` before `onnxruntime-web`.
import * as ONNX_NODE from 'onnxruntime-node';
import * as ONNX_NODE from 'onnxruntime-node-gpu';
import * as ONNX_WEB from 'onnxruntime-web';

export let ONNX;

export const executionProviders = [
export let executionProviders = [
// 'webgpu',
'wasm'
];
@@ -31,8 +31,20 @@ if (typeof process !== 'undefined' && process?.release?.name === 'node') {
// Running in a node-like environment.
ONNX = ONNX_NODE;

// Add `cpu` execution provider, with higher precedence that `wasm`.
executionProviders.unshift('cpu');
// Add `cpu` plus a platform-specific GPU execution provider, in order of preference.
switch (process.platform) {
    case 'darwin':
        executionProviders = ['coreml', 'cpu'];
        break;
    case 'linux':
        executionProviders = ['cuda', 'cpu'];
        break;
    case 'win32':
        executionProviders = ['directml', 'cpu'];
        break;
    default:
        executionProviders = ['cpu'];
}

} else {
// Running in a browser environment