
[Bug] type object 'TasksManager' has no attribute '_TASKS_TO_AUTOMODELS' #297

Closed
chrisfel-dev opened this issue Sep 11, 2023 · 10 comments
Labels: bug (Something isn't working)

@chrisfel-dev

Describe the bug
I am trying to run python -m scripts.convert --quantize --model_id bert-base-uncased and I am getting the following error

How to reproduce
Steps or a minimal working example to reproduce the behavior

  1. Downloaded latest transformers.js code
  2. Ran python -m scripts.convert --quantize --model_id bert-base-uncased
  3. Got error
(base) chris@Chriss-MacBook-Pro transformers.js-2.6.0 % python -m scripts.convert --quantize --model_id bert-base-uncased
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/chris/Downloads/transformers.js-2.6.0/scripts/convert.py", line 89, in <module>
    class ConversionArguments:
  File "/Users/chris/Downloads/transformers.js-2.6.0/scripts/convert.py", line 117, in ConversionArguments
    f" {str(list(TasksManager._TASKS_TO_AUTOMODELS.keys()))}. For decoder models, use `xxx-with-past` to export the model using past key values in the decoder."
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'TasksManager' has no attribute '_TASKS_TO_AUTOMODELS'

Expected behavior
I should get an ONNX model I can use with "@xenova/transformers"

Environment

  • Transformers.js version: 2.6.0
  • Browser (if applicable): n/a; this will run in Node.js
  • Operating system (if applicable): macOS
  • Other: M2 Max Mac

Additional context
I have tried other models and get the same error, e.g. meta-llama/Llama-2-7b-chat-hf.

@chrisfel-dev added the bug label on Sep 11, 2023
@xenova (Collaborator) commented Sep 11, 2023

Hi there 👋 this was fixed yesterday (and was caused by a new version of optimum). If you use the current conversion script (located in the scripts folder), it should work 😁

@chrisfel-dev (Author) commented Sep 12, 2023

OK. Looks like it is running now :) The other issue I see is that decoder_model_merged_quantized.onnx is not generated. Is there some special flag I need for this to be generated? Thanks!

error: `local_files_only=true` or `env.allowRemoteModels=false` and file was not found locally at "/Users/chris/Documents/personal-github/transformers.js/models/meta-llama/Llama-2-7b-chat-hf/onnx/decoder_model_merged_quantized.onnx".

zsh: killed python -m scripts.convert --quantize --model_id meta-llama/Llama-2-7b-chat-hf
/Users/chris/Documents/conda/anaconda3/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

@xenova (Collaborator) commented Sep 12, 2023

zsh: killed python -m scripts.convert --quantize --model_id meta-llama/Llama-2-7b-chat-hf
/Users/chris/Documents/conda/anaconda3/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

It looks like the process ran out of memory. Could you try adding --skip_validation to the command? Also, @fxmarty may have some insights here, as he has been converting Llama to ONNX recently.

Just note that at the moment, we do not support models larger than 2GB (due to the protobuf limit combined with onnxruntime-node not supporting the external file format). This will most likely be fixed in the next version of onnxruntime-node (>1.16), thanks to contributions from @dakenf! 🤗
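
For reference, the full invocation with that flag would look something like the following (an assumption about how the flag combines with the command already shown above):

python -m scripts.convert --quantize --skip_validation --model_id meta-llama/Llama-2-7b-chat-hf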

@chrisfel-dev (Author) commented

Thanks, --skip_validation allowed it to complete! On to the next error, though. I have some simple code to serve the model over an Express server. Nothing much there, just this:

import express from "express";
import { env, pipeline } from "@xenova/transformers";

// Point the library at the locally converted models and disable remote fetching.
env.localModelPath = "/Users/chris/Documents/personal-github/transformers.js/models";
env.allowRemoteModels = false;

const app = express();
const port = 8080;

app.get("/", async (req, res) => {
  let gen = await pipeline('text-generation', 'meta-llama/Llama-2-7b-chat-hf');
  const text = await gen("hello there");
  console.log(text);
  res.send(text);
});

app.listen(port, () => {
  console.log(`Listening on port ${port}...`);
});

I am getting the following error:
565 | console.log(contentLength)
566 | if (contentLength === null) {
567 | console.warn('Unable to determine content-length from response headers. Will expand buffer when needed.')
568 | }
569 | let total = parseInt(contentLength ?? '0');
570 | let buffer = new Uint8Array(total);
^
RangeError: length too large
at /Users/chris/Documents/personal-github/gpt-nx/node_modules/@xenova/transformers/src/utils/hub.js:570:17
at readResponse (/Users/chris/Documents/personal-github/gpt-nx/node_modules/@xenova/transformers/src/utils/hub.js:561:28)
at /Users/chris/Documents/personal-github/gpt-nx/node_modules/@xenova/transformers/src/utils/hub.js:493:25
at processTicksAndRejections (:1:2602)
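
The excerpt above shows why this fails: hub.js allocates a single typed array sized to the response's Content-Length, and a multi-gigabyte Llama export exceeds the maximum typed-array length the JS engine will allocate. A minimal illustration of that failure mode, with an assumed file size rather than the library's actual code:

// Illustration only: sizing a Uint8Array to a multi-gigabyte ONNX file exceeds
// the engine's maximum typed-array length, so the allocation throws up front.
const approxFileBytes = 25 * 1024 ** 3; // assumed rough size of an fp32 Llama-2-7b export
try {
  new Uint8Array(approxFileBytes);
} catch (err) {
  console.error(err.name, err.message); // RangeError (exact message is engine-dependent)
}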

@dakenf commented Sep 13, 2023

It is possible to pass a filename to ONNX Runtime in Node, and it will load anything that fits into RAM/VRAM, but you'll need to tamper with hub.js in node_modules; see #123.
I believe this will be resolved with the next ONNX Runtime release, once we settle on a way to pass the weights file for the web version.

Also, you can run it on GPU with the latest changes: microsoft/onnxruntime#16050
There should be a nightly build somewhere, so you won't need to build it yourself.
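
A minimal sketch of the "pass a filename to ONNX Runtime in Node" approach, assuming onnxruntime-node is installed and the converted model already exists on disk; the path is illustrative and this is not the transformers.js code path:

// Load a local ONNX file by path with onnxruntime-node; the native runtime reads
// the file itself, so no JS-side buffer the size of the model is needed.
import * as ort from "onnxruntime-node";

// Hypothetical path to a locally converted model file.
const modelPath = "./models/meta-llama/Llama-2-7b-chat-hf/onnx/decoder_model_merged.onnx";

const session = await ort.InferenceSession.create(modelPath);
console.log("inputs:", session.inputNames, "outputs:", session.outputNames);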

@chrisfel-dev (Author) commented

OK. I am now getting a segmentation fault and I'm not sure why. I guess the original issue I opened this ticket for is resolved; I'm not sure where to go from here on this latest one.

app.get("/", async (req, res) => {
  let gen = await pipeline('text-generation', 'meta-llama/Llama-2-7b-chat-hf', {
    quantized: false,
    local_files_only: true
  });
  const text = await gen("hello there");
  console.log(text);
  res.send(text);
});

@chrisfel-dev (Author) commented

Will open another issue for the latest troubles. Thanks.

@xenova (Collaborator) commented Sep 13, 2023

Please see the above comment: #297 (comment)

Transformers.js does not yet support models larger than 2GB, because the underlying onnxruntime 1.14.0 lacks support for them.
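
A rough way to check whether a converted model will hit that limit, as a hypothetical sketch using Node's fs module; the path is the one from the error message earlier in this thread:

// Hypothetical sanity check: warn if an exported ONNX file is above the ~2 GB
// limit that the onnxruntime 1.14.0 used by transformers.js cannot load yet.
import { statSync } from "node:fs";

const modelPath =
  "/Users/chris/Documents/personal-github/transformers.js/models/meta-llama/Llama-2-7b-chat-hf/onnx/decoder_model_merged_quantized.onnx";

const sizeGB = statSync(modelPath).size / 1024 ** 3;
if (sizeGB > 2) {
  console.warn(`Model is ${sizeGB.toFixed(1)} GB, above the ~2 GB limit mentioned above.`);
}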

@dakenf commented Sep 13, 2023

Transformers.js does not yet support models larger than 2GB

I'm now working on bypassing the 4GB limit in the WASM FS, which is required for SDXL (and other models like LLaMA), so soon JS/TS will rule inference as the most accessible way for developers. So you can expect many more stars and issues, hehe.

@chrisfel-dev (Author) commented

Thanks @dakenf that is great news!
