Janus-1.3B-ONNX - Can't create a session. Failed to allocate a buffer #1056
Comments
Can you try using WebGPU? Also, I recommend the following dtypes, depending on whether fp16 is supported or not:

```js
import { MultiModalityCausalLM } from "@huggingface/transformers";

const model_id = "onnx-community/Janus-1.3B-ONNX";
const fp16_supported = true; // do a feature check here (see sketch below)
const model = await MultiModalityCausalLM.from_pretrained(model_id, {
  dtype: fp16_supported
    ? {
        prepare_inputs_embeds: "q4",
        language_model: "q4f16",
        lm_head: "fp16",
        gen_head: "fp16",
        gen_img_embeds: "fp16",
        image_decode: "fp32",
      }
    : {
        prepare_inputs_embeds: "fp32",
        language_model: "q4",
        lm_head: "fp32",
        gen_head: "fp32",
        gen_img_embeds: "fp32",
        image_decode: "fp32",
      },
  device: {
    prepare_inputs_embeds: "wasm", // TODO: use "webgpu" when bug is fixed
    language_model: "webgpu",
    lm_head: "webgpu",
    gen_head: "webgpu",
    gen_img_embeds: "webgpu",
    image_decode: "webgpu",
  },
});
```
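For the fp16 feature check, something like the following works in browsers that expose the WebGPU API (a generic sketch, not from this thread; `"shader-f16"` is the standard WebGPU feature name, and falling back to `false` when no adapter is available is an assumption):

```js
// Detect whether the WebGPU adapter supports fp16 shaders.
async function isFp16Supported() {
  if (!navigator.gpu) return false; // WebGPU not available at all
  const adapter = await navigator.gpu.requestAdapter();
  return adapter?.features.has("shader-f16") ?? false;
}

const fp16_supported = await isFp16Supported();
```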
It's working now, thanks, @xenova! Much appreciated.
Is it possible to pass parameters such as the output image size to the image generation call?
Do you have an example of how to do this in the Python library? My understanding is that the model generates exactly 576 image tokens, so unless the image decoder can produce higher-quality output, it can currently only generate 384x384 images.
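For reference, the 576-token count is consistent with a square grid of image tokens at a 16x downsampling factor; the factor itself is an assumption about the image tokenizer, not something stated in this thread:

$$
\frac{384}{16} = 24, \qquad 24 \times 24 = 576
$$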
My bad, you're right: it's 384x384 after checking their docs. But what about the guidance scale?
You should be able to pass, e.g., `guidance_scale: 4`.
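A minimal sketch of what that could look like, assuming the text-to-image usage from the model card (the prompt, the `processor`/`generate_images` call pattern, and the other option values are illustrative, not taken from this thread):

```js
import { AutoProcessor, MultiModalityCausalLM } from "@huggingface/transformers";

const model_id = "onnx-community/Janus-1.3B-ONNX";
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await MultiModalityCausalLM.from_pretrained(model_id);

// Prepare a text-to-image prompt.
const conversation = [{ role: "<|User|>", content: "A cute baby fox in a snowy forest" }];
const inputs = await processor(conversation, { chat_template: "text_to_image" });

// Pass generation options, including guidance_scale, alongside the prepared inputs.
const num_image_tokens = processor.num_image_tokens;
const outputs = await model.generate_images({
  ...inputs,
  min_new_tokens: num_image_tokens,
  max_new_tokens: num_image_tokens,
  guidance_scale: 4,
  do_sample: true,
});
```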
System Info
Transformers.js version: "@huggingface/transformers": "^3.1.0"
Environment/Platform
Description
I was trying the code provided in the onnx-community/Janus-1.3B-ONNX repository, but I encountered the error in the title ("Can't create a session. Failed to allocate a buffer"). I believe that 2079238052 bytes (approximately 1.94 GB) is less than 2 GB, so it shouldn't be causing this issue. Additionally, I noticed that the file `preprocessor_config.json` is being downloaded or loaded twice.

Reproduction
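A minimal sketch of the kind of default loading that can hit the allocation failure above (the exact reproduction snippet is assumed to follow the model card; only the model id and class name are taken from the comments in this thread):

```js
import { MultiModalityCausalLM } from "@huggingface/transformers";

// With no per-module dtype/device overrides, the ONNX sessions are created on the
// default backend (WASM in the browser), which is where the ~2 GB buffer
// allocation mentioned in the report can fail.
const model_id = "onnx-community/Janus-1.3B-ONNX";
const model = await MultiModalityCausalLM.from_pretrained(model_id);
```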