
Janus-1.3B-ONNX - Can't create a session. Failed to allocate a buffer #1056

Open
finalstack opened this issue on Nov 27, 2024 · 6 comments
Labels
bug Something isn't working

@finalstack

System Info

Transformers.js version: "@huggingface/transformers": "^3.1.0"

Environment/Platform

  • Website/web-app
  • Browser extension
  • Server-side (e.g., Node.js, Deno, Bun)
  • Desktop app (e.g., Electron)
  • Other (e.g., VSCode extension)

Description

I was trying the code provided in the onnx-community/Janus-1.3B-ONNX repository, but I encountered the following error:

ort.webgpu.bundle.min.mjs:2603 Uncaught Error: Can't create a session. Failed to allocate a buffer of size 2079238052.
    at jt (ort.webgpu.bundle.min.mjs:2603:25061)
    at Pr (ort.webgpu.bundle.min.mjs:2603:25240)
    at Kl (ort.webgpu.bundle.min.mjs:2603:34605)
    at mn.loadModel (ort.webgpu.bundle.min.mjs:2603:36389)
    at fn.createInferenceSessionHandler (ort.webgpu.bundle.min.mjs:2603:38145)
    at e.create (ort.webgpu.bundle.min.mjs:6:19471)
    at async createInferenceSession (onnx.js:163:1)
    at async models.js:301:1
    at async Promise.all (:5173/index 0)
    at async constructSessions (models.js:298:1)

I believe 2079238052 bytes (approximately 1.94 GiB) is under the 2 GiB limit, so the allocation shouldn't fail on that account. Additionally, I noticed that preprocessor_config.json is downloaded or loaded twice.

Reproduction

import { AutoProcessor, MultiModalityCausalLM } from "@huggingface/transformers";

const model_id = "onnx-community/Janus-1.3B-ONNX";
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await MultiModalityCausalLM.from_pretrained(model_id);
finalstack added the bug label on Nov 27, 2024

@xenova (Collaborator) commented on Nov 28, 2024

Can you try using WebGPU? Also, I recommend the following dtypes, depending on whether fp16 is supported or not.

const model_id = "onnx-community/Janus-1.3B-ONNX";
const fp16_supported = true; // do feature check (see the sketch below)
const model = await MultiModalityCausalLM.from_pretrained(model_id, {
  dtype: fp16_supported
    ? {
        prepare_inputs_embeds: "q4",
        language_model: "q4f16",
        lm_head: "fp16",
        gen_head: "fp16",
        gen_img_embeds: "fp16",
        image_decode: "fp32",
      }
    : {
        prepare_inputs_embeds: "fp32",
        language_model: "q4",
        lm_head: "fp32",
        gen_head: "fp32",
        gen_img_embeds: "fp32",
        image_decode: "fp32",
      },
  device: {
    prepare_inputs_embeds: "wasm", // TODO: use "webgpu" when bug is fixed
    language_model: "webgpu",
    lm_head: "webgpu",
    gen_head: "webgpu",
    gen_img_embeds: "webgpu",
    image_decode: "webgpu",
  },
});
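
For the fp16 feature check mentioned in the snippet, one option is to ask the WebGPU adapter whether it supports the shader-f16 feature. This is a minimal sketch, not from the original thread; the helper name is mine:

// Sketch of an fp16 feature check (assumes a WebGPU-capable browser).
async function checkFp16Supported() {
  try {
    // requestAdapter() resolves to null when WebGPU is unavailable
    const adapter = await navigator.gpu.requestAdapter();
    return adapter !== null && adapter.features.has("shader-f16");
  } catch {
    return false; // navigator.gpu is undefined outside WebGPU contexts
  }
}

const fp16_supported = await checkFp16Supported();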

@finalstack (Author)

It's working now, thanks, @xenova! Much appreciated.
I will wait for the bug to be fixed.

@finalstack (Author)

Is it possible to pass the following parameters to the generate_images function: width, height, cfg_weight, and parallel_size?

@xenova (Collaborator) commented on Nov 28, 2024

Do you have an example of how to do this in the Python library? My understanding is that the model generates exactly 576 image tokens (a 24×24 grid, decoded at 16×16 pixels per token), so unless the image decoder can produce higher-resolution output, it can currently only generate 384x384 images.

@finalstack (Author)

My bad, you're right; after checking their docs, it's 384x384. But what about guidance and parallel_size?

@xenova (Collaborator) commented on Nov 28, 2024

You should be able to pass, e.g., `guidance_scale: 4` to the `generate_images` function. Batched generation technically works, but needs a bit more experimentation.
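
For reference, a minimal sketch of such a call, loosely following the image-generation example in the onnx-community/Janus-1.3B-ONNX model card; the prompt, the option names other than `guidance_scale`, and the output handling are assumptions from my reading of that card:

// Build a text-to-image conversation and run generation
// (reuses the processor and model loaded earlier in this thread).
const conversation = [
  { role: "User", content: "A cute baby fox in a snowy forest" },
];
const inputs = await processor(conversation, { chat_template: "text_to_image" });

const num_image_tokens = processor.num_image_tokens; // 576 for this model
const outputs = await model.generate_images({
  ...inputs,
  min_new_tokens: num_image_tokens,
  max_new_tokens: num_image_tokens,
  guidance_scale: 4, // the parameter suggested above (assumed placement)
  do_sample: true,
});

await outputs[0].save("fox.png"); // assumes RawImage outputs; save() works in Node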
