[Docs] Add recipe for GLM-Image on 2x A800 GPUs and 1x A800 GPU #2950
nainiu258 wants to merge 8 commits into
Conversation
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
I just merged #2320, you can test it locally again and paste test results.

Seems like nothing changed in my end-to-end case on 2x A800. Here's the output of `DiffusionPipelineProfiler`:

`GlmImagePipeline.text_encoder.forward took 0.008926s`

The result on 1x A800 is the same.
Why is there no AR-part time? GLM-Image first does understanding (AR) and then image generation. cc @JaredforReal
Check #2834.

Got it!
Stage 0 (AR) takes ~25 s and Stage 1 (diffusion) ~33 s; the timings on 2x A800 and 1x A800 are almost the same.
hsliuustc0106
left a comment
BLOCKING:
- Gate Check — DCO is failing. Please sign off your commits before proceeding.
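For reference, the standard DCO fix is to add a `Signed-off-by` trailer with git. Below is a minimal demo in a throwaway repository (the name/email match the sign-offs in this PR; nothing here is specific to vLLM-Omni):

```shell
# Demo of retroactively signing off a commit (standard DCO workflow).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name  "nainiu258"
git config user.email "cperfect02@163.com"

echo demo > file.txt
git add file.txt
git commit -q -m "Add recipe for GLM-Image"   # forgot -s / --signoff

# Fix the last commit in place by amending with a DCO trailer.
git commit --amend --signoff --no-edit -q

# For several commits, replay them instead:  git rebase --signoff HEAD~N
git log -1 --format=%B
```

After amending or rebasing, the rewritten history has to be force-pushed to the PR branch (e.g. `git push --force-with-lease`).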
@nainiu258 @hsliuustc0106 I use
Signed-off-by: nainiu258 <cperfect02@163.com>
force-pushed from 49de5ce to 47bfea4
Fixed.
Signed-off-by: nainiu258 <cperfect02@163.com>
hsliuustc0106
left a comment
please check #2977
> Overall summary from the run’s metrics. Rough wall-time split: **Stage 0 (AR)** ~**25 s**, **Stage 1 (diffusion)** ~**34 s** (see `e2e_stage_*_wall_time_ms` below).
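For readers reproducing this split, here is a small sketch of turning the two per-stage wall times into the percentages quoted above. The metric names come from the summary; the flat dict shape is a hypothetical illustration, not the actual profiler output format:

```python
# Hypothetical metrics payload keyed like e2e_stage_<n>_wall_time_ms (assumption).
metrics = {
    "e2e_stage_0_wall_time_ms": 25_000.0,  # AR / understanding stage
    "e2e_stage_1_wall_time_ms": 34_000.0,  # diffusion stage
}

total = sum(metrics.values())
for name, ms in sorted(metrics.items()):
    # Report each stage in seconds and as a share of end-to-end wall time.
    print(f"{name}: {ms / 1000:.1f} s ({ms / total:.0%})")
```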
> | Field | Value |
There is some problem with these metrics; cc @bjf-frz, could you fix it ASAP?
Signed-off-by: nainiu258 <cperfect02@163.com>
> - Upstream or canonical docs: [`docs/user_guide/examples/online_serving/glm_image.md`](../../docs/user_guide/examples/online_serving/glm_image.md)
> - Related example under `examples/`:
Can we replace this with `/docs/user_guide/examples/offline_inference/glm_image.md`?
```shell
vllm serve zai-org/GLM-Image \
  --omni \
  --port 8091 \
  --stage-configs-path vllm_omni/deploy/glm_image_single_gpu.yaml
```
We will deprecate `--stage-config-path` and switch to `--deploy-config` and `--stage-overrides`; please double-check, thanks.
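If the rename lands as described in this comment, the single-GPU serve command above would presumably become something like the following. This is only a sketch based on the flag names mentioned here: the exact spelling of the new flags and whether `--deploy-config` accepts the same YAML are unverified assumptions.

```shell
# Hypothetical invocation after the flag rename (unverified assumption):
vllm serve zai-org/GLM-Image \
  --omni \
  --port 8091 \
  --deploy-config vllm_omni/deploy/glm_image_single_gpu.yaml
```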
JaredforReal
left a comment
Seems like there are gaps after the config refactoring, PTAL.

Could you please tell me how we can run an example once the server is ready?
i2i curl example (note: `base64 -i` is the macOS form; on GNU coreutils use `base64 land.png`):

```shell
jq -n --rawfile img <(base64 -i land.png | tr -d '\n') '{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": ("data:image/png;base64," + $img)}},
        {"type": "text", "text": "make it cartoon style"}
      ],
      "modalities": ["image"]
    }
  ],
  "extra_body": {
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 50,
    "true_cfg_scale": 4.0,
    "seed": 42
  }
}' | curl -s http://172.18.67.228:8091/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @- | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > land-cartoon.png
```

t2i curl example:

```shell
curl -s http://172.18.69.133:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "A beautiful landscape painting"}
    ],
    "extra_body": {
      "height": 1920,
      "width": 1920,
      "num_inference_steps": 50,
      "true_cfg_scale": 1.5,
      "seed": 42
    }
  }' | jq -r '.choices[0].message.content[0].image_url.url' | cut -d',' -f2- | base64 -d > land.png
```

Change the host:port and input/output file paths accordingly. @nainiu258
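For clients not using shell pipelines, the tail of those commands (`cut -d',' -f2- | base64 -d`) can be mirrored in Python. This is a small sketch of decoding the data URL found at `choices[0].message.content[0].image_url.url`; the helper name is ours, not part of any API:

```python
import base64

def save_image_from_data_url(data_url: str, path: str) -> int:
    """Decode a data:image/...;base64,<payload> URL and write the raw bytes.

    Mirrors the shell step `cut -d',' -f2- | base64 -d` from the curl examples.
    Returns the number of bytes written.
    """
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not payload:
        raise ValueError("not a base64 data URL")
    raw = base64.b64decode(payload)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)

# Tiny self-check with a dummy 9-byte payload (not a real image).
demo = "data:image/png;base64," + base64.b64encode(b"\x89PNG demo").decode()
print(save_image_from_data_url(demo, "/tmp/demo.bin"))  # prints 9
```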
@nainiu258 i2i online serving got some
@nainiu258 GLM-Image is working fine in vllm-omni, can you pull the latest commit and give it another try? Thanks!

Got it.
Signed-off-by: nainiu258 <cperfect02@163.com>
@JaredforReal there is something wrong with the arguments `--stage-0-devices` and `--stage-1-devices`.
@nainiu258 maybe you should just use
Just found it in `docs/user_guide/examples/online_serving/glm_image.md`.
@nainiu258, the user guide is outdated after a lot of refactoring. I will work on it.

@nainiu258 Let's see if you can keep working on this GLM-Image recipe with this help, and hopefully we can update that outdated documentation too!
Summary
Adds a community recipe for serving Z.ai GLM-Image with vLLM-Omni: text-to-image (T2I) via the OpenAI-compatible online API, including 1× and 2× NVIDIA A800 80GB deployment notes and links to the canonical user guide and the `examples/online_serving/glm_image` clients.

Changes

- `recipes/GLM/GLM-Image.md`: new recipe under `GLM/GLM-Image` (aligned with HF-style naming in the recipe).
- `recipes/README.md`: links `GLM/GLM-Image.md` for 1×/2× A800 80GB image generation.