fix bug which causes the following RuntimeError when call apply_weights… by yaomingamd · Pull Request #2 · ROCm/vllm

yaomingamd · 2024-02-23T13:16:08Z

… with bias tensor

vllm/model_executor/layers/linear.py", line 70, in apply_weights
if bias:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
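
The diff itself is not reproduced on this page, so the following is only a minimal sketch of the failure mode named in the traceback and of one common remedy, assuming the fix amounts to testing `bias is not None` instead of the tensor's truthiness. The function names `apply_weights_buggy` and `apply_weights_fixed` and their signatures are hypothetical and are not vLLM's actual API.

```python
from typing import Optional

import torch
import torch.nn.functional as F


def apply_weights_buggy(weight: torch.Tensor, x: torch.Tensor,
                        bias: Optional[torch.Tensor] = None) -> torch.Tensor:
    # Hypothetical stand-in for the failing code path: `if bias:` asks
    # PyTorch to convert the whole tensor to a single bool, which raises
    # for tensors with more than one element.
    if bias:
        return F.linear(x, weight, bias)
    return F.linear(x, weight)


def apply_weights_fixed(weight: torch.Tensor, x: torch.Tensor,
                        bias: Optional[torch.Tensor] = None) -> torch.Tensor:
    # Assumed fix: check only whether a bias was supplied at all.
    if bias is not None:
        return F.linear(x, weight, bias)
    return F.linear(x, weight)


x = torch.randn(2, 4)       # batch of 2, 4 input features
weight = torch.randn(3, 4)  # 3 output features
bias = torch.randn(3)       # multi-element bias tensor

try:
    apply_weights_buggy(weight, x, bias)
    raised = False
except RuntimeError as err:
    raised = True
    print(f"buggy version failed: {err}")

out = apply_weights_fixed(weight, x, bias)
print(out.shape)
```

The identity check also avoids a quieter problem with truthiness tests: a single-element bias equal to zero evaluates to `False`, so it would be skipped without any error.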

wip
wip & debug
update
cleanup
use quark realquantizer for pack/quant/dequant
comment on cudagraph issue; remove prints
Keep only 1 place importing quark
cudagraph issue resolved; dq weight at load time for efficiency
Signed-off-by: Bowen Bao <bowenbao@amd.com>
lint
Signed-off-by: Bowen Bao <bowenbao@amd.com>
turn on emulation based on platform
Signed-off-by: Bowen Bao <bowenbao@amd.com>
add fused moe support - ugly wip running version
Add envar if dequant weight at load time
Signed-off-by: Bowen Bao <bowenbao@amd.com>
Mxfp4 memory leak fixes (#2)
Fix VLLM_QUARK_EMU_MEM_OPT route
Signed-off-by: Felix Marty <felmarty@amd.com>

…3058) Signed-off-by: ramos <49182011+nemoramo@users.noreply.github.com> Signed-off-by: mayufeng <mayufeng@example.com> Co-authored-by: mayufeng <mayufeng@example.com>

fix bug causing the following RuntimeError when call apply_weights with bias tensor

607406b

vllm/model_executor/layers/linear.py", line 70, in apply_weights
if bias:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

hliuca self-assigned this Feb 23, 2024

hliuca merged commit c6dc9da into ROCm:0.3.0-rocm Feb 23, 2024

AdrianAbeyta pushed a commit that referenced this pull request Mar 8, 2024

Merge pull request #2 from ROCm/fp8-e4m3-kvcache-rocm

4a0d880

Enable FP8 E4M3 KV Cache

gshtras pushed a commit that referenced this pull request Sep 27, 2024

Merge pull request #2 from ROCmSoftwarePlatform/vllm_mi300_v4

280c345

Vllm mi300 v4

japarada mentioned this pull request Feb 3, 2025

[Bug]: Running Llama-2-70b inference on MI300x getting OOM #397

Closed

1 task

charlifu pushed a commit that referenced this pull request Oct 9, 2025

[Bugfix] Catch and log invalid token ids in detokenizer #2 (vllm-project#26445)

bb6d8c2

Signed-off-by: Nick Hill <nhill@redhat.com>

hliuca merged 1 commit into ROCm:0.3.0-rocm from yaomingamd:0.3.0-rocm

yaomingamd commented Feb 23, 2024

2 participants